author jan.nijtmans <nijtmans@users.sourceforge.net> 2021-06-28 14:22:00 (GMT)
committer jan.nijtmans <nijtmans@users.sourceforge.net> 2021-06-28 14:22:00 (GMT)
commit 35315e5c2eb5d97a339bc2ed7882ba59092fcbb4
tree ab7a694a3fcded0b0e61f43e459841c97960b303 /tools/encoding/big5.txt
parent ad4b60ef7cdd42e625b1df07eaa29129cdc7a157
Update many tools/encoding/*.txt files to the latest version, but leave out character changes. Only add the new "cns11643" encoding, which belongs to the same group as big5
Diffstat (limited to 'tools/encoding/big5.txt')
-rw-r--r-- tools/encoding/big5.txt 109
1 files changed, 51 insertions, 58 deletions
diff --git a/tools/encoding/big5.txt b/tools/encoding/big5.txt
index 5cc9e81..58cdfe2 100644
--- a/tools/encoding/big5.txt
+++ b/tools/encoding/big5.txt
@@ -1,47 +1,28 @@
-# big5.txt --
-#
-# BIG5 to Unicode table (modified)
-#
-# Copyright (c) 1998-1999 by Scriptics Corporation.
-#
-# See the file "license.terms" for information on usage and redistribution
-# of this file, and for a DISCLAIMER OF ALL WARRANTIES.
-#
-# NOTE: this table has been modified to include the 7-bit ASCII
-# characters that are allowed in BIG5 files.
-#
+# BIG5.TXT
+# Date: 2015-12-02 23:52:00 GMT [KW]
+# © 2015 Unicode®, Inc.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
#
# Name: BIG5 to Unicode table (complete)
# Unicode version: 1.1
-# Table version: 0.0d3
+# Table version: 2.0
# Table format: Format A
-# Date: 11 February 1994
-# Authors: Glenn Adams <glenn@metis.com>
-# John H. Jenkins <John_Jenkins@taligent.com>
-#
-# Copyright (c) 1991-1994 Unicode, Inc. All Rights reserved.
-#
-# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
-# No claims are made as to fitness for any particular purpose. No
-# warranties of any kind are expressed or implied. The recipient
-# agrees to determine applicability of information provided. If this
-# file has been provided on magnetic media by Unicode, Inc., the sole
-# remedy for any claim will be exchange of defective media within 90
-# days of receipt.
-#
-# Recipient is granted the right to make copies in any form for
-# internal distribution and to freely use the information supplied
-# in the creation of products supporting Unicode. Unicode, Inc.
-# specifically excludes the right to re-distribute this file directly
-# to third parties or other organizations whether for profit or not.
+# Date: 2011 October 14 (header updated: 2015 December 02)
#
# General notes:
#
-# This table contains the data Metis and Taligent currently have on how
-# BIG5 characters map into Unicode.
+# NOTE: this table has been modified to include the 7-bit ASCII
+# characters that are allowed in BIG5 files.
+#
+# This table contains one set of mappings from BIG5 into Unicode.
+# Note that these data are *possible* mappings only and may not be the
+# same as those used by actual products, nor may they be the best suited
+# for all uses. For more information on the mappings between various code
+# pages incorporating the repertoire of BIG5 and Unicode, consult the
+# VENDORS mapping data.
#
# WARNING! It is currently impossible to provide round-trip compatibility
-# between BIG5 and Unicode.
+# between BIG5 and Unicode.
#
# A number of characters are not currently mapped because
# of conflicts with other mappings. They are as follows:
@@ -58,44 +39,56 @@
#
# We currently map all of these characters to U+FFFD REPLACEMENT CHARACTER.
# It is also possible to map these characters to their duplicates, or to
-# the user zone.
-#
+# the user zone.
+#
# Notes:
#
# 1. In addition to the above, there is some uncertainty about the
# mappings in the range C6A1 - C8FE, and F9DD - F9FE. The ETEN
-# version of BIG5 organizes the former range differently, and adds
-# additional characters in the latter range. The correct mappings
-# these ranges need to be determined.
+# version of BIG5 organizes the former range differently, and adds
+# additional characters in the latter range. The correct mappings
+# these ranges need to be determined.
#
# 2. There is an uncertainty in the mapping of the Big Five character
-# 0xA3BC. This character occurs within the Big Five block of tone marks
-# for bopomofo and is intended to be the tone mark for the first tone in
-# Mandarin Chinese. We have selected the mapping U+02C9 MODIFIER LETTER
-# MACRON (Mandarin Chinese first tone) to reflect this semantic.
-# However, because bopomofo uses the absense of a tone mark to indicate
-# the first Mandarin tone, most implementations of Big Five represent
-# this character with a blank space, and so a mapping such as U+2003 EM SPACE
-# might be preferred.
-#
-#
+# 0xA3BC. This character occurs within the Big Five block of tone marks
+# for bopomofo and is intended to be the tone mark for the first tone in
+# Mandarin Chinese. We have selected the mapping U+02C9 MODIFIER LETTER
+# MACRON (Mandarin Chinese first tone) to reflect this semantic.
+# However, because bopomofo uses the absense of a tone mark to indicate
+# the first Mandarin tone, most implementations of Big Five represent
+# this character with a blank space, and so a mapping such as U+2003 EM
+# SPACE might be preferred.
#
# Format: Three tab-separated columns
# Column #1 is the BIG5 code (in hex as 0xXXXX)
# Column #2 is the Unicode (in hex as 0xXXXX)
# Column #3 is the Unicode name (follows a comment sign, '#')
-# The official names for Unicode characters U+4E00
-# to U+9FA5, inclusive, is "CJK UNIFIED IDEOGRAPH-XXXX",
-# where XXXX is the code point. Including all these
-# names in this file increases its size substantially
-# and needlessly. The token "<CJK>" is used for the
-# name of these characters. If necessary, it can be
-# expanded algorithmically by a parser or editor.
+# The official names for Unicode characters U+4E00
+# to U+9FA5, inclusive, is "CJK UNIFIED IDEOGRAPH-XXXX",
+# where XXXX is the code point. Including all these
+# names in this file increases its size substantially
+# and needlessly. The token "<CJK>" is used for the
+# name of these characters. If necessary, it can be
+# expanded algorithmically by a parser or editor.
#
# The entries are in BIG5 order
#
-# Any comments or problems, contact <John_Jenkins@taligent.com>
+# Revision History:
+#
+# [v2.0, 2015 December 02]
+# updates to copyright notice and terms of use
+# no changes to character mappings
+#
+# [v1.0, 2011 October 14]
+# Updated terms of use to current wording.
+# Updated contact information.
+# No changes to the mapping data.
+#
+# [v0.0d3, 11 February 1994]
+# First release.
#
+# Use the Unicode reporting form <http://www.unicode.org/reporting.html>
+# for any questions or comments or to report errors in the data.
#
0x20 0x0020 # SPACE
0x21 0x0021 # EXCLAMATION MARK
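
Illustration (not part of the commit): the header in the diff above describes "Format A" as three tab-separated columns and says the "<CJK>" token can be expanded algorithmically for U+4E00 to U+9FA5. A minimal Python sketch of a reader for that format follows. The function name parse_mapping and the sample rows are assumptions made for illustration; the third sample row does not appear in this excerpt, and this is not how Tcl's own encoding tools process the file.

```python
import io

# Range for which the header says "<CJK>" stands in for the full name.
CJK_FIRST, CJK_LAST = 0x4E00, 0x9FA5


def parse_mapping(lines):
    """Yield (big5_code, unicode_code_point, name) for each data row.

    Hypothetical helper: assumes rows look like
    '0xXXXX<TAB>0xXXXX<TAB># NAME', as the file header describes.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and full-line comments (the header block).
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Column 3 follows a comment sign, so split on the first '#'.
        data, _, comment = line.partition("#")
        fields = data.split()
        if len(fields) < 2:
            continue  # malformed row; ignored in this sketch
        big5 = int(fields[0], 16)  # base 16 accepts the "0x" prefix
        ucs = int(fields[1], 16)
        name = comment.strip()
        # Expand the placeholder token into the official character name.
        if name == "<CJK>" and CJK_FIRST <= ucs <= CJK_LAST:
            name = "CJK UNIFIED IDEOGRAPH-%04X" % ucs
        yield big5, ucs, name


# Sample input: two rows from the diff plus one illustrative <CJK> row.
sample = io.StringIO(
    "# The entries are in BIG5 order\n"
    "0x20\t0x0020\t# SPACE\n"
    "0x21\t0x0021\t# EXCLAMATION MARK\n"
    "0xA440\t0x4E00\t# <CJK>\n"
)
table = list(parse_mapping(sample))
```

Splitting on whitespace rather than strictly on tabs is a deliberate choice in this sketch, so the reader tolerates copies of the table in which tabs were converted to spaces.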