summaryrefslogtreecommitdiffstats
path: root/tools/encoding/macThai.txt
blob: b991833429f284ca95ab91b276cae8859391e6aa (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
#
#   Name:             MacOS_Thai [to Unicode]
#   Unicode versions: 1.1, 2.0
#   Table version:    0.2 (from internal ufrm version <6>)
#   Date:             17 November 1995
#   Authors:          Peter Edberg <edberg1@applelink.apple.com>
#                     Judy Kettenhofen
#
#   Copyright (c) 1995 Apple Computer, Inc.  All Rights reserved.
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are preliminary and
#   subject to change. Updated tables will be available from the
#   Unicode Inc. ftp site (unicode.org), the Apple Computer ftp site
#   (ftp.info.apple.com), the Apple Computer World-Wide Web pages
#   (http://www.info.apple.com), and possibly on diskette from APDA
#   (Apple's mail-order distribution service for developers).
#
#   Format:
#   -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the MacOS Thai code (in hex as 0xNN)
#     Column #2 is the Unicode (in hex as 0xNNNN)
#     Column #3 is the Unicode name (follows a comment sign, '#')
#
#   The entries are in MacOS Thai code order.
#
#   Notes on MacOS Thai:
#   --------------------
#
#   Codes 0xA1-0xDA and 0xDF-0xFB are the character set from Thai
#   standard TIS 620-2533, except that the following changes are
#   made:
#     0xEE is TRADE MARK SIGN (instead of THAI CHARACTER YAMAKKAN)
#     0xFA is REGISTERED SIGN (instead of THAI CHARACTER ANGKHANKHU)
#     0xFB is COPYRIGHT SIGN (instead of THAI CHARACTER KHOMUT)
#
#   Codes 0x80-0x82, 0x8D-0x8E, 0x91, 0x9D-0x9E, and 0xDB-0xDE are
#   various additional punctuation marks (e.g. curly quotes,
#   ellipsis), no-break space, and two special characters "word join"
#   and "word break".
#
#   Codes 0x83-0x8C, 0x8F, and 0x92-0x9C are for positional variants
#   of the upper vowels, tone marks, and other signs at 0xD1,
#   0xD4-0xD7, and 0xE7-0xED. The positional variants would normally
#   be considered presentation forms only and not characters. In most
#   cases they are not typed directly; they are selected automatically
#   at display time by the WorldScript software. However, using the
#   Thai-DTP keyboard, the presentation forms can in fact be typed
#   directly using dead keys. Thus they must be treated as real
#   characters in the MacOS Thai encoding, and  must be mapped to
#   distinct Unicodes - corporate-zone characters in this case - to
#   enable roundtrip mapping.
#
#   Several code points are undefined and unused (they cannot be
#   typed using any of the Mac OS Thai keyboard layouts): 0x90, 0x9F,
#   0xFC-0xFE. In the table below these are mapped to 0xFFFD.
#
##################

0x20	0x0020	# SPACE
0x21	0x0021	# EXCLAMATION MARK
0x22	0x0022	# QUOTATION MARK
0x23	0x0023	# NUMBER SIGN
0x24	0x0024	# DOLLAR SIGN
0x25	0x0025	# PERCENT SIGN
0x26	0x0026	# AMPERSAND
0x27	0x0027	# APOSTROPHE
0x28	0x0028	# LEFT PARENTHESIS
0x29	0x0029	# RIGHT PARENTHESIS
0x2A	0x002A	# ASTERISK
0x2B	0x002B	# PLUS SIGN
0x2C	0x002C	# COMMA
0x2D	0x002D	# HYPHEN-MINUS
0x2E	0x002E	# FULL STOP
0x2F	0x002F	# SOLIDUS
0x30	0x0030	# DIGIT ZERO
0x31	0x0031	# DIGIT ONE
0x32	0x0032	# DIGIT TWO
0x33	0x0033	# DIGIT THREE
0x34	0x0034	# DIGIT FOUR
0x35	0x0035	# DIGIT FIVE
0x36	0x0036	# DIGIT SIX
0x37	0x0037	# DIGIT SEVEN
0x38	0x0038	# DIGIT EIGHT
0x39	0x0039	# DIGIT NINE
0x3A	0x003A	# COLON
0x3B	0x003B	# SEMICOLON
0x3C	0x003C	# LESS-THAN SIGN
0x3D	0x003D	# EQUALS SIGN
0x3E	0x003E	# GREATER-THAN SIGN
0x3F	0x003F	# QUESTION MARK
0x40	0x0040	# COMMERCIAL AT
0x41	0x0041	# LATIN CAPITAL LETTER A
0x42	0x0042	# LATIN CAPITAL LETTER B
0x43	0x0043	# LATIN CAPITAL LETTER C
0x44	0x0044	# LATIN CAPITAL LETTER D
0x45	0x0045	# LATIN CAPITAL LETTER E
0x46	0x0046	# LATIN CAPITAL LETTER F
0x47	0x0047	# LATIN CAPITAL LETTER G
0x48	0x0048	# LATIN CAPITAL LETTER H
0x49	0x0049	# LATIN CAPITAL LETTER I
0x4A	0x004A	# LATIN CAPITAL LETTER J
0x4B	0x004B	# LATIN CAPITAL LETTER K
0x4C	0x004C	# LATIN CAPITAL LETTER L
0x4D	0x004D	# LATIN CAPITAL LETTER M
0x4E	0x004E	# LATIN CAPITAL LETTER N
0x4F	0x004F	# LATIN CAPITAL LETTER O
0x50	0x0050	# LATIN CAPITAL LETTER P
0x51	0x0051	# LATIN CAPITAL LETTER Q
0x52	0x0052	# LATIN CAPITAL LETTER R
0x53	0x0053	# LATIN CAPITAL LETTER S
0x54	0x0054	# LATIN CAPITAL LETTER T
0x55	0x0055	# LATIN CAPITAL LETTER U
0x56	0x0056	# LATIN CAPITAL LETTER V
0x57	0x0057	# LATIN CAPITAL LETTER W
0x58	0x0058	# LATIN CAPITAL LETTER X
0x59	0x0059	# LATIN CAPITAL LETTER Y
0x5A	0x005A	# LATIN CAPITAL LETTER Z
0x5B	0x005B	# LEFT SQUARE BRACKET
0x5C	0x005C	# REVERSE SOLIDUS
0x5D	0x005D	# RIGHT SQUARE BRACKET
0x5E	0x005E	# CIRCUMFLEX ACCENT
0x5F	0x005F	# LOW LINE
0x60	0x0060	# GRAVE ACCENT
0x61	0x0061	# LATIN SMALL LETTER A
0x62	0x0062	# LATIN SMALL LETTER B
0x63	0x0063	# LATIN SMALL LETTER C
0x64	0x0064	# LATIN SMALL LETTER D
0x65	0x0065	# LATIN SMALL LETTER E
0x66	0x0066	# LATIN SMALL LETTER F
0x67	0x0067	# LATIN SMALL LETTER G
0x68	0x0068	# LATIN SMALL LETTER H
0x69	0x0069	# LATIN SMALL LETTER I
0x6A	0x006A	# LATIN SMALL LETTER J
0x6B	0x006B	# LATIN SMALL LETTER K
0x6C	0x006C	# LATIN SMALL LETTER L
0x6D	0x006D	# LATIN SMALL LETTER M
0x6E	0x006E	# LATIN SMALL LETTER N
0x6F	0x006F	# LATIN SMALL LETTER O
0x70	0x0070	# LATIN SMALL LETTER P
0x71	0x0071	# LATIN SMALL LETTER Q
0x72	0x0072	# LATIN SMALL LETTER R
0x73	0x0073	# LATIN SMALL LETTER S
0x74	0x0074	# LATIN SMALL LETTER T
0x75	0x0075	# LATIN SMALL LETTER U
0x76	0x0076	# LATIN SMALL LETTER V
0x77	0x0077	# LATIN SMALL LETTER W
0x78	0x0078	# LATIN SMALL LETTER X
0x79	0x0079	# LATIN SMALL LETTER Y
0x7A	0x007A	# LATIN SMALL LETTER Z
0x7B	0x007B	# LEFT CURLY BRACKET
0x7C	0x007C	# VERTICAL LINE
0x7D	0x007D	# RIGHT CURLY BRACKET
0x7E	0x007E	# TILDE
#
0x80	0x00AB	# LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0x81	0x00BB	# RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0x82	0x2026	# HORIZONTAL ELLIPSIS
0x83	0xF88C	# form for THAI CHARACTER MAI EK, low left position
0x84	0xF88F	# form for THAI CHARACTER MAI THO, low left position
0x85	0xF892	# form for THAI CHARACTER MAI TRI, low left position
0x86	0xF895	# form for THAI CHARACTER MAI CHATTAWA, low left position
0x87	0xF898	# form for THAI CHARACTER THANTHAKHAT, low left position
0x88	0xF88B	# form for THAI CHARACTER MAI EK, low position
0x89	0xF88E	# form for THAI CHARACTER MAI THO, low position
0x8A	0xF891	# form for THAI CHARACTER MAI TRI, low position
0x8B	0xF894	# form for THAI CHARACTER MAI CHATTAWA, low position
0x8C	0xF897	# form for THAI CHARACTER THANTHAKHAT, low position
0x8D	0x201C	# LEFT DOUBLE QUOTATION MARK
0x8E	0x201D	# RIGHT DOUBLE QUOTATION MARK
0x8F	0xF899	# form for THAI CHARACTER NIKHAHIT, left position
0x90	0xFFFD	# (not used, map to Unicode replacement character)
0x91	0x2022	# BULLET
0x92	0xF884	# form for THAI CHARACTER MAI HAN-AKAT, left position
0x93	0xF889	# form for THAI CHARACTER MAITAIKHU, left position
0x94	0xF885	# form for THAI CHARACTER SARA I, left position
0x95	0xF886	# form for THAI CHARACTER SARA II, left position
0x96	0xF887	# form for THAI CHARACTER SARA UE, left position
0x97	0xF888	# form for THAI CHARACTER SARA UEE, left position
0x98	0xF88A	# form for THAI CHARACTER MAI EK, left position
0x99	0xF88D	# form for THAI CHARACTER MAI THO, left position
0x9A	0xF890	# form for THAI CHARACTER MAI TRI, left position
0x9B	0xF893	# form for THAI CHARACTER MAI CHATTAWA, left position
0x9C	0xF896	# form for THAI CHARACTER THANTHAKHAT, left position
0x9D	0x2018	# LEFT SINGLE QUOTATION MARK
0x9E	0x2019	# RIGHT SINGLE QUOTATION MARK
0x9F	0xFFFD	# (not used, map to Unicode replacement character)
0xA0	0x00A0	# NO-BREAK SPACE
0xA1	0x0E01	# THAI CHARACTER KO KAI
0xA2	0x0E02	# THAI CHARACTER KHO KHAI
0xA3	0x0E03	# THAI CHARACTER KHO KHUAT
0xA4	0x0E04	# THAI CHARACTER KHO KHWAI
0xA5	0x0E05	# THAI CHARACTER KHO KHON
0xA6	0x0E06	# THAI CHARACTER KHO RAKHANG
0xA7	0x0E07	# THAI CHARACTER NGO NGU
0xA8	0x0E08	# THAI CHARACTER CHO CHAN
0xA9	0x0E09	# THAI CHARACTER CHO CHING
0xAA	0x0E0A	# THAI CHARACTER CHO CHANG
0xAB	0x0E0B	# THAI CHARACTER SO SO
0xAC	0x0E0C	# THAI CHARACTER CHO CHOE
0xAD	0x0E0D	# THAI CHARACTER YO YING
0xAE	0x0E0E	# THAI CHARACTER DO CHADA
0xAF	0x0E0F	# THAI CHARACTER TO PATAK
0xB0	0x0E10	# THAI CHARACTER THO THAN
0xB1	0x0E11	# THAI CHARACTER THO NANGMONTHO
0xB2	0x0E12	# THAI CHARACTER THO PHUTHAO
0xB3	0x0E13	# THAI CHARACTER NO NEN
0xB4	0x0E14	# THAI CHARACTER DO DEK
0xB5	0x0E15	# THAI CHARACTER TO TAO
0xB6	0x0E16	# THAI CHARACTER THO THUNG
0xB7	0x0E17	# THAI CHARACTER THO THAHAN
0xB8	0x0E18	# THAI CHARACTER THO THONG
0xB9	0x0E19	# THAI CHARACTER NO NU
0xBA	0x0E1A	# THAI CHARACTER BO BAIMAI
0xBB	0x0E1B	# THAI CHARACTER PO PLA
0xBC	0x0E1C	# THAI CHARACTER PHO PHUNG
0xBD	0x0E1D	# THAI CHARACTER FO FA
0xBE	0x0E1E	# THAI CHARACTER PHO PHAN
0xBF	0x0E1F	# THAI CHARACTER FO FAN
0xC0	0x0E20	# THAI CHARACTER PHO SAMPHAO
0xC1	0x0E21	# THAI CHARACTER MO MA
0xC2	0x0E22	# THAI CHARACTER YO YAK
0xC3	0x0E23	# THAI CHARACTER RO RUA
0xC4	0x0E24	# THAI CHARACTER RU
0xC5	0x0E25	# THAI CHARACTER LO LING
0xC6	0x0E26	# THAI CHARACTER LU
0xC7	0x0E27	# THAI CHARACTER WO WAEN
0xC8	0x0E28	# THAI CHARACTER SO SALA
0xC9	0x0E29	# THAI CHARACTER SO RUSI
0xCA	0x0E2A	# THAI CHARACTER SO SUA
0xCB	0x0E2B	# THAI CHARACTER HO HIP
0xCC	0x0E2C	# THAI CHARACTER LO CHULA
0xCD	0x0E2D	# THAI CHARACTER O ANG
0xCE	0x0E2E	# THAI CHARACTER HO NOKHUK
0xCF	0x0E2F	# THAI CHARACTER PAIYANNOI
0xD0	0x0E30	# THAI CHARACTER SARA A
0xD1	0x0E31	# THAI CHARACTER MAI HAN-AKAT
0xD2	0x0E32	# THAI CHARACTER SARA AA
0xD3	0x0E33	# THAI CHARACTER SARA AM
0xD4	0x0E34	# THAI CHARACTER SARA I
0xD5	0x0E35	# THAI CHARACTER SARA II
0xD6	0x0E36	# THAI CHARACTER SARA UE
0xD7	0x0E37	# THAI CHARACTER SARA UEE
0xD8	0x0E38	# THAI CHARACTER SARA U
0xD9	0x0E39	# THAI CHARACTER SARA UU
0xDA	0x0E3A	# THAI CHARACTER PHINTHU
0xDB	0xFEFF	# ZERO WIDTH NO-BREAK SPACE
0xDC	0x200B	# ZERO WIDTH SPACE
0xDD	0x2013	# EN DASH
0xDE	0x2014	# EM DASH
0xDF	0x0E3F	# THAI CURRENCY SYMBOL BAHT
0xE0	0x0E40	# THAI CHARACTER SARA E
0xE1	0x0E41	# THAI CHARACTER SARA AE
0xE2	0x0E42	# THAI CHARACTER SARA O
0xE3	0x0E43	# THAI CHARACTER SARA AI MAIMUAN
0xE4	0x0E44	# THAI CHARACTER SARA AI MAIMALAI
0xE5	0x0E45	# THAI CHARACTER LAKKHANGYAO
0xE6	0x0E46	# THAI CHARACTER MAIYAMOK
0xE7	0x0E47	# THAI CHARACTER MAITAIKHU
0xE8	0x0E48	# THAI CHARACTER MAI EK
0xE9	0x0E49	# THAI CHARACTER MAI THO
0xEA	0x0E4A	# THAI CHARACTER MAI TRI
0xEB	0x0E4B	# THAI CHARACTER MAI CHATTAWA
0xEC	0x0E4C	# THAI CHARACTER THANTHAKHAT
0xED	0x0E4D	# THAI CHARACTER NIKHAHIT
0xEE	0x2122	# TRADE MARK SIGN
0xEF	0x0E4F	# THAI CHARACTER FONGMAN
0xF0	0x0E50	# THAI DIGIT ZERO
0xF1	0x0E51	# THAI DIGIT ONE
0xF2	0x0E52	# THAI DIGIT TWO
0xF3	0x0E53	# THAI DIGIT THREE
0xF4	0x0E54	# THAI DIGIT FOUR
0xF5	0x0E55	# THAI DIGIT FIVE
0xF6	0x0E56	# THAI DIGIT SIX
0xF7	0x0E57	# THAI DIGIT SEVEN
0xF8	0x0E58	# THAI DIGIT EIGHT
0xF9	0x0E59	# THAI DIGIT NINE
0xFA	0x00AE	# REGISTERED SIGN
0xFB	0x00A9	# COPYRIGHT SIGN
0xFC	0xFFFD	# (not used, map to Unicode replacement character)
0xFD	0xFFFD	# (not used, map to Unicode replacement character)
0xFE	0xFFFD	# (not used, map to Unicode replacement character)
0xFF	0xFFFD	# (not used, map to Unicode replacement character)