Diffstat (limited to 'doc/encoding.n')
-rw-r--r-- | doc/encoding.n | 79 |
1 files changed, 0 insertions, 79 deletions
diff --git a/doc/encoding.n b/doc/encoding.n
deleted file mode 100644
index fc6d4f7..0000000
--- a/doc/encoding.n
+++ /dev/null
@@ -1,79 +0,0 @@
-'\"
-'\" Copyright (c) 1998 by Scriptics Corporation.
-'\"
-'\" See the file "license.terms" for information on usage and redistribution
-'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
-'\"
-'\" RCS: @(#) $Id: encoding.n,v 1.2 1999/04/16 00:46:34 stanton Exp $
-'\"
-.so man.macros
-.TH encoding n "8.1" Tcl "Tcl Built-In Commands"
-.BS
-.SH NAME
-encoding \- Manipulate encodings
-.SH SYNOPSIS
-\fBencoding \fIoption\fR ?\fIarg arg ...\fR?
-.BE
-
-.SH INTRODUCTION
-.PP
-Strings in Tcl are encoded using 16-bit Unicode characters. Different
-operating system interfaces or applications may generate strings in
-other encodings such as Shift-JIS. The \fBencoding\fR command helps
-to bridge the gap between Unicode and these other formats.
-
-.SH DESCRIPTION
-.PP
-Performs one of several encoding related operations, depending on
-\fIoption\fR. The legal \fIoption\fRs are:
-.TP
-\fBencoding convertfrom ?\fIencoding\fR? \fIdata\fR
-Convert \fIdata\fR to Unicode from the specified \fIencoding\fR. The
-characters in \fIdata\fR are treated as binary data where the lower
-8-bits of each character is taken as a single byte. The resulting
-sequence of bytes is treated as a string in the specified
-\fIencoding\fR. If \fIencoding\fR is not specified, the current
-system encoding is used.
-.TP
-\fBencoding convertto ?\fIencoding\fR? \fIstring\fR
-Convert \fIstring\fR from Unicode to the specified \fIencoding\fR.
-The result is a sequence of bytes that represents the converted
-string. Each byte is stored in the lower 8-bits of a Unicode
-character. If \fIencoding\fR is not specified, the current
-system encoding is used.
-.TP
-\fBencoding names\fR
-Returns a list containing the names of all of the encodings that are
-currently available.
-.TP
-\fBencoding system\fR ?\fIencoding\fR?
-Set the system encoding to \fIencoding\fR. If \fIencoding\fR is
-omitted then the command returns the current system encoding. The
-system encoding is used whenever Tcl passes strings to system calls.
-
-.SH EXAMPLE
-.PP
-It is common practice to write script files using a text editor that
-produces output in the euc-jp encoding, which represents the ASCII
-characters as singe bytes and Japanese characters as two bytes. This
-makes it easy to embed literal strings that correspond to non-ASCII
-characters by simply typing the strings in place in the script.
-However, because the \fBsource\fR command always reads files using the
-ISO8859-1 encoding, Tcl will treat each byte in the file as a separate
-character that maps to the 00 page in Unicode. The
-resulting Tcl strings will not contain the expected Japanese
-characters. Instead, they will contain a sequence of Latin-1
-characters that correspond to the bytes of the original string. The
-\fBencoding\fR command can be used to convert this string to the
-expected Japanese Unicode characters. For example,
-.CS
-    set s [encoding convertfrom euc-jp "\\xA4\\xCF"]
-.CE
-would return the Unicode string "\\u306F", which is the Hiragana
-letter HA.
-
-.SH "SEE ALSO"
-Tcl_GetEncoding
-
-.SH KEYWORDS
-encoding