author: jan.nijtmans <nijtmans@users.sourceforge.net> 2021-05-06 15:33:33 (GMT)
committer: jan.nijtmans <nijtmans@users.sourceforge.net> 2021-05-06 15:33:33 (GMT)
commit: de61616a605c41f8963bfa44e43dc075e6a5a4c4
tree: bbc8f3cd4260c71f12ab327b248ea4cf07b73bb6 /doc
parent: 82b2bfa1b8f90760f53b543c9dc7e4fa7c2e3510
parent: 75afeeeffad7351134cc1d839d667ef7afca3579
Merge 8.7. Improve error message when handling byte-errors in channels to (e.g.): error writing "stdout": illegal byte sequence
Diffstat (limited to 'doc')
-rw-r--r-- doc/encoding.n 20
1 files changed, 2 insertions, 18 deletions
diff --git a/doc/encoding.n b/doc/encoding.n
index 5aac181..e78a8e7 100644
--- a/doc/encoding.n
+++ b/doc/encoding.n
@@ -81,29 +81,13 @@ omitted then the command returns the current system encoding. The
system encoding is used whenever Tcl passes strings to system calls.
.SH EXAMPLE
.PP
-It is common practice to write script files using a text editor that
-produces output in the euc-jp encoding, which represents the ASCII
-characters as singe bytes and Japanese characters as two bytes. This
-makes it easy to embed literal strings that correspond to non-ASCII
-characters by simply typing the strings in place in the script.
-However, because the \fBsource\fR command always reads files using the
-current system encoding, Tcl will only source such files correctly
-when the encoding used to write the file is the same. This tends not
-to be true in an internationalized setting. For example, if such a
-file was sourced in North America (where the ISO8859\-1 is normally
-used), each byte in the file would be treated as a separate character
-that maps to the 00 page in Unicode. The resulting Tcl strings will
-not contain the expected Japanese characters. Instead, they will
-contain a sequence of Latin-1 characters that correspond to the bytes
-of the original string. The \fBencoding\fR command can be used to
-convert this string to the expected Japanese Unicode characters. For
-example,
+The following example converts a byte sequence in Japanese euc-jp encoding to a TCL string:
.PP
.CS
set s [\fBencoding convertfrom\fR euc-jp "\exA4\exCF"]
.CE
.PP
-would return the Unicode string
+The result is the unicode codepoint:
.QW "\eu306F" ,
which is the Hiragana letter HA.
.SH "SEE ALSO"