author: jan.nijtmans <nijtmans@users.sourceforge.net> 2017-11-29 08:59:49 (GMT)
committer: jan.nijtmans <nijtmans@users.sourceforge.net> 2017-11-29 08:59:49 (GMT)
commit: 3af16acbcb63ea2935d71b905371252560dc4659
tree: 0a9f024f95deaffc4eeb967fec06b82d0229a34e /doc
parent: eef0b72b5e12dededc794db29219be26574f6daf
Treat invalid UTF-8 characters in the range 0x80-0x9F as cp1252: See [https://en.wikipedia.org/wiki/UTF-8]. To be added to TIP #389
Diffstat (limited to 'doc')
-rw-r--r-- doc/Utf.3 | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/doc/Utf.3 b/doc/Utf.3
index 638f349..de9545d 100644
--- a/doc/Utf.3
+++ b/doc/Utf.3
@@ -140,6 +140,9 @@ number of bytes read from \fIsrc\fR. The caller must ensure that the
source buffer is long enough such that this routine does not run off the
end and dereference non-existent or random memory; if the source buffer
is known to be null-terminated, this will not happen. If the input is
+a byte in the range 0x80 - 0x9F, \fBTcl_UtfToUniChar\fR assumes the
+cp1252 encoding, stores the corresponding Tcl_UniChar in \fI*chPtr\fR
+and returns 1. If the input is otherwise
not in proper UTF-8 format, \fBTcl_UtfToUniChar\fR will store the first
byte of \fIsrc\fR in \fI*chPtr\fR as a Tcl_UniChar between 0x0000 and
0x00ff and return 1.
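
The fallback described in the added lines can be sketched as follows. This is a minimal illustration, not the actual Tcl source: the names sketch_cp1252_high and sketch_invalid_byte_to_unichar are hypothetical, uint16_t stands in for Tcl_UniChar, and the function assumes the caller has already determined that src does not begin a well-formed UTF-8 sequence (a byte in 0x80 - 0x9F can never start one). Passing the five positions that cp1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) through unchanged is also an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Code points for bytes 0x80..0x9F under cp1252 (Windows-1252).
 * Positions that cp1252 leaves unassigned are kept as the byte value
 * itself; that choice is an assumption of this sketch.
 */
static const uint16_t sketch_cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/*
 * Hypothetical helper (not part of the Tcl API): handles only the
 * "input is not proper UTF-8" case. Stores one code unit in *chPtr
 * and returns the number of bytes consumed, which is always 1.
 */
static int sketch_invalid_byte_to_unichar(const unsigned char *src,
                                          uint16_t *chPtr)
{
    unsigned char b = src[0];

    if (b >= 0x80 && b <= 0x9F) {
        /* Stray byte in 0x80..0x9F: interpret it as cp1252. */
        *chPtr = sketch_cp1252_high[b - 0x80];
    } else {
        /* Any other malformed input: store the first byte as-is,
         * giving a value between 0x0000 and 0x00FF. */
        *chPtr = b;
    }
    return 1;
}
```

With this sketch, the byte 0x80 yields U+20AC (the euro sign) and 0x99 yields U+2122 (the trademark sign), while a malformed lead byte such as 0xFF is stored unchanged as 0x00FF. In every case exactly one byte is consumed, so a caller can keep advancing through the buffer.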