author: jan.nijtmans <nijtmans@users.sourceforge.net> 2020-05-12 13:53:49 (GMT)
committer: jan.nijtmans <nijtmans@users.sourceforge.net> 2020-05-12 13:53:49 (GMT)
commit: 820821d220240c0b1cb5a6cf73e96b1cf5ddd4b7
tree: 3e43bb681204da78ca9c69f70b8b95dcd8ac4b2e /doc/Utf.3
parent: 594b85ce18c43c1a0665f90f702fa3d0da4659cf
parent: 09f4817353f0b33e04d9866888e1408a16b1afa6
Change back implementation of Tcl_UtfAtIndex() to how it was. Update documentation.
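
The documented behaviour of Tcl_UtfAtIndex() in the hunk below can be illustrated with a small standalone model. This is only a sketch and not the Tcl implementation: the names `sketch_seq_len` and `sketch_utf_at_index` are hypothetical, the model assumes a build in which a valid 4-byte UTF-8 sequence is consumed by two Tcl_UtfToUniChar calls (and so counts as two index positions), it does not reject overlong encodings, and it stops at the terminating NUL as a safety measure that the documentation does not promise.

```c
#include <stddef.h>

/* True if b is a UTF-8 continuation byte (10xxxxxx). */
static int sketch_is_cont(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

/* Byte length of the sequence starting at p; malformed input counts as 1.
 * Short-circuit evaluation keeps reads from running past a NUL byte. */
static size_t sketch_seq_len(const unsigned char *p) {
    if (p[0] < 0x80) return 1;
    if ((p[0] & 0xE0) == 0xC0 && sketch_is_cont(p[1])) return 2;
    if ((p[0] & 0xF0) == 0xE0 && sketch_is_cont(p[1]) && sketch_is_cont(p[2]))
        return 3;
    if (p[0] >= 0xF0 && p[0] <= 0xF4 && sketch_is_cont(p[1])
            && sketch_is_cont(p[2]) && sketch_is_cont(p[3]))
        return 4;
    return 1;
}

/* Hypothetical model of the documented Tcl_UtfAtIndex() contract:
 * - a negative index yields a pointer to the first character;
 * - a 4-byte sequence is assumed to use up two index positions;
 * - the result never points into the middle of a 4-byte sequence. */
const char *sketch_utf_at_index(const char *src, int index) {
    const unsigned char *p = (const unsigned char *)src;
    while (index > 0 && *p != '\0') {
        size_t n = sketch_seq_len(p);
        p += n;                      /* always skip the whole sequence */
        index -= (n == 4) ? 2 : 1;   /* may go to -1: the "once more" case */
    }
    return (const char *)p;
}
```

In this model, for the string "a" followed by one 4-byte character and "z", indices 2 and 3 both yield a pointer to "z": index 2 would otherwise land inside the 4-byte sequence, so the pointer is pushed to the end of that sequence.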
Diffstat (limited to 'doc/Utf.3')
-rw-r--r-- doc/Utf.3 | 12
1 files changed, 7 insertions, 5 deletions
diff --git a/doc/Utf.3 b/doc/Utf.3
index 35f9327..76ac7e5 100644
--- a/doc/Utf.3
+++ b/doc/Utf.3
@@ -285,15 +285,17 @@ byte \fIsrc[0]\fR nor the byte \fIstart[-1]\fR nor the byte
Pascal Ord() function. It returns the Unicode character represented at the
specified character (not byte) \fIindex\fR in the UTF-8 string
\fIsrc\fR. The source string must contain at least \fIindex\fR
-characters. Behavior is undefined if a negative \fIindex\fR is given.
+characters. If a negative \fIindex\fR is given, the first Unicode
+character of the string is returned.
.PP
\fBTcl_UtfAtIndex\fR returns a pointer to the specified character (not
byte) \fIindex\fR in the UTF-8 string \fIsrc\fR. The source string must
contain at least \fIindex\fR characters. This is equivalent to calling
-\fBTcl_UtfToUniChar\fR \fIindex\fR times, except if the index points to
-a lower surrogate preceded by an upper surrogate: In that case, the returned
-pointer will point just after the lower surrogate. If a negative \fIindex\fR is given,
-the return pointer points to the first character in the source string.
+\fBTcl_UtfToUniChar\fR \fIindex\fR times, except if that would return
+a pointer to the second byte of a valid 4-byte UTF-8 sequence, in which
+case, \fBTcl_UtfToUniChar\fR will be called once more to find the end
+of the sequence. If a negative \fIindex\fR is given, the returned pointer
+points to the first character in the source string.
.PP
\fBTcl_UtfBackslash\fR is a utility procedure used by several of the Tcl
commands. It parses a backslash sequence and stores the properly formed