author: jan.nijtmans <nijtmans@users.sourceforge.net> 2021-03-17 12:04:05 (GMT)
committer: jan.nijtmans <nijtmans@users.sourceforge.net> 2021-03-17 12:04:05 (GMT)
commit: 9d25ea9cd7a4e93d8870ddcb15b4677e5d908e2e
tree: 3dc363761df05c92caaa06963e5afd91579af5e2 /doc
parent: b51e777fe970a9ffcf0919f7262a6d474aeeaf2b
parent: 15849c859a2ac4cde3f65a071a4b8c47f3f2add0
Merge 8.7
Diffstat (limited to 'doc')
-rw-r--r-- doc/Utf.3 15
1 files changed, 8 insertions, 7 deletions
diff --git a/doc/Utf.3 b/doc/Utf.3
index 263d4dd..f1aca4c 100644
--- a/doc/Utf.3
+++ b/doc/Utf.3
@@ -233,10 +233,10 @@ characters.
.PP
\fBTcl_UtfCharComplete\fR returns 1 if the source UTF-8 string \fIsrc\fR
of \fIlength\fR bytes is long enough to be decoded by
-\fBTcl_UtfToUniChar\fR, or 0 otherwise. This function does not guarantee
-that the UTF-8 string is properly formed. This routine is used by
-procedures that are operating on a byte at a time and need to know if a
-full Unicode character has been seen.
+\fBTcl_UtfToUniChar\fR/\fBTcl_UtfNext\fR, or 0 otherwise. This function
+does not guarantee that the UTF-8 string is properly formed. This routine
+is used by procedures that are operating on a byte at a time and need to
+know if a full Unicode character has been seen.
.PP
\fBTcl_NumUtfChars\fR corresponds to \fBstrlen\fR for UTF-8 strings. It
returns the number of Tcl_UniChars that are represented by the UTF-8 string
@@ -257,7 +257,8 @@ Given \fIsrc\fR, a pointer to some location in a UTF-8 string,
\fBTcl_UtfNext\fR returns a pointer to the next UTF-8 character in the
string. The caller must not ask for the next character after the last
character in the string if the string is not terminated by a null
-character.
+character. \fBTcl_UtfCharComplete\fR can be used in that case to
+make sure enough bytes are available before calling \fBTcl_UtfNext\fR.
.PP
\fBTcl_UtfPrev\fR is used to step backward through but not beyond the
UTF-8 string that begins at \fIstart\fR. If the UTF-8 string is made
@@ -274,12 +275,12 @@ always a pointer to a location in the string. It always returns a pointer to
a byte that begins a character when scanning for characters beginning
from \fIstart\fR. When \fIsrc\fR is greater than \fIstart\fR, it
always returns a pointer less than \fIsrc\fR and greater than or
-equal to (\fIsrc\fR - \fBTCL_UTF_MAX\fR). The character that begins
+equal to (\fIsrc\fR - 4). The character that begins
at the returned pointer is the first one that either includes the
byte \fIsrc[-1]\fR, or might include it if the right trail bytes are
present at \fIsrc\fR and greater. \fBTcl_UtfPrev\fR never reads the
byte \fIsrc[0]\fR nor the byte \fIstart[-1]\fR nor the byte
-\fIsrc[-\fBTCL_UTF_MAX\fI-1]\fR.
+\fIsrc[-5]\fR.
.PP
\fBTcl_UniCharAtIndex\fR corresponds to a C string array dereference or the
Pascal Ord() function. It returns the Unicode character represented at the