1 files changed, 72 insertions, 36 deletions
diff --git a/doc/encoding.n b/doc/encoding.n
index c1dbf27..bbe197d 100644
--- a/doc/encoding.n
+++ b/doc/encoding.n
@@ -28,30 +28,39 @@ formats.
 Performs one of several encoding related operations, depending on
 \fIoption\fR.  The legal \fIoption\fRs are:
 .TP
-\fBencoding convertfrom\fR ?\fB-nocomplain\fR? ?\fB-failindex var\fR?
-?\fIencoding\fR? \fIdata\fR
+\fBencoding convertfrom\fR ?\fB-nocomplain\fR? ?\fB-failindex var\fR? ?\fB-strict\fR? ?\fIencoding\fR? \fIdata\fR
 .
 Convert \fIdata\fR to a Unicode string from the specified \fIencoding\fR.  The
 characters in \fIdata\fR are 8 bit binary data.  The resulting
 sequence of bytes is a string created by applying the given \fIencoding\fR
 to the data. If \fIencoding\fR is not specified, the current
 system encoding is used.
-.
-The call fails on convertion errors, like an incomplete utf-8 sequence.
-The option \fB-failindex\fR is followed by a variable name. The variable
-is set to \fI-1\fR if no conversion error occured. It is set to the
-first error location in \fIdata\fR in case of a conversion error. All data
-until this error location is transformed and retured. This option may not
-be used together with \fB-nocomplain\fR.
-.
-The call does not fail on conversion errors, if the option
-\fB-nocomplain\fR is given. In this case, any error locations are replaced
-by \fB?\fR. Incomplete sequences are written verbatim to the output string.
-The purpose of this switch is to gain compatibility to prior versions of TCL.
-It is not recommended for any other usage.
+.VS "TCL8.7 TIP346, TIP607, TIP601"
+.PP
+.RS
+The command does not fail on encoding errors.  Instead, any not convertable bytes
+(like incomplete UTF-8  sequences, see example below) are put as byte values into
+the output stream.
+.PP
+If the option \fB-failindex\fR with a variable name is given, the error reporting
+is changed in the following manner:
+in case of a conversion error, the position of the input byte causing the error
+is returned in the given variable.  The return value of the command are the
+converted characters until the first error position.
+In case of no error, the value \fI-1\fR is written to the variable.  This option
+may not be used together with \fB-nocomplain\fR.
+.PP
+The option \fB-nocomplain\fR has no effect and is available for compatibility with
+TCL 9. In TCL 9, the encoding command fails with an error on any encoding issue.
+This switch restores the TCL8.7 behaviour.
+.PP
+The \fB-strict\fR option followes more strict rules in conversion.  Currently, only
+the sequence \fB\\xC0\\x80\fR in \fButf-8\fR encoding is disallowed.  Additional rules
+may follow.
+.VE "TCL8.7 TIP346, TIP607, TIP601"
+.RE
 .TP
-\fBencoding convertto\fR ?\fB-nocomplain\fR? ?\fB-failindex var\fR?
-?\fIencoding\fR? \fIstring\fR
+\fBencoding convertto\fR ?\fB-nocomplain\fR? ?\fB-failindex var\fR? ?\fB-strict\fR? ?\fIencoding\fR? \fIstring\fR
 .
 Convert \fIstring\fR from Unicode to the specified \fIencoding\fR.
 The result is a sequence of bytes that represents the converted
@@ -59,21 +68,29 @@ string.  Each byte is stored in the lower 8-bits of a Unicode
 character (indeed, the resulting string is a binary string as far as
 Tcl is concerned, at least initially).  If \fIencoding\fR is not
 specified, the current system encoding is used.
-.
-The call fails on convertion errors, like a Unicode character not representable
-in the given \fIencoding\fR.
-.
-The option \fB-failindex\fR is followed by a variable name. The variable
-is set to \fI-1\fR if no conversion error occured. It is set to the
-first error location in \fIdata\fR in case of a conversion error. All data
-until this error location is transformed and retured. This option may not
-be used together with \fB-nocomplain\fR.
-.
-The call does not fail on conversion errors, if the option
-\fB-nocomplain\fR is given. In this case, any error locations are replaced
-by \fB?\fR. Incomplete sequences are written verbatim to the output string.
-The purpose of this switch is to gain compatibility to prior versions of TCL.
-It is not recommended for any other usage.
+.VS "TCL8.7 TIP346, TIP607, TIP601"
+.PP
+.RS
+The command does not fail on encoding errors.  Instead, the replacement character
+\fB?\fR is output for any not representable character (like the dot \fB\\U2022\fR
+in \fBiso-8859-1\fR encoding, see example below).
+.PP
+If the option \fB-failindex\fR with a variable name is given, the error reporting
+is changed in the following manner:
+in case of a conversion error, the position of the input character causing the error
+is returned in the given variable.  The return value of the command are the
+converted bytes until the first error position. No error condition is raised.
+In case of no error, the value \fI-1\fR is written to the variable.  This option
+may not be used together with \fB-nocomplain\fR.
+.PP
+The option \fB-nocomplain\fR has no effect and is available for compatibility with
+TCL 9. In TCL 9, the encoding command fails with an error on any encoding issue.
+This switch restores the TCL8.7 behaviour.
+.PP
+The \fB-strict\fR option followes more strict rules in conversion.  Currently, it has
+no effect but may be used in future to add additional encoding checks.
+.VE "TCL8.7 TIP346, TIP607, TIP601"
+.RE
 .TP
 \fBencoding dirs\fR ?\fIdirectoryList\fR?
 .
@@ -104,7 +121,7 @@ omitted then the command returns the current system encoding.  The
 system encoding is used whenever Tcl passes strings to system calls.
 .SH EXAMPLE
 .PP
-The following example converts a byte sequence in Japanese euc-jp encoding to a TCL string:
+Example 1: convert a byte sequence in Japanese euc-jp encoding to a TCL string:
 .PP
 .CS
 set s [\fBencoding convertfrom\fR euc-jp "\exA4\exCF"]
@@ -113,8 +130,9 @@ set s [\fBencoding convertfrom\fR euc-jp "\exA4\exCF"]
 The result is the unicode codepoint:
 .QW "\eu306F" ,
 which is the Hiragana letter HA.
+.VS "TCL8.7 TIP346, TIP607, TIP601"
 .PP
-The following example detects the error location in an incomplete UTF-8 sequence:
+Example 2: detect the error location in an incomplete UTF-8 sequence:
 .PP
 .CS
 % set s [\fBencoding convertfrom\fR -failindex i utf-8 "A\exC3"]
@@ -123,7 +141,15 @@ A
 1
 .CE
 .PP
-The following example detects the error location while transforming to ISO8859-1
+Example 3: return the incomplete UTF-8 sequence by raw bytes:
+.PP
+.CS
+% set s [\fBencoding convertfrom\fR -nocomplain utf-8 "A\exC3"]
+.CE
+The result is "A" followed by the byte \exC3.  The option \fB-nocomplain\fR
+has no effect, but assures to get the same result with TCL9.
+.PP
+Example 4: detect the error location while transforming to ISO8859-1
 (ISO-Latin 1):
 .PP
 .CS
@@ -133,8 +159,18 @@ A
 1
 .CE
 .PP
+Example 5: replace a not representable character by the replacement character:
+.PP
+.CS
+% set s [\fBencoding convertto\fR -nocomplain utf-8 "A\eu0141"]
+A?
+.CE
+The option \fB-nocomplain\fR has no effect, but assures to get the same result
+with TCL9.
+.VE "TCL8.7 TIP346, TIP607, TIP601"
+.PP
 .SH "SEE ALSO"
-Tcl_GetEncoding(3)
+Tcl_GetEncoding(3), fconfigure(n)
 .SH KEYWORDS
 encoding, unicode
 .\" Local Variables: