author: pooryorick <com.digitalsmarties@pooryorick.com> 2024-05-14 07:12:59 (GMT)
committer: pooryorick <com.digitalsmarties@pooryorick.com> 2024-05-14 07:12:59 (GMT)
commit: 903211d3cfe6b551252fa47f68af0d46dab3ff60
tree: dd15d1ba41680d04395079b58c13300e55be5bd0
parent: 5fcdf8c85e32d445b2e120b40ec7a62c5ba4f6d1
parent: d38fb59d1263822ad6b0953ccc049b52c1ac2c77
Merge [4a1848c27fd63955], which was improperly backed out (there was no notice
or public discussion).
-rw-r--r-- doc/Tcl.n 323
1 file changed, 194 insertions, 129 deletions
diff --git a/doc/Tcl.n b/doc/Tcl.n
index fbe77bc..0f784af 100644
--- a/doc/Tcl.n
+++ b/doc/Tcl.n
@@ -1,7 +1,6 @@
'\"
'\" Copyright (c) 1993 The Regents of the University of California.
'\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
-'\" Copyright (c) 2023 Nathan Coulter
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
@@ -17,191 +16,257 @@ Summary of Tcl language syntax.
.SH DESCRIPTION
.PP
The following rules define the syntax and semantics of the Tcl language:
-.
-.IP "[1] \fBScript.\fR"
-A script is composed of zero or more commands delimited by semi-colons or
-newlines.
-.IP "[2] \fBCommand.\fR"
-A command is composed of zero or more words delimited by whitespace. The
-replacement for a substitution is included verbatim in the word. For example, a
-space in the replacement is included in the word rather than becoming a
-delimiter, and \fI\\\\\fR becomes a single backslash in the word. Each word is
-processed from left to right and each substitution is performed as soon as it
-is complete.
-For example, the command
-.RS
-.PP
-.CS
-set y [set x 0][incr x][incr x]
-.CE
-.PP
-is composed of three words, and sets the value of \fIy\fR to \fI012\fR.
-.PP
-If hash
-.PQ #
-is the first character of what would otherwise be the first word of a command,
-all characters up to the next newline are ignored.
-.RE
-.
-.IP "[3] \fBBraced word.\fR"
-If a word is enclosed in braces
-.PQ {
-and
-.PQ } ""
-, the braces are removed and the enclosed characters become the word. No
-substitutions are performed. Nested pairs of braces may occur within the word.
-A brace preceded by an odd number of backslashes is not considered part of a
-pair, and neither brace nor the backslashes are removed from the word.
-.
-.IP "[4] \fBQuoted word.\fR"
-If a word is enclosed in double quotes
+.IP "[1] \fBCommands.\fR"
+A Tcl script is a string containing one or more commands.
+Semi-colons and newlines are command separators unless quoted as
+described below.
+Close brackets are command terminators during command substitution
+(see below) unless quoted.
+.IP "[2] \fBEvaluation.\fR"
+A command is evaluated in two steps.
+First, the Tcl interpreter breaks the command into \fIwords\fR
+and performs substitutions as described below.
+These substitutions are performed in the same way for all
+commands.
+Secondly, the first word is used to locate a routine to
+carry out the command, and the remaining words of the command are
+passed to that routine.
+The routine is free to interpret each of its words
+in any way it likes, such as an integer, variable name, list,
+or Tcl script.
+Different commands interpret their words differently.
+.IP "[3] \fBWords.\fR"
+Words of a command are separated by white space (except for
+newlines, which are command separators).
+.IP "[4] \fBDouble quotes.\fR"
+If the first character of a word is double-quote
.PQ \N'34'
-, the double quotes are removed and the enclosed characters become the word.
-Substitutions are performed.
-.
-.IP "[5] \fBList.\fR"
-A list has the form of a single command. Newline is whitespace, and semicolon
-has no special interpretation. There is no script evaluation so there is no
-argument expansion, variable substitution, or command substitution: Dollar-sign
-and open bracket have no special interpretation, and what would be argument
-expansion in a script is invalid in a list.
-.
-.IP "[6] \fBArgument expansion.\fR"
-If
+then the word is terminated by the next double-quote character.
+If semi-colons, close brackets, or white space characters
+(including newlines) appear between the quotes then they are treated
+as ordinary characters and included in the word.
+Command substitution, variable substitution, and backslash substitution
+are performed on the characters between the quotes as described below.
+The double-quotes are not retained as part of the word.
+.IP "[5] \fBArgument expansion.\fR"
+If a word starts with the string
.QW {*}
-prefixes a word, it is removed. After any remaining enclosing braces or quotes
-are processed and applicable substitutions performed, the word, which must
-be a list, is removed from the command, and in its place each word in the
-list becomes an additional word in the command. For example,
-.CS
-cmd a {*}{b [c]} d {*}{$e f {g h}}
-.CE
+followed by a non-whitespace character, then the leading
+.QW {*}
+is removed and the rest of the word is parsed and substituted as any other
+word. After substitution, the word is parsed as a list (without command or
+variable substitutions; backslash substitutions are performed as is normal for
+a list and individual internal words may be surrounded by either braces or
+double-quote characters), and its words are added to the command being
+substituted. For instance,
+.QW "cmd a {*}{b [c]} d {*}{$e f {g h}}"
is equivalent to
-.CS
-cmd a b {[c]} d {$e} f {g h} .
-.CE
-.
-.IP "[7] \fBEvaluation.\fR"
-To evaluate a script, an interpreter evaluates each successive command. The
-first word identifies a procedure, and the remaining words are passed to that
-procedure for further evaluation. The procedure interprets each argument in
-its own way, e.g. as an integer, variable name, list, mathematical expression,
-script, or in some other arbitrary way. The result of the last command is the
-result of the script.
-.
-.IP "[8] \fBCommand substitution.\fR"
-Each pair of brackets
+.QW "cmd a b {[c]} d {$e} f {g h}" .
+.IP "[6] \fBBraces.\fR"
+If the first character of a word is an open brace
+.PQ {
+and rule [5] does not apply, then
+the word is terminated by the matching close brace
+.PQ } "" .
+Braces nest within the word: for each additional open
+brace there must be an additional close brace (however,
+if an open brace or close brace within the word is
+quoted with a backslash then it is not counted in locating the
+matching close brace).
+No substitutions are performed on the characters between the
+braces except for backslash-newline substitutions described
+below, nor do semi-colons, newlines, close brackets,
+or white space receive any special interpretation.
+The word will consist of exactly the characters between the
+outer braces, not including the braces themselves.
+.IP "[7] \fBCommand substitution.\fR"
+If a word contains an open bracket
.PQ [
-and
-.PQ ] ""
-encloses a script and is replaced by the result of that script.
-.IP "[9] \fBVariable substitution.\fR"
-Each of the following forms begins with dollar sign
+then Tcl performs \fIcommand substitution\fR.
+To do this it invokes the Tcl interpreter recursively to process
+the characters following the open bracket as a Tcl script.
+The script may contain any number of commands and must be terminated
+by a close bracket
+.PQ ] "" .
+The result of the script (i.e. the result of its last command) is
+substituted into the word in place of the brackets and all of the
+characters between them.
+There may be any number of command substitutions in a single word.
+Command substitution is not performed on words enclosed in braces.
+.IP "[8] \fBVariable substitution.\fR"
+If a word contains a dollar-sign
.PQ $
-and is replaced by the value of the identified variable. \fIname\fR names the
-variable and is composed of ASCII letters (\fBA\fR\(en\fBZ\fR and
-\fBa\fR\(en\fBz\fR), digits (\fB0\fR\(en\fB9\fR), underscores, or namespace
-delimiters (two or more colons). \fIindex\fR is the name of an individual
-variable within an array variable, and may be empty.
+followed by one of the forms
+described below, then Tcl performs \fIvariable
+substitution\fR: the dollar-sign and the following characters are
+replaced in the word by the value of a variable.
+Variable substitution may take any of the following forms:
.RS
.TP 15
\fB$\fIname\fR
.
-\fIname\fR may not be empty.
+\fIName\fR is the name of a scalar variable; the name is a sequence
+of one or more characters that are a letter, digit, underscore,
+or namespace separators (two or more colons).
+Letters and digits are \fIonly\fR the standard ASCII ones (\fB0\fR\(en\fB9\fR,
+\fBA\fR\(en\fBZ\fR and \fBa\fR\(en\fBz\fR).
.TP 15
\fB$\fIname\fB(\fIindex\fB)\fR
.
-\fIname\fR may be empty. Substitutions are performed on \fIindex\fR.
+\fIName\fR gives the name of an array variable and \fIindex\fR gives
+the name of an element within that array.
+\fIName\fR must contain only letters, digits, underscores, and
+namespace separators, and may be an empty string.
+Letters and digits are \fIonly\fR the standard ASCII ones (\fB0\fR\(en\fB9\fR,
+\fBA\fR\(en\fBZ\fR and \fBa\fR\(en\fBz\fR).
+Command substitutions, variable substitutions, and backslash
+substitutions are performed on the characters of \fIindex\fR.
.TP 15
\fB${\fIname\fB}\fR
.
-\fIname\fR may be empty.
-.TP 15
-\fB${\fIname(index)\fB}\fR
-.
-\fIname\fR may be empty. No substitutions are performed.
+\fIName\fR is the name of a scalar variable or array element. It may contain
+any characters whatsoever except for close braces. It indicates an array
+element if \fIname\fR is in the form
+.QW \fIarrayName\fB(\fIindex\fB)\fR
+where \fIarrayName\fR does not contain any open parenthesis characters,
+.QW \fB(\fR ,
+or close brace characters,
+.QW \fB}\fR ,
+and \fIindex\fR can be any sequence of characters except for close brace
+characters. No further
+substitutions are performed during the parsing of \fIname\fR.
+.PP
+There may be any number of variable substitutions in a single word.
+Variable substitution is not performed on words enclosed in braces.
+.PP
+Note that variables may contain character sequences other than those listed
+above, but in that case other mechanisms must be used to access them (e.g.,
+via the \fBset\fR command's single-argument form).
.RE
-Variables that are not accessible through one of the forms above may be
-accessed through other mechanisms, e.g. the \fBset\fR command.
-.IP "[10] \fBBackslash substitution.\fR"
-Each backslash
+.IP "[9] \fBBackslash substitution.\fR"
+If a backslash
.PQ \e
-that is not part of one of the forms listed below is removed, and the next
-character is included in the word verbatim, which allows the inclusion of
-characters that would normally be interpreted, namely whitespace, braces,
-brackets, double quote, dollar sign, and backslash. The following sequences
-are replaced as described:
+appears within a word then \fIbackslash substitution\fR occurs.
+In all cases but those described below the backslash is dropped and
+the following character is treated as an ordinary
+character and included in the word.
+This allows characters such as double quotes, close brackets,
+and dollar signs to be included in words without triggering
+special processing.
+The following table lists the backslash sequences that are
+handled specially, along with the value that replaces each sequence.
.RS
.RS
.RS
.TP 7
\e\fBa\fR
-.
-Audible alert (bell) (U+7).
+Audible alert (bell) (Unicode U+000007).
.TP 7
\e\fBb\fR
-.
-Backspace (U+8).
+Backspace (Unicode U+000008).
.TP 7
\e\fBf\fR
-.
-Form feed (U+C).
+Form feed (Unicode U+00000C).
.TP 7
\e\fBn\fR
-.
-Newline (U+A).
+Newline (Unicode U+00000A).
.TP 7
\e\fBr\fR
-.
-Carriage-return (U+D).
+Carriage-return (Unicode U+00000D).
.TP 7
\e\fBt\fR
-.
-Tab (U+9).
+Tab (Unicode U+000009).
.TP 7
\e\fBv\fR
-.
-Vertical tab (U+B).
+Vertical tab (Unicode U+00000B).
.TP 7
\e\fB<newline>\fIwhiteSpace\fR
.
-Newline preceded by an odd number of backslashes, along with the consecutive
-spaces and tabs that immediately follow it, is replaced by a single space.
-Because this happens before the command is split into words, it occurs even
-within braced words, and if the resulting space may subsequently be treated as
-a word delimiter.
+A single space character replaces the backslash, newline, and all spaces
+and tabs after the newline. This backslash sequence is unique in that it
+is replaced in a separate pre-pass before the command is actually parsed.
+This means that it will be replaced even when it occurs between braces,
+and the resulting space will be treated as a word separator if it is not
+in braces or quotes.
.TP 7
\e\e
-.
Backslash
.PQ \e "" .
.TP 7
\e\fIooo\fR
.
-Up to three octal digits form an eight-bit value for a Unicode character in the
-range \fI0\fR\(en\fI377\fR, i.e. U+0\(enU+FF. Only the digits that result in a
-number in this range are consumed.
+The digits \fIooo\fR (one, two, or three of them) give a eight-bit octal
+value for the Unicode character that will be inserted, in the range
+\fI000\fR\(en\fI377\fR (i.e., the range U+000000\(enU+0000FF).
+The parser will stop just before this range overflows, or when
+the maximum of three digits is reached. The upper bits of the Unicode
+character will be 0.
.TP 7
\e\fBx\fIhh\fR
.
-Up to two hexadecimal digits form an eight-bit value for a Unicode character in
-the range \fI0\fR\(en\fIFF\fR.
+The hexadecimal digits \fIhh\fR (one or two of them) give an eight-bit
+hexadecimal value for the Unicode character that will be inserted. The upper
+bits of the Unicode character will be 0 (i.e., the character will be in the
+range U+000000\(enU+0000FF).
.TP 7
\e\fBu\fIhhhh\fR
.
-Up to four hexadecimal digits form a 16-bit value for a Unicode character in
-the range \fI0\fR\(en\fIFFFF\fR.
+The hexadecimal digits \fIhhhh\fR (one, two, three, or four of them) give a
+sixteen-bit hexadecimal value for the Unicode character that will be
+inserted. The upper bits of the Unicode character will be 0 (i.e., the
+character will be in the range U+000000\(enU+00FFFF).
.TP 7
\e\fBU\fIhhhhhhhh\fR
.
-Up to eight hexadecimal digits form a 21-bit value for a Unicode character in
-the range \fI0\fR\(en\fI10FFFF\fR. Only the digits that result in a number in
-this range are consumed.
+The hexadecimal digits \fIhhhhhhhh\fR (one up to eight of them) give a
+twenty-one-bit hexadecimal value for the Unicode character that will be
+inserted, in the range U+000000\(enU+10FFFF. The parser will stop just
+before this range overflows, or when the maximum of eight digits
+is reached. The upper bits of the Unicode character will be 0.
.RE
.RE
.PP
+Backslash substitution is not performed on words enclosed in braces,
+except for backslash-newline as described above.
.RE
-.
+.IP "[10] \fBComments.\fR"
+If a hash character
+.PQ #
+appears at a point where Tcl is
+expecting the first character of the first word of a command,
+then the hash character and the characters that follow it, up
+through the next newline, are treated as a comment and ignored.
+The comment character only has significance when it appears
+at the beginning of a command.
+.IP "[11] \fBOrder of substitution.\fR"
+Each character is processed exactly once by the Tcl interpreter
+as part of creating the words of a command.
+For example, if variable substitution occurs then no further
+substitutions are performed on the value of the variable; the
+value is inserted into the word verbatim.
+If command substitution occurs then the nested command is
+processed entirely by the recursive call to the Tcl interpreter;
+no substitutions are performed before making the recursive
+call and no additional substitutions are performed on the result
+of the nested script.
+.RS
+.PP
+Substitutions take place from left to right, and each substitution is
+evaluated completely before attempting to evaluate the next. Thus, a
+sequence like
+.PP
+.CS
+set y [set x 0][incr x][incr x]
+.CE
+.PP
+will always set the variable \fIy\fR to the value, \fI012\fR.
+.RE
+.IP "[12] \fBSubstitution and word boundaries.\fR"
+Substitutions do not affect the word boundaries of a command,
+except for argument expansion as specified in rule [5].
+For example, during variable substitution the entire value of
+the variable becomes part of a single word, even if the variable's
+value contains spaces.
.SH KEYWORDS
backslash, command, comment, script, substitution, variable
'\" Local Variables: