summaryrefslogtreecommitdiffstats
path: root/doc/string.n
blob: f93e5515c21eb5c5fbbed68b9feea8d0e3df9e94 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
'\"
'\" Copyright (c) 1993 The Regents of the University of California.
'\" Copyright (c) 1994-1996 Sun Microsystems, Inc.
'\"
'\" See the file "license.terms" for information on usage and redistribution
'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
'\" 
'\" RCS: @(#) $Id: string.n,v 1.6 1999/05/05 01:19:43 stanton Exp $
'\" 
.so man.macros
.TH string n 8.1 Tcl "Tcl Built-In Commands"
.BS
'\" Note:  do not modify the .SH NAME line immediately below!
.SH NAME
string \- Manipulate strings
.SH SYNOPSIS
\fBstring \fIoption arg \fR?\fIarg ...?\fR
.BE

.SH DESCRIPTION
.PP
Performs one of several string operations, depending on \fIoption\fR.
The legal \fIoption\fRs (which may be abbreviated) are:
.VS 8.1
.TP
\fBstring bytelength \fIstring\fR
Returns a decimal string giving the number of bytes used to represent
\fIstring\fR in memory.  Because UTF-8 uses one to three bytes to
represent Unicode characters, the byte length will not be the same as
the character length in general.  The cases where a script cares about
the byte length are rare.  In almost all cases, you should use the
\fBstring length\fB operation.  Refer to the \fBTcl_NumUtfChars\fR
manual entry for more details on the UTF-8 representation.
.TP
\fBstring compare \fIstring1 string2\fR ?\fIlength\fR?
.VE 8.1
Perform a character-by-character comparison of strings \fIstring1\fR and
\fIstring2\fR in the same way as the C \fBstrcmp\fR procedure.  Return
\-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically
less than, equal to, or greater than \fIstring2\fR.
.VS 8.1
If \fIlength\fR is
specified, it works like C \fBstrncmp\fR, comparing only to the
specified length.  If \fIlength\fR is negative, it is ignored.
.TP
\fBstring equal \fIstring1 string2\fR ?\fIlength\fR?
.VE 8.1
Like the \fBcompare\fR method, but returns 1 when the strings
are equal, or 0 when not.
.TP
\fBstring first \fIstring1 string2\fR
Search \fIstring2\fR for a sequence of characters that exactly match
the characters in \fIstring1\fR.  If found, return the index of the
first character in the first such match within \fIstring2\fR.  If not
found, return \-1.
.TP
\fBstring icompare \fIstring1 string2\fR ?\fIlength\fR?
Perform a case-insensitive character-by-character comparison of strings
\fIstring1\fR and
\fIstring2\fR in the same way as the C \fBstrcasecmp\fR procedure.  Return
\-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically
less than, equal to, or greater than \fIstring2\fR.  If length is
specified, it works like C \fBstrncasecmp\fR, comparing only to the
specified length.
.TP
\fBstring iequal \fIstring1 string2\fR ?\fIlength\fR?
Like the \fBicompare\fR method, but returns 1 when the strings
are equal, or 0 when not.
.TP
\fBstring index \fIstring charIndex\fR
Returns the \fIcharIndex\fR'th character of the \fIstring\fR
argument.  A \fIcharIndex\fR of 0 corresponds to the first
character of the string.  
.VS 8.1
\fIcharIndex\fR may be specified as
follows:
.RS
.IP \fB[\fInumber\fB]\fR 10
The char specified at this numerical index
.IP \fBend\fR 10
The last char of the string.
.IP \fIexpression\fR 10
A Tcl expression that returns a number.
.IP \fBend[+-]\fIexpression\fR 10
The last char of the string plus or minus the number specified
in the expression (e.g. \fBend-1\fR).
.PP
.VE 8.1
If \fIcharIndex\fR is less than 0 or greater than
or equal to the length of the string then an empty string is
returned.
.RE
.TP
\fBstring last \fIstring1 string2\fR
Search \fIstring2\fR for a sequence of characters that exactly match
the characters in \fIstring1\fR.  If found, return the index of the
first character in the last such match within \fIstring2\fR.  If there
is no match, then return \-1.
.TP
\fBstring length \fIstring\fR
Returns a decimal string giving the number of characters in
\fIstring\fR.  Note that this is not necessarily the same as the
number of bytes used to store the string.
.VS 8.1
.TP
\fBstring map ?\fIoptions\fR? \fIcharMap string\fR
Replaces characters in \fIstring\fR based on the key-value pairs in
\fIcharMap\fR.  \fIcharMap\fR is a list of key value key value ... as
in the form returned by \fBarray get\fR.  Each instance of a key in
the string will be replace with its corresponding value.  Both key and
value may be multiple characters.  This is done
in an ordered manner, so the key appearing first in the list will be
checked first, and so on.  \fIstring\fR is only iterated over once,
so earlier key replacements will have no affect for later key matches.
For example,
.RS
.CS
\fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR
.CE
will return the string \fB01321221\fR.
.RE
.VE 8.1
.TP
\fBstring match \fIpattern\fR \fIstring\fR
See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0
if it doesn't.  Matching is done in a fashion similar to that
used by the C-shell.  For the two strings to match, their contents
must be identical except that the following special sequences
may appear in \fIpattern\fR:
.RS
.IP \fB*\fR 10
Matches any sequence of characters in \fIstring\fR,
including a null string.
.IP \fB?\fR 10
Matches any single character in \fIstring\fR.
.IP \fB[\fIchars\fB]\fR 10
Matches any character in the set given by \fIchars\fR.  If a sequence
of the form
\fIx\fB\-\fIy\fR appears in \fIchars\fR, then any character
between \fIx\fR and \fIy\fR, inclusive, will match.
.IP \fB\e\fIx\fR 10
Matches the single character \fIx\fR.  This provides a way of
avoiding the special interpretation of the characters
\fB*?[]\e\fR in \fIpattern\fR.
.RE
.TP
\fBstring range \fIstring first last\fR
Returns a range of consecutive characters from \fIstring\fR, starting
with the character whose index is \fIfirst\fR and ending with the
character whose index is \fIlast\fR. An index of 0 refers to the
.VS 8.1
first character of the string.  \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.VE 8.1
If \fIfirst\fR is less than zero then it is treated as if it were zero, and
if \fIlast\fR is greater than or equal to the length of the string then
it is treated as if it were \fBend\fR.  If \fIfirst\fR is greater than
\fIlast\fR then an empty string is returned.
.VS 8.1
.TP
\fBstring repeat \fIstring count\fR
Returns \fIstring\fR repeated \fIcount\fR number of times.
.TP
\fBstring replace \fIstring last\fR ?\fIstring\fR?
Removes a range of consecutive characters from \fIstring\fR, starting
with the character whose index is \fIfirst\fR and ending with the
character whose index is \fIlast\fR.  An index of 0 refers to the
first character of the string.  \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.  If \fIstring\fR is
specified, then it is placed in the removed character range.
If \fIfirst\fR is less than zero then it is treated as if it were zero, and
if \fIlast\fR is greater than or equal to the length of the string then
it is treated as if it were \fBend\fR.  If \fIfirst\fR is greater than
\fIlast\fR or the length of the initial string, or \fIlast\fR is less
than 0, then the initial string is returned untouched.
.TP
\fBstring tolower \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that all upper (or title) case
letters have been converted to lower case.  If \fIfirst\fR is specified, it
refers to the first char index in the string to start modifying.  If
\fIlast\fR is specified, it refers to the char index in the string to stop
at (inclusive).  \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.TP
\fBstring totitle \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that the first character
in \fIstring\fR is converted to its Unicode title case variant (or upper
case if there is no title case variant) and the rest of the string is
converted to lower case.  If \fIfirst\fR is specified, it
refers to the first char index in the string to start modifying.  If
\fIlast\fR is specified, it refers to the char index in the string to stop
at (inclusive).  \fIfirst\fR and \fIlast\fR may be
specified as for the \fBindex\fR method.
.TP
\fBstring toupper \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR?
Returns a value equal to \fIstring\fR except that all lower (or title) case
letters have been converted to upper case.  If \fIfirst\fR is specified, it
refers to the first char index in the string to start modifying.  If
\fIlast\fR is specified, it refers to the char index in the string to stop
at (inclusive).  \fIfirst\fR and \fIlast\fR may be specified as for the
\fBindex\fR method.
.VE 8.1
.TP
\fBstring trim \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any leading
or trailing characters from the set given by \fIchars\fR are
removed.
If \fIchars\fR is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
.TP
\fBstring trimleft \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any
leading characters from the set given by \fIchars\fR are
removed.
If \fIchars\fR is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
.TP
\fBstring trimright \fIstring\fR ?\fIchars\fR?
Returns a value equal to \fIstring\fR except that any
trailing characters from the set given by \fIchars\fR are
removed.
If \fIchars\fR is not specified then white space is removed
(spaces, tabs, newlines, and carriage returns).
.VS 8.1
.TP
\fBstring wordend \fIstring charIndex\fR
Returns the index of the character just after the last one in the word
containing character \fIcharIndex\fR of \fIstring\fR.  \fIcharIndex\fR
may be specified as for the \fBindex\fR method.  A word is
considered to be any contiguous range of alphanumeric (Unicode letters
or decimal digits) or underscore (Unicode connector punctuation)
characters, or any single character other than these.
.TP
\fBstring wordstart \fIstring charIndex\fR
Returns the index of the first character in the word containing
character \fIcharIndex\fR of \fIstring\fR.  \fIcharIndex\fR may be
specified as for the \fBindex\fR method.  A word is considered to be any
contiguous range of alphanumeric (Unicode letters or decimal digits)
or underscore (Unicode connector punctuation) characters, or any
single character other than these.
.VE 8.1

.SH KEYWORDS
case conversion, compare, index, match, pattern, string, word