'\" '\" Copyright (c) 1993 The Regents of the University of California. '\" Copyright (c) 1994-1996 Sun Microsystems, Inc. '\" '\" See the file "license.terms" for information on usage and redistribution '\" of this file, and for a DISCLAIMER OF ALL WARRANTIES. '\" '\" RCS: @(#) $Id: string.n,v 1.7 1999/05/06 18:46:42 stanton Exp $ '\" .so man.macros .TH string n 8.1 Tcl "Tcl Built-In Commands" .BS '\" Note: do not modify the .SH NAME line immediately below! .SH NAME string \- Manipulate strings .SH SYNOPSIS \fBstring \fIoption arg \fR?\fIarg ...?\fR .BE .SH DESCRIPTION .PP Performs one of several string operations, depending on \fIoption\fR. The legal \fIoption\fRs (which may be abbreviated) are: .VS 8.1 .TP \fBstring bytelength \fIstring\fR Returns a decimal string giving the number of bytes used to represent \fIstring\fR in memory. Because UTF-8 uses one to three bytes to represent Unicode characters, the byte length will not be the same as the character length in general. The cases where a script cares about the byte length are rare. In almost all cases, you should use the \fBstring length\fB operation. Refer to the \fBTcl_NumUtfChars\fR manual entry for more details on the UTF-8 representation. .TP \fBstring compare ?\fI-nocase\fR? ?\fI-length int\fR? \fIstring1 string2\fR .VE 8.1 Perform a character-by-character comparison of strings \fIstring1\fR and \fIstring2\fR in the same way as the C \fBstrcmp\fR procedure. Return \-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically less than, equal to, or greater than \fIstring2\fR. .VS 8.1 If \fI-length\fR is specified, it works like C \fBstrncmp\fR, comparing only to the specified length. If \fI-length\fR is negative, it is ignored. If \fI-nocase\fR is specified, then the strings are compared in a case-insensitive manner. .TP \fBstring equal ?\fI-nocase\fR? ?\fI-length int\fR? \fIstring1 string2\fR .VE 8.1 Like the \fBcompare\fR method, but returns 1 when the strings are equal, or 0 when not. .TP \fBstring first \fIstring1 string2\fR Search \fIstring2\fR for a sequence of characters that exactly match the characters in \fIstring1\fR. If found, return the index of the first character in the first such match within \fIstring2\fR. If not found, return \-1. .TP \fBstring index \fIstring charIndex\fR Returns the \fIcharIndex\fR'th character of the \fIstring\fR argument. A \fIcharIndex\fR of 0 corresponds to the first character of the string. .VS 8.1 \fIcharIndex\fR may be specified as follows: .RS .IP \fB[\fIinteger\fB]\fR 10 The char specified at this integral index .IP \fBend\fR 10 The last char of the string. .IP \fIexpression\fR 10 A Tcl expression that returns a number. .IP \fBend-\fIinteger\fR 10 The last char of the string minus the specified integer offset (e.g. \fBend-1\fR). .PP .VE 8.1 If \fIcharIndex\fR is less than 0 or greater than or equal to the length of the string then an empty string is returned. .RE .VS 8.1 .TP \fBstring is \fIclass\fR ?\fI-strict\fR? ?\fI-failindex varname\fR? \fIstring\fR See if \fIstring\fR is a valid form of the specified class. If \fI-strict\fR is specified, then an empty string returns 0, otherwise and empty string will return 1 on any class. If \fI-failindex\fR is specified, then if the function returns 0, the index in the string where the class was no longer valid will be stored in the variable named \fIvarname\fR. The \fIvarname\fR will not be set if the function returns 1. The following class definitions are allowed (the class name can be abbreviated): .RS .IP \fBalnum\fR 10 Any Unicode alphabet or digit character. .IP \fBalpha\fR 10 Any Unicode alphabet character. .IP \fBascii\fR 10 Any character with a value less than \\u0080 (those that are in the 7-bit ascii range). .IP \fBboolean\fR 10 Any of the forms allowed to Tcl_GetBoolean. .IP \fBdigit\fR 10 Any Unicode digit character. .IP \fBdouble\fR 10 Any of the valid forms for a double in Tcl, with optional surrounding whitespace. In case of under/overflow in the value, 0 is returned and the \fIvarname\fR will contain -1. .IP \fBfalse\fR 10 Any of the forms allowed to Tcl_GetBoolean where the value is false. .IP \fBinteger\fR 10 Any of the valid forms for an integer in Tcl, with optional surrounding whitespace. In case of under/overflow in the value, 0 is returned and the \fIvarname\fR will contain -1. .IP \fBlower\fR 10 Any Unicode lower case alphabet character. .IP \fBspace\fR 10 Any Unicode space character. .IP \fBtrue\fR 10 Any of the forms allowed to Tcl_GetBoolean where the value is true. .IP \fBupper\fR 10 Any upper case alphabet character in the Unicode character set. .IP \fBwordchar\fR 10 Any Unicode word character. That is any alphanumeric character, and any Unicode connector punctuation characters (ie: underscore). .RE In the case of \fBboolean\fR, \fBtrue\fR and \fBfalse\fR, if the function will return 0, the \fIvarname\fR will always be set to 0, due to the varied nature of a valid boolean value. .VE 8.1 .TP \fBstring last \fIstring1 string2\fR Search \fIstring2\fR for a sequence of characters that exactly match the characters in \fIstring1\fR. If found, return the index of the first character in the last such match within \fIstring2\fR. If there is no match, then return \-1. .TP \fBstring length \fIstring\fR Returns a decimal string giving the number of characters in \fIstring\fR. Note that this is not necessarily the same as the number of bytes used to store the string. .VS 8.1 .TP \fBstring map ?\fIoptions\fR? \fIcharMap string\fR Replaces characters in \fIstring\fR based on the key-value pairs in \fIcharMap\fR. \fIcharMap\fR is a list of key value key value ... as in the form returned by \fBarray get\fR. Each instance of a key in the string will be replace with its corresponding value. Both key and value may be multiple characters. This is done in an ordered manner, so the key appearing first in the list will be checked first, and so on. \fIstring\fR is only iterated over once, so earlier key replacements will have no affect for later key matches. For example, .RS .CS \fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR .CE will return the string \fB01321221\fR. .RE .VE 8.1 .TP \fBstring match \fIpattern\fR \fIstring\fR See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0 if it doesn't. Matching is done in a fashion similar to that used by the C-shell. For the two strings to match, their contents must be identical except that the following special sequences may appear in \fIpattern\fR: .RS .IP \fB*\fR 10 Matches any sequence of characters in \fIstring\fR, including a null string. .IP \fB?\fR 10 Matches any single character in \fIstring\fR. .IP \fB[\fIchars\fB]\fR 10 Matches any character in the set given by \fIchars\fR. If a sequence of the form \fIx\fB\-\fIy\fR appears in \fIchars\fR, then any character between \fIx\fR and \fIy\fR, inclusive, will match. .IP \fB\e\fIx\fR 10 Matches the single character \fIx\fR. This provides a way of avoiding the special interpretation of the characters \fB*?[]\e\fR in \fIpattern\fR. .RE .TP \fBstring range \fIstring first last\fR Returns a range of consecutive characters from \fIstring\fR, starting with the character whose index is \fIfirst\fR and ending with the character whose index is \fIlast\fR. An index of 0 refers to the .VS 8.1 first character of the string. \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. .VE 8.1 If \fIfirst\fR is less than zero then it is treated as if it were zero, and if \fIlast\fR is greater than or equal to the length of the string then it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than \fIlast\fR then an empty string is returned. .VS 8.1 .TP \fBstring repeat \fIstring count\fR Returns \fIstring\fR repeated \fIcount\fR number of times. .TP \fBstring replace \fIstring last\fR ?\fIstring\fR? Removes a range of consecutive characters from \fIstring\fR, starting with the character whose index is \fIfirst\fR and ending with the character whose index is \fIlast\fR. An index of 0 refers to the first character of the string. \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. If \fIstring\fR is specified, then it is placed in the removed character range. If \fIfirst\fR is less than zero then it is treated as if it were zero, and if \fIlast\fR is greater than or equal to the length of the string then it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than \fIlast\fR or the length of the initial string, or \fIlast\fR is less than 0, then the initial string is returned untouched. .TP \fBstring tolower \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? Returns a value equal to \fIstring\fR except that all upper (or title) case letters have been converted to lower case. If \fIfirst\fR is specified, it refers to the first char index in the string to start modifying. If \fIlast\fR is specified, it refers to the char index in the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. .TP \fBstring totitle \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? Returns a value equal to \fIstring\fR except that the first character in \fIstring\fR is converted to its Unicode title case variant (or upper case if there is no title case variant) and the rest of the string is converted to lower case. If \fIfirst\fR is specified, it refers to the first char index in the string to start modifying. If \fIlast\fR is specified, it refers to the char index in the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. .TP \fBstring toupper \fIstring\fR ?\fIfirst\fR? ?\fIlast\fR? Returns a value equal to \fIstring\fR except that all lower (or title) case letters have been converted to upper case. If \fIfirst\fR is specified, it refers to the first char index in the string to start modifying. If \fIlast\fR is specified, it refers to the char index in the string to stop at (inclusive). \fIfirst\fR and \fIlast\fR may be specified as for the \fBindex\fR method. .VE 8.1 .TP \fBstring trim \fIstring\fR ?\fIchars\fR? Returns a value equal to \fIstring\fR except that any leading or trailing characters from the set given by \fIchars\fR are removed. If \fIchars\fR is not specified then white space is removed (spaces, tabs, newlines, and carriage returns). .TP \fBstring trimleft \fIstring\fR ?\fIchars\fR? Returns a value equal to \fIstring\fR except that any leading characters from the set given by \fIchars\fR are removed. If \fIchars\fR is not specified then white space is removed (spaces, tabs, newlines, and carriage returns). .TP \fBstring trimright \fIstring\fR ?\fIchars\fR? Returns a value equal to \fIstring\fR except that any trailing characters from the set given by \fIchars\fR are removed. If \fIchars\fR is not specified then white space is removed (spaces, tabs, newlines, and carriage returns). .VS 8.1 .TP \fBstring wordend \fIstring charIndex\fR Returns the index of the character just after the last one in the word containing character \fIcharIndex\fR of \fIstring\fR. \fIcharIndex\fR may be specified as for the \fBindex\fR method. A word is considered to be any contiguous range of alphanumeric (Unicode letters or decimal digits) or underscore (Unicode connector punctuation) characters, or any single character other than these. .TP \fBstring wordstart \fIstring charIndex\fR Returns the index of the first character in the word containing character \fIcharIndex\fR of \fIstring\fR. \fIcharIndex\fR may be specified as for the \fBindex\fR method. A word is considered to be any contiguous range of alphanumeric (Unicode letters or decimal digits) or underscore (Unicode connector punctuation) characters, or any single character other than these. .VE 8.1 .SH KEYWORDS case conversion, compare, index, match, pattern, string, word