Diffstat (limited to 'doc')
-rw-r--r-- | doc/StrMatch.3 | 22 |
-rw-r--r-- | doc/string.n | 159 |
2 files changed, 127 insertions, 54 deletions
diff --git a/doc/StrMatch.3 b/doc/StrMatch.3
index 09cd6df..4d13379 100644
--- a/doc/StrMatch.3
+++ b/doc/StrMatch.3
@@ -5,25 +5,34 @@
 '\" See the file "license.terms" for information on usage and redistribution
 '\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
 '\"
-'\" RCS: @(#) $Id: StrMatch.3,v 1.2 1998/09/14 18:39:50 stanton Exp $
+'\" RCS: @(#) $Id: StrMatch.3,v 1.3 1999/05/22 01:20:11 stanton Exp $
 '\"
 .so man.macros
-.TH Tcl_StringMatch 3 "" Tcl "Tcl Library Procedures"
+.TH Tcl_StringMatch 3 8.1 Tcl "Tcl Library Procedures"
 .BS
 .SH NAME
-Tcl_StringMatch \- test whether a string matches a pattern
+Tcl_StringMatch, Tcl_StringCaseMatch \- test whether a string matches a pattern
 .SH SYNOPSIS
 .nf
 \fB#include <tcl.h>\fR
 .sp
 int
 \fBTcl_StringMatch\fR(\fIstring\fR, \fIpattern\fR)
+.VS 8.1
+.sp
+\fBTcl_StringCaseMatch\fR(\fIstring, pattern, nocase\fR)
+.VE 8.1
 .SH ARGUMENTS
 .AP char *string in
 String to test.
 .AP char *pattern in
 Pattern to match against string. May contain special
 characters from the set *?\e[].
+.VS 8.1
+.AP int nocase in
+Specifies whether the match should be done case-sensitive (0) or
+case-insensitive (1).
+.VE 8.1
 .BE
 
 .SH DESCRIPTION
@@ -34,6 +43,13 @@ a given pattern. If it does, then \fBTcl_StringMatch\fR returns
 used for matching is the same algorithm used in the ``string match''
 Tcl command and is similar to the algorithm used by the C-shell for
 file name matching; see the Tcl manual entry for details.
+.VS 8.1
+.PP
+In \fBTcl_StringCaseMatch\fR, the algorithm is the same, but you have
+the option to make the matching case-insensitive. If you choose this
+(by passing \fBnocase\fR as 1), then the string and pattern are
+essentially matched in the lower case.
+.VE 8.1
 
 .SH KEYWORDS
 match, pattern, string
diff --git a/doc/string.n b/doc/string.n
index 07ab0a7..6e61943 100644
--- a/doc/string.n
+++ b/doc/string.n
@@ -5,7 +5,7 @@
 '\" See the file "license.terms" for information on usage and redistribution
 '\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
 '\"
-'\" RCS: @(#) $Id: string.n,v 1.9 1999/05/06 22:50:02 stanton Exp $
+'\" RCS: @(#) $Id: string.n,v 1.10 1999/05/22 01:20:11 stanton Exp $
 '\"
 .so man.macros
 .TH string n 8.1 Tcl "Tcl Built-In Commands"
@@ -25,35 +25,55 @@ The legal \fIoption\fRs (which may be abbreviated) are:
 .TP
 \fBstring bytelength \fIstring\fR
 Returns a decimal string giving the number of bytes used to represent
-\fIstring\fR in memory. Because UTF-8 uses one to three bytes to
+\fIstring\fR in memory. Because UTF\-8 uses one to three bytes to
 represent Unicode characters, the byte length will not be the same as
 the character length in general. The cases where a script cares about
 the byte length are rare. In almost all cases, you should use the
-\fBstring length\fB operation. Refer to the \fBTcl_NumUtfChars\fR
-manual entry for more details on the UTF-8 representation.
+\fBstring length\fR operation. Refer to the \fBTcl_NumUtfChars\fR
+manual entry for more details on the UTF\-8 representation.
 .TP
-\fBstring compare ?\fB-nocase\fR? ?\fB-length int\fR? \fIstring1 string2\fR
+\fBstring compare ?\fB\-nocase\fR? ?\fB\-length int\fR? \fIstring1 string2\fR
 .VE 8.1
 Perform a character-by-character comparison of strings \fIstring1\fR and
-\fIstring2\fR in the same way as the C \fBstrcmp\fR procedure. Return
+\fIstring2\fR. Returns
 \-1, 0, or 1, depending on whether \fIstring1\fR is lexicographically
 less than, equal to, or greater than \fIstring2\fR.
 .VS 8.1
-If \fB-length\fR is specified, it works like C \fBstrncmp\fR,
-comparing only to the specified length. If \fB-length\fR is negative,
-it is ignored. If \fB-nocase\fR is specified, then the strings are
+If \fB\-length\fR is specified, then only the first \fIlength\fR characters
+are used in the comparison. If \fB\-length\fR is negative, it is
+ignored. If \fB\-nocase\fR is specified, then the strings are
 compared in a case-insensitive manner.
 .TP
-\fBstring equal ?\fB-nocase\fR? ?\fB-length int\fR? \fIstring1 string2\fR
-.VE 8.1
-Like the \fBcompare\fR method, but returns 1 when the strings
-are equal, or 0 when not.
+\fBstring equal ?\fB\-nocase\fR? ?\fB-length int\fR? \fIstring1
+string2\fR Perform a character-by-character comparison of strings
+\fIstring1\fR and \fIstring2\fR. Returns 1 if \fIstring1\fR and
+\fIstring2\fR are identical, or 0 when not. If \fB\-length\fR is
+specified, then only the first \fIlength\fR characters are used in the
+comparison. If \fB\-length\fR is negative, it is ignored. If
+\fB\-nocase\fR is specified, then the strings are compared in a
+case-insensitive manner.
 .TP
-\fBstring first \fIstring1 string2\fR
+\fBstring first \fIstring1 string2\fR ?\fIstartIndex\fR?
+.VE 8.1
 Search \fIstring2\fR for a sequence of characters that exactly match
 the characters in \fIstring1\fR. If found, return the index of the
 first character in the first such match within \fIstring2\fR. If not
 found, return \-1.
+.VS 8.1
+If \fIstartIndex\fR is specified (in any of the forms accepted by the
+\fBindex\fR method), then the search is constrained to start with the
+character in \fIstring2\fR specified by the index. For example,
+.RS
+.CS
+\fBstring first a 0a23456789abcdef 5\fR
+.CE
+will return \fB10\fR, but
+.CS
+\fBstring first a 0123456789abcdef 11\fR
+.CE
+will return \fB\-1\fR.
+.RE
+.VE 8.1
 .TP
 \fBstring index \fIstring charIndex\fR
 Returns the \fIcharIndex\fR'th character of the \fIstring\fR
@@ -67,9 +87,9 @@ follows:
 The char specified at this integral index
 .IP \fBend\fR 10
 The last char of the string.
-.IP \fBend-\fIinteger\fR 10
+.IP \fBend\-\fIinteger\fR 10
 The last char of the string minus the specified integer
-offset (e.g. \fBend-1\fR would refer to the "c" in "abcd").
+offset (e.g. \fBend\-1\fR would refer to the "c" in "abcd").
 .PP
 .VE 8.1
 If \fIcharIndex\fR is less than 0 or greater than
@@ -78,14 +98,15 @@ returned.
 .RE
 .VS 8.1
 .TP
-\fBstring is \fIclass\fR ?\fB-strict\fR? ?\fB-failindex \fIvarname\fR? \fIstring\fR
-See if \fIstring\fR is a valid form of the specified class. If
-\fB-strict\fR is specified, then an empty string returns 0, otherwise and
-empty string will return 1 on any class. If \fB-failindex\fR is specified,
-then if the function returns 0, the index in the string where the class was
-no longer valid will be stored in the variable named \fIvarname\fR. The
-\fIvarname\fR will not be set if the function returns 1. The following
-class definitions are allowed (the class name can be abbreviated):
+\fBstring is \fIclass\fR ?\fB\-strict\fR? ?\fB\-failindex \fIvarname\fR? \fIstring\fR
+Returns 1 if \fIstring\fR is a valid member of the specified character
+class, otherwise returns 0. If \fB\-strict\fR is specified, then an
+empty string returns 0, otherwise and empty string will return 1 on
+any class. If \fB\-failindex\fR is specified, then if the function
+returns 0, the index in the string where the class was no longer valid
+will be stored in the variable named \fIvarname\fR. The \fIvarname\fR
+will not be set if the function returns 1. The following character classes
+are recognized (the class name can be abbreviated):
 .RS
 .IP \fBalnum\fR 10
 Any Unicode alphabet or digit character.
@@ -93,43 +114,68 @@ Any Unicode alphabet or digit character.
 Any Unicode alphabet character.
 .IP \fBascii\fR 10
 Any character with a value less than \\u0080 (those that
-are in the 7-bit ascii range).
+are in the 7\-bit ascii range).
 .IP \fBboolean\fR 10
-Any of the forms allowed to Tcl_GetBoolean.
+Any of the forms allowed to \fBTcl_GetBoolean\fR.
+.IP \fBcontrol\fR 10
+Any Unicode control character.
 .IP \fBdigit\fR 10
 Any Unicode digit character.
 .IP \fBdouble\fR 10
 Any of the valid forms for a double in Tcl, with optional surrounding
 whitespace. In case of under/overflow in the value, 0 is returned
-and the \fIvarname\fR will contain -1.
+and the \fIvarname\fR will contain \-1.
 .IP \fBfalse\fR 10
-Any of the forms allowed to Tcl_GetBoolean where the value is false.
+Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is false.
+.IP \fBgraph\fR 10
+Any Unicode printing character, except space.
 .IP \fBinteger\fR 10
 Any of the valid forms for an integer in Tcl, with optional surrounding
 whitespace. In case of under/overflow in the value, 0 is returned
-and the \fIvarname\fR will contain -1.
+and the \fIvarname\fR will contain \-1.
 .IP \fBlower\fR 10
 Any Unicode lower case alphabet character.
+.IP \fBprint\fR 10
+Any Unicode printing character, including space.
+.IP \fBpunct\fR 10
+Any Unicode printing character, except space or where \fBalnum\fR is true.
 .IP \fBspace\fR 10
 Any Unicode space character.
 .IP \fBtrue\fR 10
-Any of the forms allowed to Tcl_GetBoolean where the value is true.
+Any of the forms allowed to \fBTcl_GetBoolean\fR where the value is true.
 .IP \fBupper\fR 10
 Any upper case alphabet character in the Unicode character set.
 .IP \fBwordchar\fR 10
 Any Unicode word character. That is any alphanumeric character,
-and any Unicode connector punctuation characters (ie: underscore).
+and any Unicode connector punctuation characters (e.g. underscore).
+.IP \fBxdigit\fR 10
+Any hexadecimal digit character ([0\-9A\-Fa\-f]).
 .RE
 In the case of \fBboolean\fR, \fBtrue\fR and \fBfalse\fR, if the
 function will return 0, the \fIvarname\fR will always be set to 0,
 due to the varied nature of a valid boolean value.
-.VE 8.1
 .TP
-\fBstring last \fIstring1 string2\fR
+\fBstring last \fIstring1 string2\fR ?\fIstartIndex\fR?
+.VE 8.1
 Search \fIstring2\fR for a sequence of characters that exactly match
 the characters in \fIstring1\fR. If found, return the index of the
 first character in the last such match within \fIstring2\fR. If there
 is no match, then return \-1.
+.VS 8.1
+If \fIstartIndex\fR is specified (in any of the forms accepted by the
+\fBindex\fR method), then only the characters in \fIstring2\fR at or before the
+specified \fIstartIndex\fR will be considered by the search. For example,
+.RS
+.CS
+\fBstring last a 0a23456789abcdef 15\fR
+.CE
+will return \fB10\fR, but
+.CS
+\fBstring last a 0a23456789abcdef 9\fR
+.CE
+will return \fB1\fR.
+.RE
+.VE 8.1
 .TP
 \fBstring length \fIstring\fR
 Returns a decimal string giving the number of characters in
@@ -137,29 +183,33 @@ Returns a decimal string giving the number of characters in
 number of bytes used to store the string.
 .VS 8.1
 .TP
-\fBstring map ?\fB-nocase\fR? \fIcharMap string\fR
+\fBstring map\fR ?\fB\-nocase\fR? \fIcharMap string\fR
 Replaces characters in \fIstring\fR based on the key-value pairs in
-\fIcharMap\fR. \fIcharMap\fR is a list of key value key value ... as
-in the form returned by \fBarray get\fR. Each instance of a key in
-the string will be replace with its corresponding value. If
-\fB-nocase\fR is specified, then matching is done without regard to
-case differences. Both key and value may be multiple characters. This
-is done in an ordered manner, so the key appearing first in the list
-will be checked first, and so on. \fIstring\fR is only iterated over
-once, so earlier key replacements will have no affect for later key
-matches. For example,
+\fIcharMap\fR. \fIcharMap\fR is a list of \fIkey value key value\fR ...
+as in the form returned by \fBarray get\fR. Each instance of a
+key in the string will be replaced with its corresponding value. If
+\fB\-nocase\fR is specified, then matching is done without regard to
+case differences. Both \fIkey\fR and \fIvalue\fR may be multiple
+characters. Replacement is done in an ordered manner, so the key appearing
+first in the list will be checked first, and so on. \fIstring\fR is
+only iterated over once, so earlier key replacements will have no
+affect for later key matches. For example,
 .RS
 .CS
 \fBstring map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc\fR
 .CE
 will return the string \fB01321221\fR.
 .RE
-.VE 8.1
 .TP
-\fBstring match \fIpattern\fR \fIstring\fR
+\fBstring match ?\fB\-nocase\fR? \fIpattern\fR \fIstring\fR
+.VE 8.1
 See if \fIpattern\fR matches \fIstring\fR; return 1 if it does, 0
-if it doesn't. Matching is done in a fashion similar to that
-used by the C-shell. For the two strings to match, their contents
+if it doesn't.
+.VS 8.1
+If \fB\-nocase\fR is specified, then the pattern attempts to match
+against the string in a case insensitive manner.
+.VE 8.1
+For the two strings to match, their contents
 must be identical except that the following special sequences
 may appear in \fIpattern\fR:
 .RS
@@ -173,6 +223,13 @@ Matches any character in the set given by \fIchars\fR. If a sequence
 of the form
 \fIx\fB\-\fIy\fR appears in \fIchars\fR, then any character
 between \fIx\fR and \fIy\fR, inclusive, will match.
+.VS 8.1
+When used with \fB\-nocase\fR, the end points of the range are converted
+to lower case first. Whereas {[A\-z]} matches '_' when matching
+case-sensitively ('_' falls between the 'Z' and 'a'), with \fB\-nocase\fR
+this is considered like {[A\-Za\-z]} (and probably what was meant in the
+first place).
+.VE 8.1
 .IP \fB\e\fIx\fR 10
 Matches the single character \fIx\fR. This provides a way of
 avoiding the special interpretation of the characters
@@ -196,12 +253,12 @@ it is treated as if it were \fBend\fR. If \fIfirst\fR is greater than
 \fBstring repeat \fIstring count\fR
 Returns \fIstring\fR repeated \fIcount\fR number of times.
 .TP
-\fBstring replace \fIstring last\fR ?\fIstring\fR?
+\fBstring replace \fIstring first last\fR ?\fInewstring\fR?
 Removes a range of consecutive characters from \fIstring\fR, starting
 with the character whose index is \fIfirst\fR and ending with the
 character whose index is \fIlast\fR. An index of 0 refers to the
-first character of the string. \fIfirst\fR and \fIlast\fR may be
-specified as for the \fBindex\fR method. If \fIstring\fR is
+first character of the string. \fIFirst\fR and \fIlast\fR may be
+specified as for the \fBindex\fR method. If \fInewstring\fR is
 specified, then it is placed in the removed character range. If
 \fIfirst\fR is less than zero then it is treated as if it were zero, and
 if \fIlast\fR is greater than or equal to the length of the string then
@@ -276,4 +333,4 @@ single character other than these.
 .VE 8.1
 
 .SH KEYWORDS
-case conversion, compare, index, match, pattern, string, word
+case conversion, compare, index, match, pattern, string, word, equal, ctype
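
Note: the options documented by this change can be tried from a Tcl shell. The script below is a minimal sketch, assuming a tclsh of version 8.1 or later. The values marked "(man page)" are quoted from the string.n text in the diff. The remaining values are my reading of the documented behavior, not output captured from this revision.

```tcl
# Sketch only: assumes tclsh 8.1 or later, where these options exist.

# string first / string last with the optional startIndex argument
puts [string first a 0a23456789abcdef 5]    ;# 10 (man page)
puts [string first a 0123456789abcdef 11]   ;# -1 (man page)
puts [string last a 0a23456789abcdef 15]    ;# 10 (man page)
puts [string last a 0a23456789abcdef 9]     ;# 1 (man page)

# string map walks the string once, trying keys in list order
puts [string map {abc 1 ab 2 a 3 1 0} 1abcaababcabababc]   ;# 01321221 (man page)

# -nocase lowers the end points of a range, so A-z no longer covers the underscore
puts [string match {[A-z]} _]           ;# 1
puts [string match -nocase {[A-z]} _]   ;# 0

# string equal returns a boolean; -length limits the comparison to a prefix
puts [string equal -nocase ABC abc]         ;# 1
puts [string equal -length 3 abcde abcxy]   ;# 1

# -failindex stores the index of the first character outside the class
if {![string is xdigit -failindex idx 12G4]} {
    puts "not hex at index $idx"        ;# idx is 2 here
}
```

The two string match lines show the range case called out in the new text: matched case-sensitively, the underscore falls between Z and a, while with -nocase the range is treated like A-Za-z.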