author | William Joye <wjoye@cfa.harvard.edu> | 2016-12-21 22:56:22 (GMT) |
---|---|---|
committer | William Joye <wjoye@cfa.harvard.edu> | 2016-12-21 22:56:22 (GMT) |
commit | d1a6de55efc90f190dee42ab8c4fa9070834e77d (patch) | |
tree | ec633f5608ef498bee52a5f42c12c49493ec8bf8 /tcl8.6/doc/regexp.n | |
parent | 5514e37335c012cc70f5b9aee3cedfe3d57f583f (diff) | |
parent | 98acd3f494b28ddd8c345a2bb9311e41e2d56ddd (diff) | |
Merge commit '98acd3f494b28ddd8c345a2bb9311e41e2d56ddd' as 'tcl8.6'
Diffstat (limited to 'tcl8.6/doc/regexp.n')
-rw-r--r-- | tcl8.6/doc/regexp.n | 208 |
1 files changed, 208 insertions, 0 deletions
diff --git a/tcl8.6/doc/regexp.n b/tcl8.6/doc/regexp.n
new file mode 100644
index 0000000..6f303a4
--- /dev/null
+++ b/tcl8.6/doc/regexp.n
@@ -0,0 +1,208 @@
+'\"
+'\" Copyright (c) 1998 Sun Microsystems, Inc.
+'\"
+'\" See the file "license.terms" for information on usage and redistribution
+'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
+'\"
+.TH regexp n 8.3 Tcl "Tcl Built-In Commands"
+.so man.macros
+.BS
+'\" Note: do not modify the .SH NAME line immediately below!
+.SH NAME
+regexp \- Match a regular expression against a string
+.SH SYNOPSIS
+\fBregexp \fR?\fIswitches\fR? \fIexp string \fR?\fImatchVar\fR? ?\fIsubMatchVar subMatchVar ...\fR?
+.BE
+.SH DESCRIPTION
+.PP
+Determines whether the regular expression \fIexp\fR matches part or
+all of \fIstring\fR and returns 1 if it does, 0 if it does not, unless
+\fB\-inline\fR is specified (see below).
+(Regular expression matching is described in the \fBre_syntax\fR
+reference page.)
+.PP
+If additional arguments are specified after \fIstring\fR then they
+are treated as the names of variables in which to return
+information about which part(s) of \fIstring\fR matched \fIexp\fR.
+\fIMatchVar\fR will be set to the range of \fIstring\fR that
+matched all of \fIexp\fR. The first \fIsubMatchVar\fR will contain
+the characters in \fIstring\fR that matched the leftmost parenthesized
+subexpression within \fIexp\fR, the next \fIsubMatchVar\fR will
+contain the characters that matched the next parenthesized
+subexpression to the right in \fIexp\fR, and so on.
+.PP
+If the initial arguments to \fBregexp\fR start with \fB\-\fR then
+they are treated as switches. The following switches are
+currently supported:
+.TP 15
+\fB\-about\fR
+.
+Instead of attempting to match the regular expression, returns a list
+containing information about the regular expression. The first
+element of the list is a subexpression count. The second element is a
+list of property names that describe various attributes of the regular
+expression. This switch is primarily intended for debugging purposes.
+.TP 15
+\fB\-expanded\fR
+.
+Enables use of the expanded regular expression syntax where
+whitespace and comments are ignored. This is the same as specifying
+the \fB(?x)\fR embedded option (see the \fBre_syntax\fR manual page).
+.TP 15
+\fB\-indices\fR
+.
+Changes what is stored in the \fImatchVar\fR and \fIsubMatchVar\fRs.
+Instead of storing the matching characters from \fIstring\fR,
+each variable
+will contain a list of two decimal strings giving the indices
+in \fIstring\fR of the first and last characters in the matching
+range of characters.
+.TP 15
+\fB\-line\fR
+.
+Enables newline-sensitive matching. By default, newline is a
+completely ordinary character with no special meaning. With this
+flag,
+.QW [^
+bracket expressions and
+.QW .
+never match newline,
+.QW ^
+matches an empty string after any newline in addition to its normal
+function, and
+.QW $
+matches an empty string before any newline in
+addition to its normal function. This flag is equivalent to
+specifying both \fB\-linestop\fR and \fB\-lineanchor\fR, or the
+\fB(?n)\fR embedded option (see the \fBre_syntax\fR manual page).
+.TP 15
+\fB\-linestop\fR
+.
+Changes the behavior of
+.QW [^
+bracket expressions and
+.QW .
+so that they
+stop at newlines. This is the same as specifying the \fB(?p)\fR
+embedded option (see the \fBre_syntax\fR manual page).
+.TP 15
+\fB\-lineanchor\fR
+.
+Changes the behavior of
+.QW ^
+and
+.QW $
+(the
+.QW anchors )
+so they match the
+beginning and end of a line respectively. This is the same as
+specifying the \fB(?w)\fR embedded option (see the \fBre_syntax\fR
+manual page).
+.TP 15
+\fB\-nocase\fR
+.
+Causes upper-case characters in \fIstring\fR to be treated as
+lower case during the matching process.
+.TP 15
+\fB\-all\fR
+.
+Causes the regular expression to be matched as many times as possible
+in the string, returning the total number of matches found. If this
+is specified with match variables, they will contain information for
+the last match only.
+.TP 15
+\fB\-inline\fR
+.
+Causes the command to return, as a list, the data that would otherwise
+be placed in match variables. When using \fB\-inline\fR,
+match variables may not be specified. If used with \fB\-all\fR, the
+list will be concatenated at each iteration, such that a flat list is
+always returned. For each match iteration, the command will append the
+overall match data, plus one element for each subexpression in the
+regular expression. Examples are:
+.RS
+.PP
+.CS
+\fBregexp\fR -inline -- {\ew(\ew)} " inlined "
+ \fI\(-> in n\fR
+\fBregexp\fR -all -inline -- {\ew(\ew)} " inlined "
+ \fI\(-> in n li i ne e\fR
+.CE
+.RE
+.TP 15
+\fB\-start\fR \fIindex\fR
+.
+Specifies a character index offset into the string to start
+matching the regular expression at.
+The \fIindex\fR value is interpreted in the same manner
+as the \fIindex\fR argument to \fBstring index\fR.
+When using this switch,
+.QW ^
+will not match the beginning of the line, and \eA will still
+match the start of the string at \fIindex\fR. If \fB\-indices\fR
+is specified, the indices will be indexed starting from the
+absolute beginning of the input string.
+\fIindex\fR will be constrained to the bounds of the input string.
+.TP 15
+\fB\-\|\-\fR
+.
+Marks the end of switches. The argument following this one will
+be treated as \fIexp\fR even if it starts with a \fB\-\fR.
+.PP
+If there are more \fIsubMatchVar\fRs than parenthesized
+subexpressions within \fIexp\fR, or if a particular subexpression
+in \fIexp\fR does not match the string (e.g. because it was in a
+portion of the expression that was not matched), then the corresponding
+\fIsubMatchVar\fR will be set to
+.QW "\fB\-1 \-1\fR"
+if \fB\-indices\fR has been specified or to an empty string otherwise.
+.SH EXAMPLES
+.PP
+Find the first occurrence of a word starting with \fBfoo\fR in a
+string that is not actually an instance of \fBfoobar\fR, and get the
+letters following it up to the end of the word into a variable:
+.PP
+.CS
+\fBregexp\fR {\emfoo(?!bar\eM)(\ew*)} $string \-> restOfWord
+.CE
+.PP
+Note that the whole matched substring has been placed in the variable
+.QW \fB\->\fR ,
+which is a name chosen to look nice given that we are not
+actually interested in its contents.
+.PP
+Find the index of the word \fBbadger\fR (in any case) within a string
+and store that in the variable \fBlocation\fR:
+.PP
+.CS
+\fBregexp\fR \-indices {(?i)\embadger\eM} $string location
+.CE
+.PP
+This could also be written as a \fIbasic\fR regular expression (as opposed
+to using the default syntax of \fIadvanced\fR regular expressions) match by
+prefixing the expression with a suitable flag:
+.PP
+.CS
+\fBregexp\fR \-indices {(?ib)\e<badger\e>} $string location
+.CE
+.PP
+This counts the number of octal digits in a string:
+.PP
+.CS
+\fBregexp\fR \-all {[0\-7]} $string
+.CE
+.PP
+This lists all words (consisting of all sequences of non-whitespace
+characters) in a string, and is useful as a more powerful version of the
+\fBsplit\fR command:
+.PP
+.CS
+\fBregexp\fR \-all \-inline {\eS+} $string
+.CE
+.SH "SEE ALSO"
+re_syntax(n), regsub(n), string(n)
+.SH KEYWORDS
+match, parsing, pattern, regular expression, splitting, string
+'\" Local Variables:
+'\" mode: nroff
+'\" End: