diff options
author | Fred Drake <fdrake@acm.org> | 2001-09-04 19:10:20 (GMT) |
---|---|---|
committer | Fred Drake <fdrake@acm.org> | 2001-09-04 19:10:20 (GMT) |
commit | b8f22749853cf79bfbe3709309e67d1a448f4cab (patch) | |
tree | 3c929ad149f67a924b63719913b3620b62e81388 /Lib/sre.py | |
parent | 6c0f20088f81d9d84124c265d58a2a8e7d47b059 (diff) | |
download | cpython-b8f22749853cf79bfbe3709309e67d1a448f4cab.zip cpython-b8f22749853cf79bfbe3709309e67d1a448f4cab.tar.gz cpython-b8f22749853cf79bfbe3709309e67d1a448f4cab.tar.bz2 |
Added docstrings by Neal Norwitz. This closes SF bug #450980.
Diffstat (limited to 'Lib/sre.py')
-rw-r--r-- | Lib/sre.py | 81 |
1 files changed, 81 insertions, 0 deletions
@@ -14,6 +14,87 @@ # other compatibility work. # +"""Support for regular expressions (RE). + +This module provides regular expression matching operations similar to +those found in Perl. It's 8-bit clean: the strings being processed may +contain both null bytes and characters whose high bit is set. Regular +expression pattern strings may not contain null bytes, but can specify +the null byte using the \\number notation. Characters with the high +bit set may be included. + +Regular expressions can contain both special and ordinary +characters. Most ordinary characters, like "A", "a", or "0", are the +simplest regular expressions; they simply match themselves. You can +concatenate ordinary characters, so last matches the string 'last'. + +The special characters are: + "." Matches any character except a newline. + "^" Matches the start of the string. + "$" Matches the end of the string. + "*" Matches 0 or more (greedy) repetitions of the preceding RE. + Greedy means that it will match as many repetitions as possible. + "+" Matches 1 or more (greedy) repetitions of the preceding RE. + "?" Matches 0 or 1 (greedy) of the preceding RE. + *?,+?,?? Non-greedy versions of the previous three special characters. + {m,n} Matches from m to n repetitions of the preceding RE. + {m,n}? Non-greedy version of the above. + "\\" Either escapes special characters or signals a special sequence. + [] Indicates a set of characters. + A "^" as the first character indicates a complementing set. + "|" A|B, creates an RE that will match either A or B. + (...) Matches the RE inside the parentheses. + The contents can be retrieved or matched later in the string. + (?iLmsx) Set the I, L, M, S, or X flag for the RE. + (?:...) Non-grouping version of regular parentheses. + (?P<name>...) The substring matched by the group is accessible by name. + (?P=name) Matches the text matched earlier by the group named name. + (?#...) A comment; ignored. + (?=...) Matches if ... matches next, but doesn't consume the string. + (?!...) Matches if ... doesn't match next. + +The special sequences consist of "\\" and a character from the list +below. If the ordinary character is not on the list, then the +resulting RE will match the second character. + \\number Matches the contents of the group of the same number. + \\A Matches only at the start of the string. + \\Z Matches only at the end of the string. + \\b Matches the empty string, but only at the start or end of a word. + \\B Matches the empty string, but not at the start or end of a word. + \\d Matches any decimal digit; equivalent to the set [0-9]. + \\D Matches any non-digit character; equivalent to the set [^0-9]. + \\s Matches any whitespace character; equivalent to [ \\t\\n\\r\\f\\v]. + \\S Matches any non-whitespace character; equiv. to [^ \\t\\n\\r\\f\\v]. + \\w Matches any alphanumeric character; equivalent to [a-zA-Z0-9_]. + With LOCALE, it will match the set [0-9_] plus characters defined + as letters for the current locale. + \\W Matches the complement of \\w. + \\\\ Matches a literal backslash. + +This module exports the following functions: + match Match a regular expression pattern to the beginning of a string. + search Search a string for the presence of a pattern. + sub Substitute occurrences of a pattern found in a string. + subn Same as sub, but also return the number of substitutions made. + split Split a string by the occurrences of a pattern. + findall Find all occurrences of a pattern in a string. + compile Compile a pattern into a RegexObject. + purge Clear the regular expression cache. + template Compile a template pattern, returning a pattern object. + escape Backslash all non-alphanumerics in a string. + +Some of the functions in this module takes flags as optional parameters: + I IGNORECASE Perform case-insensitive matching. + L LOCALE Make \w, \W, \b, \B, dependent on the current locale. + M MULTILINE "^" matches the beginning of lines as well as the string. + "$" matches the end of lines as well as the string. + S DOTALL "." matches any character at all, including the newline. + X VERBOSE Ignore whitespace and comments for nicer looking RE's. + U UNICODE Use unicode locale. + +This module also defines an exception 'error'. + +""" import sre_compile import sre_parse |