author Fred Drake <fdrake@acm.org> 2001-09-04 19:10:20 (GMT)
committer Fred Drake <fdrake@acm.org> 2001-09-04 19:10:20 (GMT)
commit b8f22749853cf79bfbe3709309e67d1a448f4cab
tree 3c929ad149f67a924b63719913b3620b62e81388
parent 6c0f20088f81d9d84124c265d58a2a8e7d47b059
Added docstrings by Neal Norwitz. This closes SF bug #450980.
Diffstat (limited to 'Lib/sre.py')
-rw-r--r-- Lib/sre.py 81
1 file changed, 81 insertions, 0 deletions
diff --git a/Lib/sre.py b/Lib/sre.py
index 7b79f43..addd106 100644
--- a/Lib/sre.py
+++ b/Lib/sre.py
@@ -14,6 +14,87 @@
# other compatibility work.
#
+"""Support for regular expressions (RE).
+
+This module provides regular expression matching operations similar to
+those found in Perl. It's 8-bit clean: the strings being processed may
+contain both null bytes and characters whose high bit is set. Regular
+expression pattern strings may not contain null bytes, but can specify
+the null byte using the \\number notation. Characters with the high
+bit set may be included.
+
+Regular expressions can contain both special and ordinary
+characters. Most ordinary characters, like "A", "a", or "0", are the
+simplest regular expressions; they simply match themselves. You can
+concatenate ordinary characters, so last matches the string 'last'.
+
+The special characters are:
+ "." Matches any character except a newline.
+ "^" Matches the start of the string.
+ "$" Matches the end of the string.
+ "*" Matches 0 or more (greedy) repetitions of the preceding RE.
+ Greedy means that it will match as many repetitions as possible.
+ "+" Matches 1 or more (greedy) repetitions of the preceding RE.
+ "?" Matches 0 or 1 (greedy) of the preceding RE.
+ *?,+?,?? Non-greedy versions of the previous three special characters.
+ {m,n} Matches from m to n repetitions of the preceding RE.
+ {m,n}? Non-greedy version of the above.
+ "\\" Either escapes special characters or signals a special sequence.
+ [] Indicates a set of characters.
+ A "^" as the first character indicates a complementing set.
+ "|" A|B, creates an RE that will match either A or B.
+ (...) Matches the RE inside the parentheses.
+ The contents can be retrieved or matched later in the string.
+ (?iLmsx) Set the I, L, M, S, or X flag for the RE.
+ (?:...) Non-grouping version of regular parentheses.
+ (?P<name>...) The substring matched by the group is accessible by name.
+ (?P=name) Matches the text matched earlier by the group named name.
+ (?#...) A comment; ignored.
+ (?=...) Matches if ... matches next, but doesn't consume the string.
+ (?!...) Matches if ... doesn't match next.
+
+The special sequences consist of "\\" and a character from the list
+below. If the ordinary character is not on the list, then the
+resulting RE will match the second character.
+ \\number Matches the contents of the group of the same number.
+ \\A Matches only at the start of the string.
+ \\Z Matches only at the end of the string.
+ \\b Matches the empty string, but only at the start or end of a word.
+ \\B Matches the empty string, but not at the start or end of a word.
+ \\d Matches any decimal digit; equivalent to the set [0-9].
+ \\D Matches any non-digit character; equivalent to the set [^0-9].
+ \\s Matches any whitespace character; equivalent to [ \\t\\n\\r\\f\\v].
+ \\S Matches any non-whitespace character; equiv. to [^ \\t\\n\\r\\f\\v].
+ \\w Matches any alphanumeric character; equivalent to [a-zA-Z0-9_].
+ With LOCALE, it will match the set [0-9_] plus characters defined
+ as letters for the current locale.
+ \\W Matches the complement of \\w.
+ \\\\ Matches a literal backslash.
+
+This module exports the following functions:
+ match Match a regular expression pattern to the beginning of a string.
+ search Search a string for the presence of a pattern.
+ sub Substitute occurrences of a pattern found in a string.
+ subn Same as sub, but also return the number of substitutions made.
+ split Split a string by the occurrences of a pattern.
+ findall Find all occurrences of a pattern in a string.
+ compile Compile a pattern into a RegexObject.
+ purge Clear the regular expression cache.
+ template Compile a template pattern, returning a pattern object.
+ escape Backslash all non-alphanumerics in a string.
+
+Some of the functions in this module take flags as optional parameters:
+ I IGNORECASE Perform case-insensitive matching.
+ L LOCALE Make \\w, \\W, \\b, \\B dependent on the current locale.
+ M MULTILINE "^" matches the beginning of lines as well as the string.
+ "$" matches the end of lines as well as the string.
+ S DOTALL "." matches any character at all, including the newline.
+ X VERBOSE Ignore whitespace and comments for nicer looking REs.
+ U UNICODE Make \\w, \\W, \\b, \\B dependent on the Unicode character properties.
+
+This module also defines an exception 'error'.
+
+"""
import sre_compile
import sre_parse
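
The docstring added above describes syntax and functions without showing them in use. The following is a minimal sketch of a few of the documented features, not part of the commit. It assumes the standard `re` module, which exposes the functions listed in this docstring, rather than importing `sre` directly; the sample strings and variable names are made up for illustration.

```python
import re

# "*" is greedy and "*?" is its non-greedy counterpart.
greedy = re.match(r"<.*>", "<a><b>").group(0)
lazy = re.match(r"<.*?>", "<a><b>").group(0)

# (?P<name>...) names a group; (?P=name) matches the same text again.
m = re.search(r"(?P<word>\w+) (?P=word)", "say hello hello again")
repeated = m.group("word")

# (?=...) is a lookahead: it must match next but consumes nothing.
before_bang = re.findall(r"\w+(?=!)", "stop! go wait!")

# MULTILINE lets "^" match at the start of every line.
line_starts = re.findall(r"^\w+", "one\ntwo\nthree", re.M)

# (?i) sets the IGNORECASE flag from inside the pattern.
ignore_case = re.match(r"(?i)python", "PYTHON") is not None

# subn is sub plus the number of substitutions made.
replaced, count = re.subn(r"\d+", "#", "a1b22c333")

# split breaks a string at each occurrence of the pattern.
parts = re.split(r"\s*,\s*", "a , b,c")

# escape backslashes non-alphanumerics so they match literally.
escaped = re.escape("a.b")

print(greedy, lazy, repeated, before_bang, line_starts)
print(ignore_case, replaced, count, parts, escaped)
```

Patterns are written as raw strings here so that backslashes reach the regular expression engine unchanged, which is the same concern that makes the docstring double its backslashes.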