author: Christian Heimes <christian@cheimes.de> 2007-12-31 16:14:33 (GMT)
committer: Christian Heimes <christian@cheimes.de> 2007-12-31 16:14:33 (GMT)
commit: 5b5e81c637eb115b27b4c5c66cf1cf348c706162
tree: e83d0ce68e92750e40fbb901a0659bade6f41674 /Doc/howto/regex.rst
parent: 862543aa85249b46649b60da96743b4b14c6c83b
Merged revisions 59605-59624 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r59606 | georg.brandl | 2007-12-29 11:57:00 +0100 (Sat, 29 Dec 2007) | 2 lines

  Some cleanup in the docs.
........
  r59611 | martin.v.loewis | 2007-12-29 19:49:21 +0100 (Sat, 29 Dec 2007) | 2 lines

  Bug #1699: Define _BSD_SOURCE only on OpenBSD.
........
  r59612 | raymond.hettinger | 2007-12-29 23:09:34 +0100 (Sat, 29 Dec 2007) | 1 line

  Simpler documentation for itertools.tee(). Should be backported.
........
  r59613 | raymond.hettinger | 2007-12-29 23:16:24 +0100 (Sat, 29 Dec 2007) | 1 line

  Improve docs for itertools.groupby(). The use of xrange(0) to create a unique object is less obvious than object().
........
  r59620 | christian.heimes | 2007-12-31 15:47:07 +0100 (Mon, 31 Dec 2007) | 3 lines

  Added wininst-9.0.exe executable for VS 2008
  Integrated bdist_wininst into PCBuild9 directory
........
  r59621 | christian.heimes | 2007-12-31 15:51:18 +0100 (Mon, 31 Dec 2007) | 1 line

  Moved PCbuild directory to PC/VS7.1
........
  r59622 | christian.heimes | 2007-12-31 15:59:26 +0100 (Mon, 31 Dec 2007) | 1 line

  Fix paths for build bot
........
  r59623 | christian.heimes | 2007-12-31 16:02:41 +0100 (Mon, 31 Dec 2007) | 1 line

  Fix paths for build bot, part 2
........
  r59624 | christian.heimes | 2007-12-31 16:18:55 +0100 (Mon, 31 Dec 2007) | 1 line

  Renamed PCBuild9 directory to PCBuild
........
Diffstat (limited to 'Doc/howto/regex.rst')
-rw-r--r-- Doc/howto/regex.rst 34
1 files changed, 8 insertions, 26 deletions
diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst
index 783bec1..6adecd7 100644
--- a/Doc/howto/regex.rst
+++ b/Doc/howto/regex.rst
@@ -5,11 +5,11 @@
:Author: A.M. Kuchling
:Release: 0.05
-.. % TODO:
-.. % Document lookbehind assertions
-.. % Better way of displaying a RE, a string, and what it matches
-.. % Mention optional argument to match.groups()
-.. % Unicode (at least a reference)
+.. TODO:
+ Document lookbehind assertions
+ Better way of displaying a RE, a string, and what it matches
+ Mention optional argument to match.groups()
+ Unicode (at least a reference)
.. topic:: Abstract
@@ -91,8 +91,6 @@ is the same as ``[a-c]``, which uses a range to express the same set of
characters. If you wanted to match only lowercase letters, your RE would be
``[a-z]``.
-.. % $
-
Metacharacters are not active inside classes. For example, ``[akm$]`` will
match any of the characters ``'a'``, ``'k'``, ``'m'``, or ``'$'``; ``'$'`` is
usually a metacharacter, but inside a character class it's stripped of its
@@ -679,8 +677,8 @@ given location, they can obviously be matched an infinite number of times.
>>> print(re.search('^From', 'Reciting From Memory'))
None
- .. % To match a literal \character{\^}, use \regexp{\e\^} or enclose it
- .. % inside a character class, as in \regexp{[{\e}\^]}.
+ .. To match a literal \character{\^}, use \regexp{\e\^} or enclose it
+ .. inside a character class, as in \regexp{[{\e}\^]}.
``$``
Matches at the end of a line, which is defined as either the end of the string,
@@ -696,8 +694,6 @@ given location, they can obviously be matched an infinite number of times.
To match a literal ``'$'``, use ``\$`` or enclose it inside a character class,
as in ``[$]``.
- .. % $
-
``\A``
Matches only at the start of the string. When not in :const:`MULTILINE` mode,
``\A`` and ``^`` are effectively the same. In :const:`MULTILINE` mode, they're
@@ -980,12 +976,8 @@ filenames where the extension is not ``bat``? Some incorrect attempts:
that the first character of the extension is not a ``b``. This is wrong,
because the pattern also doesn't match ``foo.bar``.
-.. % $
-
``.*[.]([^b]..|.[^a].|..[^t])$``
-.. % Messes up the HTML without the curly braces around \^
-
The expression gets messier when you try to patch up the first solution by
requiring one of the following cases to match: the first character of the
extension isn't ``b``; the second character isn't ``a``; or the third character
@@ -1013,16 +1005,12 @@ match, the whole pattern will fail. The trailing ``$`` is required to ensure
that something like ``sample.batch``, where the extension only starts with
``bat``, will be allowed.
-.. % $
-
Excluding another filename extension is now easy; simply add it as an
alternative inside the assertion. The following pattern excludes filenames that
end in either ``bat`` or ``exe``:
``.*[.](?!bat$|exe$).*$``
-.. % $
-
Modifying Strings
=================
@@ -1343,16 +1331,10 @@ enables REs to be formatted more neatly::
\s*$ # Trailing whitespace to end-of-line
""", re.VERBOSE)
-This is far more readable than:
-
-.. % $
-
-::
+This is far more readable than::
pat = re.compile(r"\s*(?P<header>[^:]+)\s*:(?P<value>.*?)\s*$")
-.. % $
-
Feedback
========