Diffstat (limited to 'Doc/howto/regex.rst')
-rw-r--r-- | Doc/howto/regex.rst | 34 |
1 files changed, 8 insertions, 26 deletions
diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst
index 783bec1..6adecd7 100644
--- a/Doc/howto/regex.rst
+++ b/Doc/howto/regex.rst
@@ -5,11 +5,11 @@
 :Author: A.M. Kuchling
 :Release: 0.05
 
-.. % TODO:
-.. % Document lookbehind assertions
-.. % Better way of displaying a RE, a string, and what it matches
-.. % Mention optional argument to match.groups()
-.. % Unicode (at least a reference)
+.. TODO:
+   Document lookbehind assertions
+   Better way of displaying a RE, a string, and what it matches
+   Mention optional argument to match.groups()
+   Unicode (at least a reference)
 
 
 .. topic:: Abstract
@@ -91,8 +91,6 @@ is the same as ``[a-c]``, which uses a range to express the same set of
 characters. If you wanted to match only lowercase letters, your RE would be
 ``[a-z]``.
 
-.. % $
-
 Metacharacters are not active inside classes. For example, ``[akm$]`` will
 match any of the characters ``'a'``, ``'k'``, ``'m'``, or ``'$'``; ``'$'`` is
 usually a metacharacter, but inside a character class it's stripped of its
@@ -679,8 +677,8 @@ given location, they can obviously be matched an infinite number of times.
    >>> print(re.search('^From', 'Reciting From Memory'))
    None
 
-   .. % To match a literal \character{\^}, use \regexp{\e\^} or enclose it
-   .. % inside a character class, as in \regexp{[{\e}\^]}.
+   .. To match a literal \character{\^}, use \regexp{\e\^} or enclose it
+   .. inside a character class, as in \regexp{[{\e}\^]}.
 
 ``$``
    Matches at the end of a line, which is defined as either the end of the string,
@@ -696,8 +694,6 @@ given location, they can obviously be matched an infinite number of times.
    To match a literal ``'$'``, use ``\$`` or enclose it inside a character class,
    as in ``[$]``.
 
-   .. % $
-
 ``\A``
    Matches only at the start of the string. When not in :const:`MULTILINE` mode,
    ``\A`` and ``^`` are effectively the same. In :const:`MULTILINE` mode, they're
@@ -980,12 +976,8 @@ filenames where the extension is not ``bat``? Some incorrect attempts:
 that the first character of the extension is not a ``b``. This is wrong, because
 the pattern also doesn't match ``foo.bar``.
 
-.. % $
-
 ``.*[.]([^b]..|.[^a].|..[^t])$``
 
-.. % Messes up the HTML without the curly braces around \^
-
 The expression gets messier when you try to patch up the first solution by
 requiring one of the following cases to match: the first character of the
 extension isn't ``b``; the second character isn't ``a``; or the third character
@@ -1013,16 +1005,12 @@ match, the whole pattern will fail. The trailing ``$`` is required to ensure
 that something like ``sample.batch``, where the extension only starts with
 ``bat``, will be allowed.
 
-.. % $
-
 Excluding another filename extension is now easy; simply add it as an
 alternative inside the assertion. The following pattern excludes filenames that
 end in either ``bat`` or ``exe``:
 
 ``.*[.](?!bat$|exe$).*$``
 
-.. % $
-
 
 Modifying Strings
 =================
@@ -1343,16 +1331,10 @@ enables REs to be formatted more neatly::
        \s*$ # Trailing whitespace to end-of-line
    """, re.VERBOSE)
 
-This is far more readable than:
-
-.. % $
-
-::
+This is far more readable than::
 
    pat = re.compile(r"\s*(?P<header>[^:]+)\s*:(?P<value>.*?)\s*$")
 
-.. % $
-
 
 Feedback
 ========
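
The context lines of this patch quote the lookahead pattern ``.*[.](?!bat$|exe$).*$``. As a quick sanity check that the surrounding prose still describes its behaviour after the comment cleanup, here is a minimal sketch. The helper name ``is_allowed`` and the sample filenames are illustrative assumptions and are not part of the patch.

```python
import re

# Pattern quoted verbatim in the patch context: reject filenames whose
# extension is exactly "bat" or "exe".
PATTERN = re.compile(r".*[.](?!bat$|exe$).*$")


def is_allowed(filename):
    """Return True if the pattern matches, i.e. the name is not excluded.

    Hypothetical helper for illustration only.
    """
    return PATTERN.match(filename) is not None


if __name__ == "__main__":
    # Single-dot filenames, as in the HOWTO's own examples.
    for name in ["autoexec.bat", "setup.exe", "sample.batch", "foo.bar"]:
        print(name, is_allowed(name))
```

With single-dot filenames, ``sample.batch`` is accepted because the trailing ``$`` inside the assertion only rejects an extension that is exactly ``bat``, which matches the explanation in the context lines.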