author Ezio Melotti <ezio.melotti@gmail.com> 2012-02-29 09:48:44 (GMT)
committer Ezio Melotti <ezio.melotti@gmail.com> 2012-02-29 09:48:44 (GMT)
commit 5a045b9f5493b12bc8421b55ffff10b6572bc22c
tree a9e2e7a32c005be32b1a9b4af4e8f31d4a9254ee /Doc
parent 62417a035484c38fd8b674f58e193c9eb40bea79
#10713: Improve documentation for \b and \B and add a few tests. Initial patch and tests by Martin Pool.
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/re.rst 22
1 files changed, 14 insertions, 8 deletions
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index b196a28..ac07cf8 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -330,16 +330,22 @@ the second character. For example, ``\$`` matches the character ``'$'``.
Matches the empty string, but only at the beginning or end of a word.
A word is defined as a sequence of Unicode alphanumeric or underscore
characters, so the end of a word is indicated by whitespace or a
- non-alphanumeric, non-underscore Unicode character. Note that
- formally, ``\b`` is defined as the boundary between a ``\w`` and a
- ``\W`` character (or vice versa). By default Unicode alphanumerics
- are the ones used, but this can be changed by using the :const:`ASCII`
- flag. Inside a character range, ``\b`` represents the backspace
- character, for compatibility with Python's string literals.
+ non-alphanumeric, non-underscore Unicode character. Note that formally,
+ ``\b`` is defined as the boundary between a ``\w`` and a ``\W`` character
+ (or vice versa), or between ``\w`` and the beginning/end of the string.
+ This means that ``r'\bfoo\b'`` matches ``'foo'``, ``'foo.'``, ``'(foo)'``,
+ ``'bar foo baz'`` but not ``'foobar'`` or ``'foo3'``.
+
+ By default Unicode alphanumerics are the ones used, but this can be changed
+ by using the :const:`ASCII` flag. Inside a character range, ``\b``
+ represents the backspace character, for compatibility with Python's string
+ literals.
``\B``
- Matches the empty string, but only when it is *not* at the beginning or end of a
- word. This is just the opposite of ``\b``, so word characters are
+ Matches the empty string, but only when it is *not* at the beginning or end
+ of a word. This means that ``r'py\B'`` matches ``'python'``, ``'py3'``,
+ ``'py2'``, but not ``'py'``, ``'py.'``, or ``'py!'``.
+ ``\B`` is just the opposite of ``\b``, so word characters are
Unicode alphanumerics or the underscore, although this can be changed
by using the :const:`ASCII` flag.
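
The examples added by this patch can be checked directly against the `re` module. The following is a minimal sketch, not part of the commit: the helper name `found` and the sample strings `'foo\u00e9'` and `'a\x08c'` are illustrative assumptions, chosen to exercise the `ASCII` flag and the backspace meaning of `\b` inside a character range.

```python
import re


def found(pattern, text, flags=0):
    """Return True if `pattern` matches anywhere in `text` (hypothetical helper)."""
    return re.search(pattern, text, flags) is not None


# \b: 'foo' must be delimited by non-word characters or the string edges.
boundary_samples = ['foo', 'foo.', '(foo)', 'bar foo baz', 'foobar', 'foo3']
boundary_hits = [s for s in boundary_samples if found(r'\bfoo\b', s)]

# \B: 'py' must be followed by another word character.
non_boundary_samples = ['python', 'py3', 'py2', 'py', 'py.', 'py!']
non_boundary_hits = [s for s in non_boundary_samples if found(r'py\B', s)]

# By default U+00E9 counts as a word character, so there is no boundary
# after 'foo'; with re.ASCII it is a non-word character and \b matches.
unicode_hit = found(r'\bfoo\b', 'foo\u00e9')
ascii_hit = found(r'\bfoo\b', 'foo\u00e9', re.ASCII)

# Inside a character range, \b is the backspace character (U+0008).
backspace_hit = found(r'[\b]', 'a\x08c')

print(boundary_hits)
print(non_boundary_hits)
print(unicode_hit, ascii_hit, backspace_hit)
```

Raw string literals (`r'...'`) matter here: in an ordinary string literal, `'\b'` is already a backspace character before the pattern reaches the regex engine.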