Diffstat (limited to 'Doc/howto')
-rw-r--r-- Doc/howto/regex.rst | 27
-rw-r--r-- Doc/howto/unicode.rst | 2
2 files changed, 22 insertions, 7 deletions
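
The diff below switches several patterns from cooked to raw string literals. As a rough illustration of why, the following sketch compiles two small pieces of source text and records any warnings raised. The helper name `escape_warnings` is hypothetical and not part of the change. The warning category depends on the interpreter: it is a `DeprecationWarning` in the releases this change targets and a `SyntaxWarning` from Python 3.12 on, so the sketch only collects whatever category is reported.

```python
import re
import warnings


def escape_warnings(source):
    """Compile `source` and return the categories of any warnings raised.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return [w.category for w in caught]


# Source text holding a cooked literal '\d+' and a raw literal r'\d+'.
cooked_source = "p = '\\d+'"
raw_source = "p = r'\\d+'"

cooked = escape_warnings(cooked_source)   # at least one warning category
raw = escape_warnings(raw_source)         # no warnings

print(cooked)
print(raw)

# The raw-string pattern behaves exactly as the old example did.
pattern = re.compile(r'\d+')
print(pattern.findall('12 drummers drumming, 11 pipers piping'))
```

The raw literal compiles silently, while the cooked one is flagged because `\d` is not an escape sequence that Python string literals recognize.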
diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst
index eef6347..a3a6553 100644
--- a/Doc/howto/regex.rst
+++ b/Doc/howto/regex.rst
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
disadvantage which is the topic of the next section.
+.. _the-backslash-plague:
+
The Backslash Plague
--------------------
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
while ``"\n"`` is a one-character string containing a newline. Regular
expressions will often be written in Python code using this raw string notation.
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
+-------------------+------------------+
| Regular String | Raw string |
+===================+==================+
@@ -457,12 +466,18 @@ In actual programs, the most common style is to store the
Two pattern methods return all of the matches for a pattern.
:meth:`~re.pattern.findall` returns a list of matching strings::
- >>> p = re.compile('\d+')
+ >>> p = re.compile(r'\d+')
>>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
['12', '11', '10']
-:meth:`~re.pattern.findall` has to create the entire list before it can be returned as the
-result. The :meth:`~re.pattern.finditer` method returns a sequence of
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
+:meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
+result. The :meth:`~re.Pattern.finditer` method returns a sequence of
:ref:`match object <match-objects>` instances as an :term:`iterator`::
>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
@@ -1096,11 +1111,11 @@ following calls::
The module-level function :func:`re.split` adds the RE to be used as the first
argument, but is otherwise the same. ::
- >>> re.split('[\W]+', 'Words, words, words.')
+ >>> re.split(r'[\W]+', 'Words, words, words.')
['Words', 'words', 'words', '']
- >>> re.split('([\W]+)', 'Words, words, words.')
+ >>> re.split(r'([\W]+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
- >>> re.split('[\W]+', 'Words, words, words.', 1)
+ >>> re.split(r'[\W]+', 'Words, words, words.', 1)
['Words', 'words, words.']
diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst
index 9649b9c..b54e150 100644
--- a/Doc/howto/unicode.rst
+++ b/Doc/howto/unicode.rst
@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
Arabic numerals::
import re
- p = re.compile('\d+')
+ p = re.compile(r'\d+')
s = "Over \u0e55\u0e57 57 flavours"
m = p.search(s)
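
As a quick check that the `r` prefix in the `unicode.rst` hunk does not change what the pattern matches, here is a minimal sketch built on the same example string. The `re.ASCII` comparison is an addition for contrast and is not part of the change.

```python
import re

# The raw literal and the explicitly escaped literal are the same string.
assert r'\d+' == '\\d+'

p = re.compile(r'\d+')
s = "Over \u0e55\u0e57 57 flavours"
m = p.search(s)

# For str patterns, \d matches any Unicode decimal digit, so the first
# match is the pair of Thai digits rather than the ASCII "57".
print(repr(m.group()))

# With re.ASCII, \d is limited to [0-9] and the Thai digits are skipped.
ascii_only = re.compile(r'\d+', re.ASCII)
print(ascii_only.search(s).group())
```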