author: Serhiy Storchaka <storchaka@gmail.com> 2017-11-16 10:38:26 (GMT)
committer: GitHub <noreply@github.com> 2017-11-16 10:38:26 (GMT)
commit: 05cb728d68a278d11466f9a6c8258d914135c96c
tree: da7fd67bdacf4239d820bcf40cad9f60cab9fb82 /Doc/library/re.rst
parent: 3daaafb700df45716bb55f3a293f88773baf3463
bpo-30349: Raise FutureWarning for nested sets and set operations (#1553)
in regular expressions.
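
The behaviour this commit introduces can be observed with a short script. This is a minimal sketch, assuming Python 3.7 or later; the helper name `set_warnings` and the sample patterns are illustrative and not part of the patch. It compiles a few character sets that fall under the ambiguous cases named in the documentation change, then the same sets with the characters escaped, and finally a set built with `re.escape`, which (per the updated doctest in this diff) now escapes `&` and `~`.

```python
import re
import warnings


def set_warnings(pattern):
    """Compile `pattern` and return the FutureWarning messages it triggers.

    Hypothetical helper for illustration only.
    """
    re.purge()  # clear the pattern cache so the pattern is parsed again
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        re.compile(pattern)
    return [str(w.message) for w in caught
            if issubclass(w.category, FutureWarning)]


# Sets starting with a literal '[' or containing '--', '&&', '~~', '||'.
ambiguous = [r"[[a]", r"[a-z--b]", r"[a&&b]", r"[a~~b]", r"[a||b]"]

# The same sets with the ambiguous characters escaped by a backslash.
escaped = [r"[\[a]", r"[a-z\-\-b]", r"[a\&\&b]", r"[a\~\~b]", r"[a\|\|b]"]

for pat in ambiguous + escaped:
    print(pat, set_warnings(pat))

# Building the set body with re.escape avoids the warning as well.
built = "[%s]" % re.escape("a&&b||c~~d")
print(built, set_warnings(built))
```

Each pattern in `ambiguous` should report one warning, while the escaped and `re.escape`-built patterns should report none. The call to `re.purge()` matters because a cached pattern is not parsed again and therefore would not warn a second time.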
Diffstat (limited to 'Doc/library/re.rst')
-rw-r--r-- Doc/library/re.rst 16
1 files changed, 15 insertions, 1 deletions
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index cbb2f43..8c15462 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -200,6 +200,20 @@ The special characters are:
place it at the beginning of the set. For example, both ``[()[\]{}]`` and
``[]()[{}]`` will both match a parenthesis.
+ * Support of nested sets and set operations as in `Unicode Technical
+ Standard #18`_ might be added in the future. This would change the
+ syntax, so to facilitate this change a :exc:`FutureWarning` will be raised
+ in ambiguous cases for the time being.
+ That includes sets starting with a literal ``'['`` or containing literal
+ character sequences ``'--'``, ``'&&'``, ``'~~'``, and ``'||'``. To
+ avoid a warning escape them with a backslash.
+
+ .. _Unicode Technical Standard #18: https://unicode.org/reports/tr18/
+
+ .. versionchanged:: 3.7
+ :exc:`FutureWarning` is raised if a character set contains constructs
+ that will change semantically in the future.
+
``|``
``A|B``, where *A* and *B* can be arbitrary REs, creates a regular expression that
will match either *A* or *B*. An arbitrary number of REs can be separated by the
@@ -829,7 +843,7 @@ form.
>>> legal_chars = string.ascii_lowercase + string.digits + "!#$%&'*+-.^_`|~:"
>>> print('[%s]+' % re.escape(legal_chars))
- [abcdefghijklmnopqrstuvwxyz0123456789!\#\$%&'\*\+\-\.\^_`\|~:]+
+ [abcdefghijklmnopqrstuvwxyz0123456789!\#\$%\&'\*\+\-\.\^_`\|\~:]+
>>> operators = ['+', '-', '*', '/', '**']
>>> print('|'.join(map(re.escape, sorted(operators, reverse=True))))