author: Serhiy Storchaka <storchaka@gmail.com>, 2017-12-04 12:29:05 (GMT)
committer: GitHub <noreply@github.com>, 2017-12-04 12:29:05 (GMT)
commit: 70d56fb52582d9d3f7c00860d6e90570c6259371
tree: 61e54b78f19535bfcf41d521b98def725de63497 /Doc/library/re.rst
parent: e69fbb6a560a02d0587b9075afd338a1e9073af0
bpo-25054, bpo-1647489: Added support of splitting on zerowidth patterns. (#4471)
Also fixed searching patterns that could match an empty string.
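A minimal sketch of the splitting behaviour described in this commit message, assuming Python 3.7 or later. The first call repeats an example from the patched documentation. The second, with its lookahead pattern and sample string, is a hypothetical illustration and is not part of the patch.

```python
import re

# Example from the documentation added by this patch:
# \b matches the empty string at every word boundary.
words = re.split(r'\b', 'Words, words, words.')
print(words)  # ['', 'Words', ', ', 'words', ', ', 'words', '.']

# Hypothetical extra example: a lookahead is zero-width, so each
# separator character stays attached to the piece that follows it.
parts = re.split(r'(?=-)', 'a-b-c')
print(parts)  # ['a', '-b', '-c']
```

Because the matches are empty, no characters are consumed by the split, so joining the resulting pieces gives back the original string.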
Diffstat (limited to 'Doc/library/re.rst')
-rw-r--r-- Doc/library/re.rst 46
1 files changed, 16 insertions, 30 deletions
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index 8e6eb30..dae1d7e 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -708,37 +708,19 @@ form.
That way, separator components are always found at the same relative
indices within the result list.
- .. note::
-
- :func:`split` doesn't currently split a string on an empty pattern match.
- For example::
-
- >>> re.split('x*', 'axbc')
- ['a', 'bc']
+ The pattern can match empty strings. ::
- Even though ``'x*'`` also matches 0 'x' before 'a', between 'b' and 'c',
- and after 'c', currently these matches are ignored. The correct behavior
- (i.e. splitting on empty matches too and returning ``['', 'a', 'b', 'c',
- '']``) will be implemented in future versions of Python, but since this
- is a backward incompatible change, a :exc:`FutureWarning` will be raised
- in the meanwhile.
-
- Patterns that can only match empty strings currently never split the
- string. Since this doesn't match the expected behavior, a
- :exc:`ValueError` will be raised starting from Python 3.5::
-
- >>> re.split("^$", "foo\n\nbar\n", flags=re.M)
- Traceback (most recent call last):
- File "<stdin>", line 1, in <module>
- ...
- ValueError: split() requires a non-empty pattern match.
+ >>> re.split(r'\b', 'Words, words, words.')
+ ['', 'Words', ', ', 'words', ', ', 'words', '.']
+ >>> re.split(r'(\W*)', '...words...')
+ ['', '...', 'w', '', 'o', '', 'r', '', 'd', '', 's', '...', '']
.. versionchanged:: 3.1
Added the optional flags argument.
- .. versionchanged:: 3.5
- Splitting on a pattern that could match an empty string now raises
- a warning. Patterns that can only match empty strings are now rejected.
+ .. versionchanged:: 3.7
+ Added support of splitting on a pattern that could match an empty string.
+
.. function:: findall(pattern, string, flags=0)
@@ -746,8 +728,10 @@ form.
strings. The *string* is scanned left-to-right, and matches are returned in
the order found. If one or more groups are present in the pattern, return a
list of groups; this will be a list of tuples if the pattern has more than
- one group. Empty matches are included in the result unless they touch the
- beginning of another match.
+ one group. Empty matches are included in the result.
+
+ .. versionchanged:: 3.7
+ Non-empty matches can now start just after a previous empty match.
.. function:: finditer(pattern, string, flags=0)
@@ -755,8 +739,10 @@ form.
Return an :term:`iterator` yielding :ref:`match objects <match-objects>` over
all non-overlapping matches for the RE *pattern* in *string*. The *string*
is scanned left-to-right, and matches are returned in the order found. Empty
- matches are included in the result unless they touch the beginning of another
- match.
+ matches are included in the result.
+
+ .. versionchanged:: 3.7
+ Non-empty matches can now start just after a previous empty match.
.. function:: sub(pattern, repl, string, count=0, flags=0)
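The findall() and finditer() notes in the hunks above ("Non-empty matches can now start just after a previous empty match") may be easier to follow with a small sketch. It assumes Python 3.7 or later. The pattern and the sample string are made up for illustration and do not come from the patch.

```python
import re

text = 'ab cd'
# The first alternative (\b) matches the empty string at a word
# boundary; the second (\w+) matches a whole word.
pattern = r'\b|\w+'

found = re.findall(pattern, text)
spans = [m.span() for m in re.finditer(pattern, text)]

print(found)  # ['', 'ab', '', '', 'cd', '']
print(spans)  # [(0, 0), (0, 2), (2, 2), (3, 3), (3, 5), (5, 5)]
```

The spans show the point: the empty match at (0, 0) is followed by the non-empty match (0, 2), which starts at the same position, and likewise (3, 3) is followed by (3, 5).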