author | Antoine Pitrou <solipsis@pitrou.net> | 2008-08-19 17:56:33 (GMT) |
---|---|---|
committer | Antoine Pitrou <solipsis@pitrou.net> | 2008-08-19 17:56:33 (GMT) |
commit | fd036451bf0e0ade8783e21df801abf7be96d020 (patch) | |
tree | e70ff65a9e641d8e790bc091f0dc2507baf344ca /Misc | |
parent | 3ad7ba10a20827b24d4b1aa9dd49474db8affbdd (diff) | |
#2834: Change re module semantics, so that str and bytes mixing is forbidden,
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
Diffstat (limited to 'Misc')
-rw-r--r-- | Misc/NEWS | 8 |
1 files changed, 8 insertions, 0 deletions
@@ -30,6 +30,14 @@ Core and Builtins
 Library
 -------
 
+- Issue #2834: update the regular expression library to match the unicode
+  standards of py3k. In other words, mixing bytes and unicode strings
+  (be it as pattern, search string or replacement string) raises a TypeError.
+  Moreover, the re.UNICODE flag is enabled automatically for unicode patterns,
+  and can be disabled by specifying a new re.ASCII flag; as for bytes
+  patterns, ASCII matching is the only option and trying to specify re.UNICODE
+  for such patterns raises a ValueError.
+
 - Issue #3300: make urllib.parse.[un]quote() default to UTF-8.
   Code contributed by Matt Giuca. quote() now encodes the input
   before quoting, unquote() decodes after unquoting. There are
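
The behaviour described in the NEWS entry above can be checked with a short script. This is a minimal sketch against the Python 3 `re` module, not code from the commit itself; the sample text and the `raises` helper are made up for illustration.

```python
import re

text = "café 42"  # illustrative sample, not taken from the commit

# str patterns get Unicode matching by default, so "é" is a word character.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_] for str patterns.
ascii_words = re.findall(r"\w+", text, re.ASCII)

# bytes patterns only ever do ASCII matching.
bytes_words = re.findall(rb"\w+", text.encode("utf-8"))


def raises(exc_type, func, *args):
    """Hypothetical helper: True if func(*args) raises exc_type."""
    try:
        func(*args)
    except exc_type:
        return True
    return False


# Mixing str and bytes between pattern and search string is a TypeError.
bytes_pattern_on_str = raises(TypeError, re.search, b"caf", text)
str_pattern_on_bytes = raises(TypeError, re.search, "caf", text.encode("utf-8"))

# Asking for re.UNICODE on a bytes pattern is a ValueError.
unicode_flag_on_bytes = raises(ValueError, re.compile, rb"\w+", re.UNICODE)

print(unicode_words, ascii_words, bytes_words)
print(bytes_pattern_on_str, str_pattern_on_bytes, unicode_flag_on_bytes)
```

Code that has to handle both types would normally decode bytes to str (or encode the pattern) at the boundary rather than catch the TypeError.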