| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
(GH-16959)" (GH-18767)
This reverts commit e471e72977c83664f13d041c78549140c86c92de.
The mode will be removed from Python 3.10.
|
|
|
|
|
|
| |
Open issue in the BPO indicated a desire to make the implementation of
codecs.open() at parity with io.open(), which implements a try/except to
assure file stream gets closed before an exception is raised.
|
|
|
|
| |
Trying to decode an invalid string with the punycode codec
shoud raise UnicodeError.
|
|
|
|
|
|
|
| |
creating cycles (GH-17246)
Capturing exceptions into names can lead to reference cycles though the __traceback__ attribute of the exceptions in some obscure cases that have been reported previously and fixed individually. As these variables are not used anyway, we can remove the binding to reduce the chances of creating reference cycles.
See for example GH-13135
|
|
|
|
|
| |
open(), io.open(), codecs.open() and fileinput.FileInput no longer
accept "U" ("universal newline") in the file mode. This flag was
deprecated since Python 3.3.
|
|
|
|
| |
The Rot-13 codec is for educational use but does not have unit tests,
dragging down test coverage. This adds a few very simple tests.
|
|
|
|
| |
improves decoding performance (GH-15083)
|
| |
|
|
|
|
|
|
|
| |
* The UTF-8 incremental decoders fails now fast if encounter
a sequence that can't be handled by the error handler.
* The UTF-16 incremental decoders with the surrogatepass error
handler decodes now a lone low surrogate with final=False.
|
|
|
| |
CP65001Test has been removed.
|
| |
|
|
|
|
|
|
| |
A very simple fix. I found this while writing typeshed stubs for StreamRecoder.
https://bugs.python.org/issue33482
|
| |
|
| |
|
|
|
|
| |
The bug occurred when the encoded surrogate character is passed
to the incremental decoder in two chunks.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Fix memory leak in PyUnicode_EncodeLocale() and
PyUnicode_EncodeFSDefault() on error handling.
Changes:
* Fix unicode_encode_locale() error handling
* Fix test_codecs.LocaleCodecTest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the "surrogatepass" error handler in
PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault()
for the UTF-8 encoding.
Changes:
* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the
surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use
the _Py_error_handler enum instead of "int surrogateescape" to pass
the error handler. These functions now return -3 if the error
handler is unknown.
* Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx()
in test_codecs.
* Rename get_error_handler() to _Py_GetErrorHandler() and expose it
as a private function.
* _freeze_importlib doesn't need config.filesystem_errors="strict"
workaround anymore.
|
|
|
|
|
|
|
| |
starting with "+". (GH-8741)
The UTF-7 decoder now raises UnicodeDecodeError for ill-formed
sequences starting with "+" (as specified in RFC 2152).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add -X utf8 command line option, PYTHONUTF8 environment variable
and a new sys.flags.utf8_mode flag.
* If the LC_CTYPE locale is "C" at startup: enable automatically the
UTF-8 mode.
* Add _winapi.GetACP(). encodings._alias_mbcs() now calls
_winapi.GetACP() to get the ANSI code page
* locale.getpreferredencoding() now returns 'UTF-8' in the UTF-8
mode. As a side effect, open() now uses the UTF-8 encoding by
default in this mode.
* Py_DecodeLocale() and Py_EncodeLocale() now use the UTF-8 encoding
in the UTF-8 Mode.
* Update subprocess._args_from_interpreter_flags() to handle -X utf8
* Skip some tests relying on the current locale if the UTF-8 mode is
enabled.
* Add test_utf8mode.py.
* _Py_DecodeUTF8_surrogateescape() gets a new optional parameter to
return also the length (number of wide characters).
* pymain_get_global_config() and pymain_set_global_config() now
always copy flag values, rather than only copying if the new value
is greater than the old value.
|
|
|
|
|
| |
characters/bytes for non-negative n. This makes it compatible with
read() methods of other file-like objects.
|
|
|
|
| |
and in codecs.escape_decode() when decode an escaped non-ascii byte.
|
|\ |
|
| |
| |
| |
| | |
an empty bytestring is passed
|
| |
| |
| |
| | |
Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
|
| |
| |
| |
| |
| |
| |
| | |
And most of the tools.
Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and
Martin Panter.
|
| |
| |
| |
| | |
lookup
|
| | |
|
|\ \
| |/ |
|
| | |
|
|\ \
| |/ |
|
| |
| |
| |
| | |
This affects documentation, code comments, and a debugging messages.
|
|\ \
| |/ |
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | | |
This changes the main documentation, doc strings, source code comments, and a
couple error messages in the test suite. In some cases the word was removed
or edited some other way to fix the grammar.
|
| | |
| | |
| | |
| | |
| | |
| | | |
Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
Add also unit tests for non-BMP characters.
|
| | |
| | |
| | |
| | | |
handlers: ``ignore``, ``replace`` and ``surrogateescape``.
|
|\ \ \
| |/ /
| | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
|
| |\ \
| | |/
| | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
|
| | |
| | |
| | |
| | |
| | | |
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
|
| | |
| | |
| | |
| | |
| | | |
handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
Patch co-written with Serhiy Storchaka.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape``
error handler: the encoders are now up to 3 times as fast.
Initial patch written by Serhiy Storchaka.
|
|/ /
| |
| |
| |
| |
| |
| |
| | |
ignore and replace. Initial patch written by Naoki Inada.
The decoder is now up to 60 times as fast for these error handlers.
Add also unit tests for the ASCII decoder.
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| | |
This changes the equivalent functions listed for the Base-64, hex and Quoted-
Printable codecs to reflect the functions actually used. Also mention and
test the "quotetabs" setting for Quoted-Printable encoding.
|
| | |
|
| | |
|