cpython.git - https://github.com/python/cpython.git

	Commit message	Author	Age	Files	Lines
*	bpo-36775: _PyCoreConfig only uses wchar_t* (GH-13062)	Victor Stinner	2019-05-02	1	-1/+1
\|	_PyCoreConfig: Change filesystem_encoding, filesystem_errors, stdio_encoding and stdio_errors fields type from char* to wchar_t*.
\|	Changes:
\|	* PyInterpreterState: replace fscodec_initialized (int) with fs_codec structure.
\|	* Add get_error_handler_wide() and unicode_encode_utf8() helper functions.
\|	* Add error_handler parameter to unicode_encode_locale() and unicode_decode_locale().
\|	* Remove _PyCoreConfig_SetString().
\|	* Rename _PyCoreConfig_SetWideString() to _PyCoreConfig_SetString().
\|	* Rename _PyCoreConfig_SetWideStringFromString() to _PyCoreConfig_DecodeLocale().
*	bpo-34523: Support surrogatepass in locale codecs (GH-8995)	Victor Stinner	2018-08-29	1	-1/+1
\|	Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding.
\|	Changes:
\|	* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
\|	* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown.
\|	* Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs.
\|	* Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function.
\|	* _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.
*	bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)	Stefan Krah	2017-08-21	1	-2/+2
\|
*	Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc.	Serhiy Storchaka	2016-10-30	1	-10/+4
\|	Patch by Xiang Zhang.
*	PEP 7 style for if/else in C	Victor Stinner	2016-09-02	1	-1/+2
\|	Also add a newline for readability in normalize_encoding().
*	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	Raymond Hettinger	2016-08-30	1	-3/+3
\|
*	Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used	Serhiy Storchaka	2016-05-16	1	-3/+3
\|	with correct type.
*	Optimize error handlers of ASCII and Latin1 encoders when the replacement	Victor Stinner	2015-10-09	1	-11/+7
\|	string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual characters.
\|	Cleanup unicode_encode_ucs1():
\|	* Rename repunicode to rep
\|	* Clear rep object on error
\|	* Factorize code between bytes and unicode paths
*	Add _PyBytesWriter_WriteBytes() to factorize the code	Victor Stinner	2015-10-09	1	-11/+11
\|
*	_PyBytesWriter: simplify code to avoid "prealloc" parameters	Victor Stinner	2015-10-09	1	-8/+12
\|	Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().
*	Optimize backslashreplace error handler	Victor Stinner	2015-10-08	1	-2/+16
\|	Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.
*	Issue #25318: Add _PyBytesWriter API	Victor Stinner	2015-10-08	1	-63/+21
\|	Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to avoid checking the current character twice, since we already know that it is not ASCII.
*	Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error	Victor Stinner	2015-10-01	1	-51/+96
\|	handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.
*	Fixed typos in comments.	Serhiy Storchaka	2015-05-18	1	-4/+4
\|\
\| *	Fixed typos in comments.	Serhiy Storchaka	2015-05-18	1	-2/+2
\| \|
* \|	Issue #15027: The UTF-32 encoder is now 3x to 7x faster.	Serhiy Storchaka	2015-05-12	1	-0/+87
\|/
*	Reverted changeset b72c5573c5e7 (issue #15027).	Serhiy Storchaka	2014-01-04	1	-87/+0
\|
*	Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.	Serhiy Storchaka	2014-01-04	1	-0/+87
\|
*	Remove dead code committed in issue #12892.	Serhiy Storchaka	2013-11-19	1	-104/+0
\|
*	Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates.	Serhiy Storchaka	2013-11-19	1	-16/+182
\|	The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs. Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
*	Issue #18722: Remove uses of the "register" keyword in C code.	Antoine Pitrou	2013-08-13	1	-3/+3
\|
*	(Merge 3.3) Issue #8271: Fix compilation on Windows	Victor Stinner	2012-11-04	1	-1/+1
\|\
\| *	Issue #8271: Fix compilation on Windows	Victor Stinner	2012-11-04	1	-1/+1
\| \|
* \|	#8271: merge with 3.3.	Ezio Melotti	2012-11-04	1	-30/+62
\|\ \
\| \|/
\| *	#8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences.	Ezio Melotti	2012-11-04	1	-30/+62
\| \|	Patch by Serhiy Storchaka, tests by Ezio Melotti.
* \|	Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified	Christian Heimes	2012-10-17	1	-3/+3
\|/	endianness detection and handling.
*	Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.	Antoine Pitrou	2012-09-20	1	-9/+5
\|	Patch by Serhiy Storchaka.
*	Use correct types for ASCII_CHAR_MASK integer constants.	Mark Dickinson	2012-07-07	1	-2/+2
\|
*	Issue #14923: Optimize continuation-byte check in UTF-8 decoding.	Mark Dickinson	2012-06-23	1	-6/+10
\|	Patch by Serhiy Storchaka.
*	Issue #15026: utf-16 encoding is now significantly faster (up to 10x).	Antoine Pitrou	2012-06-15	1	-0/+64
\|	Patch by Serhiy Storchaka.
*	Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.	Antoine Pitrou	2012-05-15	1	-1/+148
\|	Patch by Serhiy Storchaka.
*	Issue #14738: Speed-up UTF-8 decoding on non-ASCII data.	Antoine Pitrou	2012-05-10	1	-78/+143
\|	Patch by Serhiy Storchaka.
*	Issue #13624: Write a specialized UTF-8 encoder to allow more optimization	Victor Stinner	2011-12-18	1	-0/+197
\|	The main bottleneck was the PyUnicode_READ() macro.
*	Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.	Antoine Pitrou	2011-11-21	1	-0/+156
	This almost catches up with pre-PEP 393 performance, when decoding needed only one pass.
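
Several commits in this log change behaviour that is visible from Python through str.encode() and bytes.decode(). The snippet below is a minimal sketch of that behaviour, assuming a recent CPython 3 interpreter; it is not taken from the repository or its test suite, and the helper name encode_or_error and the sample strings are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: observes codec behaviour described in the log
# above through the public str/bytes API. The helper and sample inputs are
# hypothetical and do not come from the CPython sources.

def encode_or_error(text, encoding, errors="strict"):
    """Return the encoded bytes, or the exception class name on failure."""
    try:
        return text.encode(encoding, errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__

lone = "\ud800"  # a lone (unpaired) surrogate code point

# Issue #12892: utf-16*/utf-32* encoders reject lone surrogates in strict
# mode, and the surrogatepass error handler works with these codecs.
strict_utf16 = encode_or_error(lone, "utf-16-le")
passed_utf16 = encode_or_error(lone, "utf-16-le", "surrogatepass")
roundtrip = passed_utf16.decode("utf-16-le", "surrogatepass")

# Issue #25318: backslashreplace and xmlcharrefreplace in the UTF-8 encoder,
# and backslashreplace in the ASCII encoder.
escaped_utf8 = encode_or_error("a\udc80", "utf-8", "backslashreplace")
xmlref_utf8 = encode_or_error("a\udc80", "utf-8", "xmlcharrefreplace")
escaped_ascii = encode_or_error("caf\xe9", "ascii", "backslashreplace")

# #8271: a truncated multi-byte UTF-8 sequence followed by a valid byte,
# decoded with the "replace" error handler.
truncated = b"\xe2\x82A".decode("utf-8", "replace")

if __name__ == "__main__":
    for name in ("strict_utf16", "passed_utf16", "roundtrip", "escaped_utf8",
                 "xmlref_utf8", "escaped_ascii", "truncated"):
        print(name, "=", ascii(globals()[name]))
```

Running the script prints each result with ascii(), so lone surrogates and U+FFFD show up as escapes rather than depending on the terminal encoding.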