| Commit message | Author | Date | Files | Lines |
|---|---|---|---|---|
| bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157) | Stefan Krah | 2017-08-21 | 1 | -2/+2 |
| Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. Patch by Xiang Zhang. | Serhiy Storchaka | 2016-10-30 | 1 | -10/+4 |
| PEP 7 style for if/else in C. Add also a newline for readability in normalize_encoding(). | Victor Stinner | 2016-09-02 | 1 | -1/+2 |
| Issue #27895: Spelling fixes (Contributed by Ville Skyttä). | Raymond Hettinger | 2016-08-30 | 1 | -3/+3 |
| Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with correct type. | Serhiy Storchaka | 2016-05-16 | 1 | -3/+3 |
| Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path | Victor Stinner | 2015-10-09 | 1 | -11/+7 |
| Add _PyBytesWriter_WriteBytes() to factorize the code | Victor Stinner | 2015-10-09 | 1 | -11/+11 |
| _PyBytesWriter: simplify code to avoid "prealloc" parameters. Substract preallocate bytes from min_size before calling _PyBytesWriter_Prepare(). | Victor Stinner | 2015-10-09 | 1 | -8/+12 |
| Optimize backslashreplace error handler. Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. It avoids to create an exception and call the slow implementation of the error handler. | Victor Stinner | 2015-10-08 | 1 | -2/+16 |
| Issue #25318: Add _PyBytesWriter API. Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII. | Victor Stinner | 2015-10-08 | 1 | -63/+21 |
| Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka. | Victor Stinner | 2015-10-01 | 1 | -51/+96 |
| Fixed typos in comments. | Serhiy Storchaka | 2015-05-18 | 1 | -4/+4 |
| Fixed typos in comments. | Serhiy Storchaka | 2015-05-18 | 1 | -2/+2 |
| Issue #15027: The UTF-32 encoder is now 3x to 7x faster. | Serhiy Storchaka | 2015-05-12 | 1 | -0/+87 |
| Reverted changeset b72c5573c5e7 (issue #15027). | Serhiy Storchaka | 2014-01-04 | 1 | -87/+0 |
| Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster. | Serhiy Storchaka | 2014-01-04 | 1 | -0/+87 |
| Remove dead code committed in issue #12892. | Serhiy Storchaka | 2013-11-19 | 1 | -104/+0 |
| Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs. Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu. | Serhiy Storchaka | 2013-11-19 | 1 | -16/+182 |
| Issue #18722: Remove uses of the "register" keyword in C code. | Antoine Pitrou | 2013-08-13 | 1 | -3/+3 |
| (Merge 3.3) Issue #8271: Fix compilation on Windows | Victor Stinner | 2012-11-04 | 1 | -1/+1 |
| Issue #8271: Fix compilation on Windows | Victor Stinner | 2012-11-04 | 1 | -1/+1 |
| #8271: merge with 3.3. | Ezio Melotti | 2012-11-04 | 1 | -30/+62 |
| #8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti. | Ezio Melotti | 2012-11-04 | 1 | -30/+62 |
| Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified endianess detection and handling. | Christian Heimes | 2012-10-17 | 1 | -3/+3 |
| Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. Patch by Serhiy Storchaka. | Antoine Pitrou | 2012-09-20 | 1 | -9/+5 |
| Use correct types for ASCII_CHAR_MASK integer constants. | Mark Dickinson | 2012-07-07 | 1 | -2/+2 |
| Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka. | Mark Dickinson | 2012-06-23 | 1 | -6/+10 |
| Issue #15026: utf-16 encoding is now significantly faster (up to 10x). Patch by Serhiy Storchaka. | Antoine Pitrou | 2012-06-15 | 1 | -0/+64 |
| Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. Patch by Serhiy Storchaka. | Antoine Pitrou | 2012-05-15 | 1 | -1/+148 |
| Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka. | Antoine Pitrou | 2012-05-10 | 1 | -78/+143 |
| Issue #13624: Write a specialized UTF-8 encoder to allow more optimization. The main bottleneck was the PyUnicode_READ() macro. | Victor Stinner | 2011-12-18 | 1 | -0/+197 |
| Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case. This almost catches up with pre-PEP 393 performance, when decoding needed only one pass. | Antoine Pitrou | 2011-11-21 | 1 | -0/+156 |
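
Several of the commits above change behaviour that is visible from Python code: the error handlers optimized in the UTF-8 encoder, the rejection of lone surrogates by the utf-16* and utf-32* codecs, and the number of U+FFFD characters the UTF-8 decoder emits with the "replace" handler. The sketch below exercises only the public `str.encode` and `bytes.decode` methods on a current CPython 3 interpreter. It does not touch the private `_PyBytesWriter` C API, and the helper names `encode_with_handlers` and `rejects_lone_surrogate` are made up for this illustration.

```python
# Illustrative only: observable behaviour from Python, not code taken from
# the C file whose history is listed above.

def encode_with_handlers(text, encoding="utf-8"):
    """Encode `text` once per error handler and collect the results."""
    handlers = ("ignore", "replace", "backslashreplace",
                "xmlcharrefreplace", "surrogateescape", "surrogatepass")
    return {name: text.encode(encoding, name) for name in handlers}


def rejects_lone_surrogate(encoding):
    """Return True if `encoding` refuses a lone surrogate in strict mode."""
    try:
        "\ud800".encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


# U+DC80 is a lone low surrogate, so the strict UTF-8 encoder cannot encode
# it and every result below comes from an error handler.
results = encode_with_handlers("a\udc80b")
for name, encoded in results.items():
    print(f"{name:18} {encoded!r}")

# Issue #12892: utf-16* and utf-32* reject lone surrogates unless the
# surrogatepass handler is requested explicitly.
for codec in ("utf-16", "utf-32"):
    print(codec, "rejects lone surrogate:", rejects_lone_surrogate(codec))
passed_through = "\ud800".encode("utf-16-le", "surrogatepass")

# Issue #8271: a truncated multi-byte sequence is replaced by a single
# U+FFFD, and the byte that follows it is decoded normally.
decoded = b"\xe2\x82A".decode("utf-8", "replace")
print(ascii(decoded))
```

Comparing strict mode with each handler in this way is also a quick method for checking which behaviour a given interpreter version implements before relying on it.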