path: root/Objects/stringlib
Commit message [Author, Age, Files, Lines]
* bpo-34523: Support surrogatepass in locale codecs (GH-8995) [Victor Stinner, 2018-08-29, 1 file, -1/+1]
|     Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding.
|
|     Changes:
|
|     * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
|     * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown.
|     * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs.
|     * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function.
|     * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.
* bpo-20180: complete AC conversion of Objects/stringlib/transmogrify.h (GH-8039) [Tal Einat, 2018-07-06, 2 files, -33/+233]
|     * converted bytes methods: expandtabs, ljust, rjust, center, zfill
|     * updated char_convertor to properly set the C default value
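The bytes methods named in this entry can be exercised from Python. The snippet below is only an illustrative sketch of their visible behavior; it does not show the Argument Clinic conversion itself, and the sample values are made up for the example.

```python
# Visible behavior of the bytes methods listed above (sample inputs are illustrative).
expanded = b"a\tb".expandtabs(4)      # tab advances to the next multiple of 4
left = b"ab".ljust(4, b".")           # pad on the right
right = b"ab".rjust(4, b".")          # pad on the left
centered = b"ab".center(4, b".")      # pad on both sides
zero_filled = b"-7".zfill(4)          # sign stays in front of the zeros

print(expanded, left, right, centered, zero_filled)
```

The same methods exist on bytearray, where they return a new bytearray rather than bytes.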
* bpo-33012: Fix invalid function cast warnings with gcc 8 for METH_NOARGS. (GH-6030) [Siddhesh Poyarekar, 2018-04-29, 1 file, -13/+13]
|     METH_NOARGS functions need only a single argument but they are cast into a PyCFunction, which takes two arguments. This triggers an invalid function cast warning in gcc8 due to the argument mismatch. Fix this by adding a dummy unused argument.
* bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342) [INADA Naoki, 2018-01-27, 1 file, -0/+6]
|
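A short sketch of the isascii() method added by the entry above, assuming an interpreter that includes this change (Python 3.7 or later):

```python
# isascii() is True when every character/byte is in the range 0-127.
# Empty strings and empty bytes objects count as ASCII.
results = {
    "str_ascii": "abc".isascii(),
    "str_non_ascii": "caf\u00e9".isascii(),
    "str_empty": "".isascii(),
    "bytes_ascii": b"abc".isascii(),
    "bytearray_non_ascii": bytearray(b"\x80").isascii(),
}
print(results)
```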
* bpo-31338 (#3374) [Barry Warsaw, 2017-09-15, 1 file, -2/+1]
|     * Add Py_UNREACHABLE() as an alias to abort().
|     * Use Py_UNREACHABLE() instead of assert(0)
|     * Convert more unreachable code to use Py_UNREACHABLE()
|     * Document Py_UNREACHABLE() and a few other macros.
* bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157) [Stefan Krah, 2017-08-21, 1 file, -2/+2]
* bpo-30978: str.format_map() now passes key lookup exceptions through. (#2790) [Serhiy Storchaka, 2017-08-03, 1 file, -6/+10]
|     Previously any exception was replaced with a KeyError exception.
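The behavior change can be observed with a mapping whose lookup raises something other than KeyError. This is a minimal sketch, assuming an interpreter that includes this change (Python 3.7 or later); FailingLookup is a hypothetical class written for the example.

```python
class FailingLookup:
    """Hypothetical mapping whose lookups fail with a non-KeyError exception."""

    def __getitem__(self, key):
        raise ValueError(f"lookup of {key!r} failed")


raised = None
try:
    "{name}".format_map(FailingLookup())
except Exception as exc:
    # With the change applied, the original exception type is preserved.
    raised = type(exc)

print(raised)
```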
* bpo-24821: Fixed a slowdown of up to 25 times when searching for some unlucky Unicode characters. (#505) [Serhiy Storchaka, 2017-03-30, 1 file, -6/+40]
* Issue #28999: Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever possible but Coccinelle couldn't find opportunity. [Serhiy Storchaka, 2017-01-23, 1 file, -4/+2]
* Issue #29145: Merge 3.6. [Xiang Zhang, 2017-01-10, 1 file, -1/+1]
|
* Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. [Serhiy Storchaka, 2016-10-30, 1 file, -10/+4]
|     Patch by Xiang Zhang.
* Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy(). [Christian Heimes, 2016-09-13, 2 files, -23/+23]
* remove all usage of Py_LOCAL [Benjamin Peterson, 2016-09-09, 1 file, -11/+11]
|
* PEP 7 style for if/else in C [Victor Stinner, 2016-09-02, 1 file, -1/+2]
|     Also add a newline for readability in normalize_encoding().
* Issue #27895: Spelling fixes (Contributed by Ville Skyttä). [Raymond Hettinger, 2016-08-30, 1 file, -3/+3]
|
* Backed out changeset b0087e17cd5e (issue #26765) [Serhiy Storchaka, 2016-07-03, 1 file, -54/+0]
|     For unknown reasons it perhaps caused a crash on 32-bit Windows (issue #).
* Issue #26765: Moved wrappers for bytes and bytearray methods to common header file. [Serhiy Storchaka, 2016-07-01, 1 file, -0/+54]
* Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with correct type. [Serhiy Storchaka, 2016-05-16, 6 files, -12/+15]
* Issue #26765: Moved common code for the replace() method of bytes and bytearray to a template file. [Serhiy Storchaka, 2016-05-05, 1 file, -57/+521]
* Issue #26765: Moved common code and docstrings for bytes and bytearray methods to bytes_methods.c. [Serhiy Storchaka, 2016-05-04, 2 files, -103/+0]
* Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error messages. [Serhiy Storchaka, 2016-04-17, 1 file, -1/+1]
|\
| * Issue #26778: Fixed "a/an/and" typos in code comment and documentation. [Serhiy Storchaka, 2016-04-17, 1 file, -1/+1]
| |
* | Issue #26057: Got rid of unneeded use of PyUnicode_FromObject(). [Serhiy Storchaka, 2016-04-13, 1 file, -11/+2]
| |
* | Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it into STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independently without special preconditions. [Serhiy Storchaka, 2015-11-14, 1 file, -63/+87]
* | Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. [Victor Stinner, 2015-10-09, 1 file, -11/+7]
| |     Cleanup unicode_encode_ucs1():
| |
| |     * Rename repunicode to rep
| |     * Clear rep object on error
| |     * Factorize code between bytes and unicode path
* | Add _PyBytesWriter_WriteBytes() to factorize the code [Victor Stinner, 2015-10-09, 1 file, -11/+11]
| |
* | _PyBytesWriter: simplify code to avoid "prealloc" parameters [Victor Stinner, 2015-10-09, 1 file, -8/+12]
| |     Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().
* | Optimize backslashreplace error handler [Victor Stinner, 2015-10-08, 1 file, -2/+16]
| |     Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and Latin1 encoders.
| |
| |     Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.
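For reference, the two error handlers named above produce the following output at the Python level. This sketch only shows what the handlers emit, not the optimized C code path; the sample strings are illustrative.

```python
text = "\u00e9\u20ac"          # U+00E9 and U+20AC, neither is ASCII
lone = "a\udc80"               # a lone surrogate, which strict UTF-8 cannot encode

ascii_backslash = text.encode("ascii", "backslashreplace")
ascii_xml = text.encode("ascii", "xmlcharrefreplace")
latin1_backslash = text.encode("latin-1", "backslashreplace")  # U+00E9 fits in Latin1
utf8_backslash = lone.encode("utf-8", "backslashreplace")
utf8_xml = lone.encode("utf-8", "xmlcharrefreplace")

print(ascii_backslash, ascii_xml, latin1_backslash, utf8_backslash, utf8_xml)
```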
* | Issue #25318: Add _PyBytesWriter API [Victor Stinner, 2015-10-08, 1 file, -63/+21]
| |     Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation.
| |
| |     Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers.
| |
| |     unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.
* | Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. [Victor Stinner, 2015-10-01, 1 file, -51/+96]
| |     Patch co-written with Serhiy Storchaka.
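As a reminder of what these four handlers do in the UTF-8 encoder, here is a small sketch. It illustrates output only and says nothing about the speedup; the input string is made up for the example.

```python
# "\udc80" is a lone surrogate, the only kind of code point the UTF-8 encoder rejects.
text = "a\udc80b"

encoded = {
    handler: text.encode("utf-8", handler)
    for handler in ("ignore", "replace", "surrogateescape", "surrogatepass")
}

for handler, data in encoded.items():
    print(handler, data)
```

The surrogateescape handler maps U+DC80 back to the raw byte 0x80, while surrogatepass writes the surrogate as a three-byte UTF-8-style sequence.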
* | Fixed an incorrect comment. [Eric V. Smith, 2015-08-26, 1 file, -1/+1]
|/
* Fixed typos in comments. [Serhiy Storchaka, 2015-05-18, 1 file, -4/+4]
|\
| * Fixed typos in comments. [Serhiy Storchaka, 2015-05-18, 1 file, -2/+2]
| |
* | Issue #15027: The UTF-32 encoder is now 3x to 7x faster. [Serhiy Storchaka, 2015-05-12, 1 file, -0/+87]
| |
* | Issue #23573: Increased performance of string search operations (str.find, str.index, str.count, the in operator, str.split, str.partition) with arguments of different kinds (UCS1, UCS2, UCS4). [Serhiy Storchaka, 2015-03-24, 2 files, -23/+4]
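"Arguments of different kinds" refers to searching a string stored with wide characters for a needle stored with narrower ones. The sketch below shows such mixed-kind calls for the operations listed; it demonstrates results only, not timing, and the sample string is made up.

```python
# The haystack contains a non-BMP character (U+1F600), so CPython stores it as UCS4.
# The needles below are ASCII (UCS1) or BMP-only (UCS2).
haystack = "price: 10\u20ac \U0001F600"

pos_ascii = haystack.find("10")            # UCS1 needle in a UCS4 haystack
pos_euro = haystack.index("\u20ac")        # UCS2 needle in a UCS4 haystack
has_emoji = "\U0001F600" in haystack
space_count = haystack.count(" ")
fields = haystack.split(" ")
parts = haystack.partition("\u20ac")

print(pos_ascii, pos_euro, has_emoji, space_count, fields, parts)
```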
* | Removed unintentional trailing spaces in non-external and non-generated C files. [Serhiy Storchaka, 2015-03-18, 1 file, -1/+1]
|/
* Issue #22896: Avoid using PyObject_AsCharBuffer(), PyObject_AsReadBuffer() and PyObject_AsWriteBuffer(). [Serhiy Storchaka, 2015-02-02, 1 file, -1/+8]
* Issue #22581: Use more "bytes-like object" throughout the docs and comments. [Serhiy Storchaka, 2014-12-05, 1 file, -3/+3]
|
* s/stringobject/bytesobject/ (closes #22036) [Benjamin Peterson, 2014-07-24, 1 file, -1/+1]
|     Patch by Martin Matusiak.
* merge 3.3 [Benjamin Peterson, 2014-03-30, 1 file, -19/+19]
|\
| * merge 3.2 [Benjamin Peterson, 2014-03-30, 1 file, -19/+19]
| |\
| | * fix expandtabs overflow detection to be consistent and not rely on signed overflow [Benjamin Peterson, 2014-03-30, 1 file, -19/+19]
| | * Issue #17173: Remove uses of locale-dependent C functions (isalpha() etc.) in the interpreter. [Antoine Pitrou, 2013-02-09, 1 file, -1/+1]
| | |     I've left a couple of them in: zlib (third-party lib), getaddrinfo.c (doesn't include Python.h, and probably obsolete), _sre.c (legitimate use for the re.LOCALE flag).
| | * Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. [Mark Dickinson, 2012-10-28, 2 files, -18/+13]
| * | fix format spec recursive expansion (closes #19729) [Benjamin Peterson, 2013-11-27, 1 file, -2/+4]
| | |
* | | Reverted changeset b72c5573c5e7 (issue #15027). [Serhiy Storchaka, 2014-01-04, 1 file, -87/+0]
| | |
* | | Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster. [Serhiy Storchaka, 2014-01-04, 1 file, -0/+87]
| | |
* | | Remove dead code committed in issue #12892. [Serhiy Storchaka, 2013-11-19, 1 file, -104/+0]
| | |
* | | Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. [Serhiy Storchaka, 2013-11-19, 1 file, -16/+182]
| | |     The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
| | |
| | |     Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
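The rules described in this entry can be checked from Python. This is a minimal sketch, assuming an interpreter that includes the change (Python 3.4 or later); the helper name strict_encode_fails is hypothetical.

```python
lone = "\ud800"  # a lone high surrogate


def strict_encode_fails(codec):
    """Return True if the codec rejects the lone surrogate in strict mode."""
    try:
        lone.encode(codec)
    except UnicodeEncodeError:
        return True
    return False


rejected = {codec: strict_encode_fails(codec) for codec in ("utf-16-le", "utf-32-le")}

# surrogatepass lets the surrogate through in both directions.
utf16 = lone.encode("utf-16-le", "surrogatepass")
utf32 = lone.encode("utf-32-le", "surrogatepass")
roundtrip16 = utf16.decode("utf-16-le", "surrogatepass")
roundtrip32 = utf32.decode("utf-32-le", "surrogatepass")

# The strict utf-32 decoder refuses bytes that correspond to a surrogate.
try:
    utf32.decode("utf-32-le")
    strict_decode_fails = False
except UnicodeDecodeError:
    strict_decode_fails = True

print(rejected, utf16, utf32, strict_decode_fails)
```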
* | | #17806: Added keyword-argument support for "tabsize" to str/bytes.expandtabs(). [Ezio Melotti, 2013-11-16, 1 file, -3/+5]
| | |
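With this change, tabsize can be passed by keyword as well as positionally. A brief sketch with illustrative inputs:

```python
# tabsize passed by keyword; the positional form expandtabs(4) still works.
text_expanded = "a\tb".expandtabs(tabsize=4)
bytes_expanded = b"a\tb".expandtabs(tabsize=2)
positional = "a\tb".expandtabs(4)

print(repr(text_expanded), bytes_expanded)
```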