path: root/Objects/stringlib
Commit message | Author | Age | Files | Lines
* rewrite the parsing of field names to be more consistent wrt recursive expansion | Benjamin Peterson | 2013-05-17 | 1 | -62/+53
|
* merge 3.3 | Benjamin Peterson | 2013-05-17 | 1 | -2/+8
|\
| * only recursively expand in the format spec (closes #17644) | Benjamin Peterson | 2013-05-17 | 1 | -2/+8
| |
* | Merge removal of trailing whitespace from 3.3. | Ezio Melotti | 2013-04-21 | 1 | -7/+7
|\ \
| |/
| * Remove trailing whitespace. | Ezio Melotti | 2013-04-21 | 1 | -7/+7
| |
* | Close #17694: Add minimum length to _PyUnicodeWriter | Victor Stinner | 2013-04-17 | 1 | -3/+3
| |     * Add also min_char attribute to _PyUnicodeWriter structure (currently unused)
| |     * _PyUnicodeWriter_Init() has no more argument (except the writer itself): min_length and overallocate must be set explicitly
| |     * In error handlers, only enable overallocation if the replacement string is longer than 1 character
| |     * CJK decoders don't use overallocation anymore
| |     * Set min_length, instead of preallocating memory using _PyUnicodeWriter_Prepare(), in many decoders
| |     * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
* | stringlib: remove unused STRINGLIB_RESIZE macro | Victor Stinner | 2013-04-14 | 7 | -7/+0
| |
* | Issue #16061: Speed up str.replace() for replacing 1-character strings. | Serhiy Storchaka | 2013-04-13 | 1 | -0/+53
| |
* | Close #13126: "Simplify" FASTSEARCH() code to help the compiler to emit more | Victor Stinner | 2013-04-07 | 1 | -3/+5
| |     efficient machine code. Patch written by Antoine Pitrou.
| |     Without this change, str.find() was 10% slower than str.rfind() in the worst case.
* | Add _PyUnicodeWriter_WriteSubstring() function | Victor Stinner | 2013-04-02 | 1 | -12/+6
| |     Write a function to enable more optimizations:
| |     * If the substring is the whole string and overallocation is disabled, just keep a reference to the string, don't copy characters
| |     * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when possible
* | Remove unused defines. | Serhiy Storchaka | 2013-02-23 | 1 | -6/+0
|\ \
| |/
| * Remove unused defines. | Serhiy Storchaka | 2013-02-23 | 1 | -6/+0
| |
* | Check for NULL before the pointer aligning in fastsearch_memchr_1char. | Serhiy Storchaka | 2013-01-15 | 1 | -15/+10
|\ \
| |/
| |     There is no guarantee that NULL is aligned.
| * Check for NULL before the pointer aligning in fastsearch_memchr_1char. | Serhiy Storchaka | 2013-01-15 | 1 | -15/+10
| |     There is no guarantee that NULL is aligned.
* | Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation failure | Christian Heimes | 2012-12-02 | 1 | -0/+1
| |
* | (Merge 3.3) Issue #8271: Fix compilation on Windows | Victor Stinner | 2012-11-04 | 1 | -1/+1
|\ \
| |/
| * Issue #8271: Fix compilation on Windows | Victor Stinner | 2012-11-04 | 1 | -1/+1
| |
* | #8271: merge with 3.3. | Ezio Melotti | 2012-11-04 | 1 | -30/+62
|\ \
| |/
| * #8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. | Ezio Melotti | 2012-11-04 | 1 | -30/+62
| |     Patch by Serhiy Storchaka, tests by Ezio Melotti.
* | Issue #12805: Make bytes.join and bytearray.join faster when the separator is empty. | Antoine Pitrou | 2012-10-20 | 1 | -0/+10
| |     Patch by Serhiy Storchaka.
* | Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified | Christian Heimes | 2012-10-17 | 1 | -3/+3
| |     endianess detection and handling.
* | Issue #15958: bytes.join and bytearray.join now accept arbitrary buffer objects. | Antoine Pitrou | 2012-10-16 | 1 | -0/+122
|/
* Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. | Antoine Pitrou | 2012-09-20 | 3 | -18/+10
|     Patch by Serhiy Storchaka.
* Close #15534: Fix a typo in the fast search function of the string library (_s => s) | Victor Stinner | 2012-08-02 | 1 | -5/+5
|     Replace _s with ptr to avoid future confusion. Add also non regression tests.
* Use correct types for ASCII_CHAR_MASK integer constants. | Mark Dickinson | 2012-07-07 | 2 | -4/+4
|
* Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka. | Mark Dickinson | 2012-06-23 | 1 | -6/+10
|
* Make private function static (from `make smelly`) | Antoine Pitrou | 2012-06-21 | 1 | -1/+1
|
* Issue #15026: utf-16 encoding is now significantly faster (up to 10x). | Antoine Pitrou | 2012-06-15 | 1 | -0/+64
|     Patch by Serhiy Storchaka.
* Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield | Victor Stinner | 2012-06-04 | 1 | -1/+1
|
* Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) | Victor Stinner | 2012-05-29 | 2 | -26/+22
|     * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases.
|     * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string
|     * Disable overallocation when formatting the last argument of str%args and str.format(args)
|     * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure
|     * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII()
|     The speed up is around 20% in average.
* Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. | Antoine Pitrou | 2012-05-15 | 1 | -1/+148
|     Patch by Serhiy Storchaka.
* Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka. | Antoine Pitrou | 2012-05-10 | 6 | -78/+148
|
* Rename unicode_write_t structure and its methods to "_PyUnicodeWriter" | Victor Stinner | 2012-05-09 | 1 | -9/+9
|
* Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() | Victor Stinner | 2012-05-09 | 1 | -10/+26
|     Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once per argument.
* Close #14716: str.format() now uses the new "unicode writer" API instead of the | Victor Stinner | 2012-05-07 | 1 | -41/+19
|     PyAccu API. For example, it makes str.format() from 25% to 30% faster on Linux.
* Issue #14387: Do not include accu.h from Python.h. | Antoine Pitrou | 2012-03-22 | 1 | -0/+1
|\
* | Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator | Victor Stinner | 2012-02-23 | 8 | -66/+23
| |     * Decode thousands separator and decimal point using PyUnicode_DecodeLocale() (from the locale encoding), instead of decoding them implicitly from latin1
| |     * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
| |     * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum character if unicode is NULL
| |     * Replace MIN/MAX macros by Py_MIN/Py_MAX
| |     * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
| |     * stringlib/localeutil.h only supports Unicode
* | remove some usage of Py_UNICODE_TOUPPER/LOWER | Benjamin Peterson | 2012-01-12 | 6 | -12/+0
| |
* | Issue #13624: Write a specialized UTF-8 encoder to allow more optimization | Victor Stinner | 2011-12-18 | 1 | -0/+197
| |     The main bottleneck was the PyUnicode_READ() macro.
* | Issue #13623: Fix a performance regression introduced by issue #12170 in | Victor Stinner | 2011-12-18 | 1 | -10/+17
| |     bytes.find() and handle correctly OverflowError (raise the same ValueError than the error for -1).
* | Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0) | Victor Stinner | 2011-12-01 | 1 | -1/+1
| |     Create an empty string with the new Unicode API.
* | Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case. | Antoine Pitrou | 2011-11-21 | 1 | -0/+156
| |     This almost catches up with pre-PEP 393 performance, when decoding needed only one pass.
* | stringlib: remove unused STRINGLIB_FILL | Victor Stinner | 2011-11-20 | 6 | -6/+0
| |
* | Replace PyUnicodeObject type by PyObject | Victor Stinner | 2011-11-03 | 1 | -12/+8
| |     * _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
| |     * Remove now useless casts to PyObject*
* | Replace PyUnicodeObject* by PyObject* where it was irrevelant | Victor Stinner | 2011-10-23 | 1 | -4/+4
| |     A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or PyUnicodeObject.
| |     Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to PyUnicodeObject* is wrong
* | Issue #12170: The count(), find(), rfind(), index() and rindex() methods | Antoine Pitrou | 2011-10-20 | 1 | -0/+43
| |     of bytes and bytearray objects now accept an integer between 0 and 255 as their first argument.
| |     Patch by Petri Lehtinen.
* | Fix typo | Antoine Pitrou | 2011-10-17 | 1 | -1/+1
| |
* | Add a comment explaining this heuristic. | Antoine Pitrou | 2011-10-13 | 1 | -0/+3
| |
* | Simplify heuristic for when to use memchr | Antoine Pitrou | 2011-10-13 | 1 | -11/+1
| |
* | Issue #13155: Optimize finding the optimal character width of an unicode string | Antoine Pitrou | 2011-10-12 | 1 | -0/+136
| |
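
Several of the changes logged above are visible from Python code. The following is a minimal sketch, not taken from the commits themselves, that exercises a few of them. It assumes a recent CPython 3 release that includes these changes, and the variable names are illustrative only.

```python
# Issue #12170: bytes/bytearray search methods accept an integer in range(256).
data = b"stringlib"
index_of_l = data.find(ord("l"))   # equivalent to data.find(b"l")
count_of_i = data.count(ord("i"))  # equivalent to data.count(b"i")

# Issue #13623: an integer outside range(256) is rejected with ValueError.
try:
    data.find(256)
    out_of_range_rejected = False
except ValueError:
    out_of_range_rejected = True

# Issue #15958: bytes.join accepts arbitrary buffer objects, not only bytes.
joined = b"-".join([b"a", bytearray(b"b"), memoryview(b"c")])

# Issue #8271: with the "replace" handler, each invalid byte here becomes
# its own U+FFFD (0xC0 is never a valid start byte, 0x80 is a stray
# continuation byte).
decoded = b"\xc0\x80".decode("utf-8", "replace")

# Issue #17644: replacement fields nested inside the format spec are expanded.
padded = "{:{}}".format("x", ">3")

print(index_of_l, count_of_i, out_of_range_rejected, joined, repr(decoded), repr(padded))
```

The performance-oriented commits (the codec speed-ups, FASTSEARCH() and _PyUnicodeWriter changes) have no observable effect on results, so they are not represented in this sketch.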