path: root/Objects/stringlib
Commit message [Author, Age, Files, Lines]
* merge 3.2 [Benjamin Peterson, 2014-03-30, 1 file, -19/+19]
|\
| * fix expandtabs overflow detection to be consistent and not rely on signed overflow [Benjamin Peterson, 2014-03-30, 1 file, -19/+19]
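The idea named in the commit above (detect overflow without relying on signed overflow) can be modeled in Python, where integers never wrap. This is only a sketch of the general "check before you add" idiom, not the code from the commit; `expanded_length` and `PY_SSIZE_T_MAX_MODEL` are hypothetical names introduced for illustration.

```python
# Stand-in for the platform size limit (an assumption for this sketch).
PY_SSIZE_T_MAX_MODEL = 2**63 - 1


def expanded_length(text: str, tabsize: int, limit: int = PY_SSIZE_T_MAX_MODEL) -> int:
    """Length that text.expandtabs(tabsize) would have, for text without newlines."""
    column = 0
    for ch in text:
        if ch == "\t":
            if tabsize > 0:
                incr = tabsize - (column % tabsize)
                # Compare against limit - incr, so the check itself never
                # computes a value larger than limit.
                if column > limit - incr:
                    raise OverflowError("result too long")
                column += incr
            # With tabsize <= 0 the tab contributes nothing.
        else:
            if column > limit - 1:
                raise OverflowError("result too long")
            column += 1
    return column


print(expanded_length("a\tb", 8), len("a\tb".expandtabs(8)))
```

In C the same shape matters more: the comparison happens before the addition, so no signed addition can ever exceed the limit.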
| * Issue #17173: Remove uses of locale-dependent C functions (isalpha() etc.) in the interpreter. [Antoine Pitrou, 2013-02-09, 1 file, -1/+1]
| |     I've left a couple of them in: zlib (third-party lib), getaddrinfo.c (doesn't include Python.h, and probably obsolete), _sre.c (legitimate use for the re.LOCALE flag).
| * Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. [Mark Dickinson, 2012-10-28, 2 files, -18/+13]
* | fix format spec recursive expansion (closes #19729) [Benjamin Peterson, 2013-11-27, 1 file, -2/+4]
| |
* | Issue 18719: Remove a false optimization [Raymond Hettinger, 2013-08-14, 1 file, -9/+0]
| |     Remove an unused early-out test from the critical path for dict and set lookups. When the strings already have matching lengths, kinds, and hashes, there is no additional information gained by checking the first characters (the probability of a mismatch is already known to be less than 1 in 2**64).
* | only recursively expand in the format spec (closes #17644) [Benjamin Peterson, 2013-05-17, 1 file, -2/+8]
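For readers unfamiliar with the feature these two format-spec commits touch: replacement fields nested inside the format spec are expanded before the outer field is formatted. A minimal illustration of that documented str.format() behavior follows; the variable names are just for this example, and it does not reproduce the bugs the commits fixed.

```python
# A nested field inside the format spec supplies the width.
padded = "{:{}}".format("ab", 5)

# Width and precision can both come from arguments.
rounded = "{0:{1}.{2}f}".format(3.14159, 10, 2)

# Named fields work the same way.
named = "{value:>{width}}".format(value="x", width=4)

print(repr(padded), repr(rounded), repr(named))
```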
| |
* | Remove trailing whitespace. [Ezio Melotti, 2013-04-21, 1 file, -7/+7]
| |
* | Remove unused defines. [Serhiy Storchaka, 2013-02-23, 1 file, -6/+0]
| |
* | Check for NULL before the pointer aligning in fastsearch_memchr_1char. [Serhiy Storchaka, 2013-01-15, 1 file, -15/+10]
| |     There is no guarantee that NULL is aligned.
* | Issue #8271: Fix compilation on Windows [Victor Stinner, 2012-11-04, 1 file, -1/+1]
| |
* | #8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. [Ezio Melotti, 2012-11-04, 1 file, -30/+62]
| |     Patch by Serhiy Storchaka, tests by Ezio Melotti.
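The behavior described in #8271 can be observed from Python on a current CPython 3 interpreter. The sketch below counts U+FFFD characters for a few invalid inputs; `count_replacements` is a hypothetical helper name, and the samples are chosen for illustration rather than taken from the commit's tests.

```python
REPLACEMENT = "\ufffd"


def count_replacements(raw: bytes) -> int:
    """Decode raw as UTF-8 with the "replace" handler and count U+FFFD characters."""
    return raw.decode("utf-8", errors="replace").count(REPLACEMENT)


# Two invalid start bytes, an overlong encoding, and a truncated
# four-byte sequence followed by ASCII text.
for raw in (b"\xff\xfe", b"\xc0\x80", b"\xf1\x80AB"):
    print(raw, ascii(raw.decode("utf-8", errors="replace")), count_replacements(raw))
```

Note that the truncated sequence is replaced as a unit and the ASCII bytes after it survive.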
* | Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. [Antoine Pitrou, 2012-09-20, 3 files, -18/+10]
| |     Patch by Serhiy Storchaka.
* | Close #15534: Fix a typo in the fast search function of the string library (_s => s) [Victor Stinner, 2012-08-02, 1 file, -5/+5]
| |     Replace _s with ptr to avoid future confusion. Add also non regression tests.
* | Use correct types for ASCII_CHAR_MASK integer constants. [Mark Dickinson, 2012-07-07, 2 files, -4/+4]
| |
* | Issue #14923: Optimize continuation-byte check in UTF-8 decoding. [Mark Dickinson, 2012-06-23, 1 file, -6/+10]
| |     Patch by Serhiy Storchaka.
* | Make private function static (from `make smelly`) [Antoine Pitrou, 2012-06-21, 1 file, -1/+1]
| |
* | Issue #15026: utf-16 encoding is now significantly faster (up to 10x). [Antoine Pitrou, 2012-06-15, 1 file, -0/+64]
| |     Patch by Serhiy Storchaka.
* | Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield [Victor Stinner, 2012-06-04, 1 file, -1/+1]
| |
* | Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) [Victor Stinner, 2012-05-29, 2 files, -26/+22]
| |     * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases.
| |     * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string
| |     * Disable overallocation when formatting the last argument of str%args and str.format(args)
| |     * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure
| |     * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII()
| |     The speed up is around 20% in average.
* | Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. [Antoine Pitrou, 2012-05-15, 1 file, -1/+148]
| |     Patch by Serhiy Storchaka.
* | Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. [Antoine Pitrou, 2012-05-10, 6 files, -78/+148]
| |     Patch by Serhiy Storchaka.
* | Rename unicode_write_t structure and its methods to "_PyUnicodeWriter" [Victor Stinner, 2012-05-09, 1 file, -9/+9]
| |
* | Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() [Victor Stinner, 2012-05-09, 1 file, -10/+26]
| |     Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once per argument.
* | Close #14716: str.format() now uses the new "unicode writer" API instead of the PyAccu API. [Victor Stinner, 2012-05-07, 1 file, -41/+19]
| |     For example, it makes str.format() from 25% to 30% faster on Linux.
* | Issue #14387: Do not include accu.h from Python.h. [Antoine Pitrou, 2012-03-22, 1 file, -0/+1]
|\ \
| |/
* | Issue #13706: Fix format(int, "n") for locale with non-ASCII thousands separator [Victor Stinner, 2012-02-23, 8 files, -66/+23]
| |     * Decode thousands separator and decimal point using PyUnicode_DecodeLocale() (from the locale encoding), instead of decoding them implicitly from latin1
| |     * Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
| |     * Change _PyUnicode_InsertThousandsGrouping() API to return the maximum character if unicode is NULL
| |     * Replace MIN/MAX macros by Py_MIN/Py_MAX
| |     * stringlib/undef.h undefines STRINGLIB_IS_UNICODE
| |     * stringlib/localeutil.h only supports Unicode
* | remove some usage of Py_UNICODE_TOUPPER/LOWER [Benjamin Peterson, 2012-01-12, 6 files, -12/+0]
| |
* | Issue #13624: Write a specialized UTF-8 encoder to allow more optimization [Victor Stinner, 2011-12-18, 1 file, -0/+197]
| |     The main bottleneck was the PyUnicode_READ() macro.
* | Issue #13623: Fix a performance regression introduced by issue #12170 in bytes.find() and handle correctly OverflowError (raise the same ValueError than the error for -1). [Victor Stinner, 2011-12-18, 1 file, -10/+17]
* | Replace PyUnicode_FromUnicode(NULL, 0) by PyUnicode_New(0, 0) [Victor Stinner, 2011-12-01, 1 file, -1/+1]
| |     Create an empty string with the new Unicode API.
* | Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case. [Antoine Pitrou, 2011-11-21, 1 file, -0/+156]
| |     This almost catches up with pre-PEP 393 performance, when decoding needed only one pass.
* | stringlib: remove unused STRINGLIB_FILL [Victor Stinner, 2011-11-20, 6 files, -6/+0]
| |
* | Replace PyUnicodeObject type by PyObject [Victor Stinner, 2011-11-03, 1 file, -12/+8]
| |     * _PyUnicode_CheckConsistency() now takes a PyObject* instead of void*
| |     * Remove now useless casts to PyObject*
* | Replace PyUnicodeObject* by PyObject* where it was irrevelant [Victor Stinner, 2011-10-23, 1 file, -4/+4]
| |     A Unicode string can now be a PyASCIIObject, PyCompactUnicodeObject or PyUnicodeObject. Aliasing a PyASCIIObject* or PyCompactUnicodeObject* to PyUnicodeObject* is wrong
* | Issue #12170: The count(), find(), rfind(), index() and rindex() methods of bytes and bytearray objects now accept an integer between 0 and 255 as their first argument. [Antoine Pitrou, 2011-10-20, 1 file, -0/+43]
| |     Patch by Petri Lehtinen.
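A short illustration of the integer-argument behavior from #12170, as it works on a current CPython 3 interpreter. The sample data and the `byte_is_rejected` helper are made up for this sketch.

```python
data = b"stringlib"

# An integer argument is treated as a single byte value.
first_r = data.find(ord("r"))
last_i = data.rindex(ord("i"))
i_count = data.count(105)               # 105 == ord("i")
ba_last_i = bytearray(data).rfind(0x69)


def byte_is_rejected(value: int) -> bool:
    """Return True if bytes.find() rejects value with ValueError."""
    try:
        data.find(value)
    except ValueError:
        return True
    return False


print(first_r, last_i, i_count, ba_last_i)
print(byte_is_rejected(256), byte_is_rejected(-1))
```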
* | Fix typo [Antoine Pitrou, 2011-10-17, 1 file, -1/+1]
| |
* | Add a comment explaining this heuristic. [Antoine Pitrou, 2011-10-13, 1 file, -0/+3]
| |
* | Simplify heuristic for when to use memchr [Antoine Pitrou, 2011-10-13, 1 file, -11/+1]
| |
* | Issue #13155: Optimize finding the optimal character width of an unicode string [Antoine Pitrou, 2011-10-12, 1 file, -0/+136]
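What "optimal character width" means under PEP 393 can be modeled in a few lines: the largest code point in a string decides whether 1, 2 or 4 bytes per character are needed. The sketch below is a naive full scan for illustration only; `storage_width` is a hypothetical name, and the optimized C routine added by the commit is not reproduced here.

```python
def storage_width(text: str) -> int:
    """Bytes per character needed for text under the PEP 393 model: 1, 2 or 4."""
    max_char = max(map(ord, text), default=0)
    if max_char < 0x100:       # ASCII and Latin-1 fit in one byte
        return 1
    if max_char < 0x10000:     # rest of the Basic Multilingual Plane
        return 2
    return 4                   # supplementary planes


for sample in ("abc", "caf\xe9", "\u20ac", "\U0001f600"):
    print(ascii(sample), storage_width(sample))
```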
| |
* | stringlib: Fix STRINGLIB_STR for UCS2/UCS4 [Victor Stinner, 2011-10-11, 2 files, -2/+2]
| |
* | Fix fastsearch for UCS2 and UCS4 [Victor Stinner, 2011-10-11, 8 files, -2/+15]
| |     * If needle is 0, try (p[0] >> 16) & 0xff for UCS4
| |     * Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4
* | Issue #13134: optimize finding single-character strings using memchr [Antoine Pitrou, 2011-10-11, 1 file, -0/+73]
| |
* | Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE. [Martin v. Löwis, 2011-10-07, 1 file, -1/+1]
| |
* | Fix massive slowdown in string formatting with str.format. [Antoine Pitrou, 2011-10-07, 1 file, -128/+24]
| |     Example: ./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)"
| |     -> before: 547 usec per loop
| |     -> after: 13 usec per loop
| |     -> 3.2: 22.5 usec per loop
| |     -> 2.7: 12.6 usec per loop
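The command-line benchmark quoted in this commit can also be run from a script with the standard timeit module. This is a rough equivalent rather than the author's exact procedure; the loop and repeat counts are arbitrary choices, and the timings will differ by machine and interpreter version.

```python
import timeit

# Same setup and statement as the command line quoted in the commit message.
setup = "f = '{}' + '-' * 1024 + '{}'; s = 'abcd' * 16384"
stmt = "f.format(s, s)"

loops = 1000
# Take the best of three runs, as the timeit command-line tool reports a best time.
best = min(timeit.repeat(stmt, setup=setup, number=loops, repeat=3))
usec_per_loop = best / loops * 1e6
print(f"{usec_per_loop:.1f} usec per loop")
```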
* | Fix compilation warnings under 64-bit Windows [Antoine Pitrou, 2011-10-06, 1 file, -1/+1]
| |
* | Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII [Victor Stinner, 2011-10-05, 1 file, -0/+34]
| |     ucs1, ucs2 and ucs4 libraries have to scan created substring to find the maximum character, whereas it is not need to ASCII strings. Because ASCII strings are common, it is useful to optimize ASCII.
* | Mark PyUnicode_FromUCS[124] as private [Victor Stinner, 2011-09-28, 3 files, -3/+3]
| |
* | Implement PEP 393. [Martin v. Löwis, 2011-09-28, 15 files, -1764/+386]
| |
* | Issue #1621: Fix undefined behaviour from signed overflow in datetime module hashes, array and list iterations, and get_integer (stringlib/string_format.h) [Mark Dickinson, 2011-09-25, 1 file, -8/+6]