Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | Issue #17237: Fix crash in the ASCII decoder on m68k. | Antoine Pitrou | 2013-05-11 | 1 | -0/+9 |
|\ | |||||
| * | Issue #17237: Fix crash in the ASCII decoder on m68k. | Antoine Pitrou | 2013-05-11 | 1 | -0/+9 |
| | | |||||
* | | Fix uninitialized value in charmap_decode_mapping() | Victor Stinner | 2013-05-06 | 1 | -1/+1 |
| | | |||||
* | | Issue #7330: Implement width and precision (ex: "%5.3s") for the format string of PyUnicode_FromFormat() function, original patch written by Ysj Ray. | Victor Stinner | 2013-05-06 | 1 | -46/+109 |
| | | |||||
* | | Partial revert of changeset 9744b2df134c | Victor Stinner | 2013-04-18 | 1 | -5/+4 |
| | | | | | | | | | | PyUnicode_Append() cannot call directly resize_compact(): I forgot that a string can be ready *and* not compact (a legacy string can also be ready). | ||||
* | | Split PyUnicode_DecodeCharmap() into subfunction for readability | Victor Stinner | 2013-04-17 | 1 | -178/+213 |
| | | |||||
* | | Fix bug in Unicode decoders related to _PyUnicodeWriter | Victor Stinner | 2013-04-17 | 1 | -6/+14 |
| | | | | | | | | Bug introduced by changesets 7ed9993d53b4 and edf029fc9591. | ||||
* | | Fix typo in unicode_decode_call_errorhandler_writer() | Victor Stinner | 2013-04-17 | 1 | -1/+1 |
| | | | | | | | | Bug introduced by changeset 7ed9993d53b4. | ||||
* | | Close #17694: Add minimum length to _PyUnicodeWriter | Victor Stinner | 2013-04-17 | 1 | -54/+57 |
| | | | | | | | | | | | | | | | | | | | | | | | | * Add also min_char attribute to _PyUnicodeWriter structure (currently unused) * _PyUnicodeWriter_Init() has no more argument (except the writer itself): min_length and overallocate must be set explicitly * In error handlers, only enable overallocation if the replacement string is longer than 1 character * CJK decoders don't use overallocation anymore * Set min_length, instead of preallocating memory using _PyUnicodeWriter_Prepare(), in many decoders * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow | ||||
* | | Cleanup PyUnicode_Contains() | Victor Stinner | 2013-04-14 | 1 | -11/+6 |
| | | | | | | | | | | | | * No need to double-check that strings are ready: test already done by PyUnicode_FromObject() * Remove useless kind variable (use kind1 instead) | ||||
* | | Minor change: fix character in do_strip() for the ASCII case | Victor Stinner | 2013-04-14 | 1 | -2/+2 |
| | | |||||
* | | Cleanup PyUnicode_Append() | Victor Stinner | 2013-04-14 | 1 | -18/+14 |
| | | | | | | | | | | | | | | * Check also that right is a Unicode object * call directly resize_compact() instead of unicode_resize() for a more explicit error handling, and to avoid testing some properties twice (ex: unicode_modifiable()) | ||||
* | | PyUnicode_Join(): move use_memcpy test out of the loop to cleanup and optimize the code | Victor Stinner | 2013-04-14 | 1 | -20/+28 |
| | | |||||
* | | Optimize repr(str): use _PyUnicode_FastCopyCharacters() when no character is escaped | Victor Stinner | 2013-04-14 | 1 | -69/+78 |
| | | |||||
* | | Optimize ascii(str): don't encode/decode repr if repr is already ASCII | Victor Stinner | 2013-04-14 | 1 | -1/+1 |
| | | |||||
* | | Add _PyUnicodeWriter_WriteCharInline() | Victor Stinner | 2013-04-14 | 1 | -71/+35 |
| | | |||||
* | | Issue #16061: Speed up str.replace() for replacing 1-character strings. | Serhiy Storchaka | 2013-04-13 | 1 | -26/+38 |
| | | |||||
* | | Close #17693: Rewrite CJK decoders to use the _PyUnicodeWriter API instead of the legacy Py_UNICODE API. | Victor Stinner | 2013-04-11 | 1 | -0/+10 |
| | | | | | | | | Add also a new _PyUnicodeWriter_WriteChar() function. | ||||
* | | Issue #17615: On Windows (VS2010), the performance of wmemcmp() for comparing Unicode strings is not convincing. | Victor Stinner | 2013-04-09 | 1 | -9/+0 |
| | | | | | | | | For UCS2 (16-bit wchar_t type), use a dummy loop instead of wmemcmp(). The dummy loop is as fast, or a little bit faster. wchar_t is only 16 bits long on Windows. wmemcmp() is still used for 32-bit wchar_t. | ||||
* | | replace(): only call PyUnicode_DATA(u) once | Victor Stinner | 2013-04-09 | 1 | -3/+4 |
| | | |||||
* | | Write super-fast version of str.strip(), str.lstrip() and str.rstrip() for pure ASCII | Victor Stinner | 2013-04-09 | 1 | -19/+45 |
| | | |||||
* | | Don't call macros in PyUnicode_WRITE() parameters | Victor Stinner | 2013-04-09 | 1 | -2/+10 |
| | | | | | | | | PyUnicode_WRITE() expands some parameters twice or more. | ||||
* | | Fix do_strip(): don't call PyUnicode_READ() in Py_UNICODE_ISSPACE(), so that it is not called twice | Victor Stinner | 2013-04-09 | 1 | -3/+10 |
| | | |||||
* | | Fix _PyUnicode_XStrip() | Victor Stinner | 2013-04-09 | 1 | -10/+18 |
| | | | | | | | | | | | | Inline the BLOOM_MEMBER() to call PyUnicode_READ() only once (per loop iteration). Store also the length of the separator in a variable to avoid calls to PyUnicode_GET_LENGTH(). | ||||
* | | Optimize PyUnicode_DecodeCharmap() | Victor Stinner | 2013-04-09 | 1 | -7/+9 |
| | | | | | | | | | | Avoid expensive PyUnicode_READ() and PyUnicode_WRITE(), manipulate pointers instead. | ||||
* | | Optimize make_bloom_mask(), used by str.strip(), str.lstrip() and str.rstrip() | Victor Stinner | 2013-04-09 | 1 | -5/+27 |
| | | | | | | | | | | Write specialized functions per Unicode kind to avoid the expensive PyUnicode_READ() macro. | ||||
* | | Use PyUnicode_READ() instead of PyUnicode_READ_CHAR() | Victor Stinner | 2013-04-09 | 1 | -6/+22 |
| | | | | | | | | | | "PyUnicode_READ_CHAR() is less efficient than PyUnicode_READ() because it calls PyUnicode_KIND() and might call it twice." according to its documentation. | ||||
* | | Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings: cp037, cp500 and iso8859_1 codecs | Victor Stinner | 2013-04-09 | 1 | -1/+26 |
| | | |||||
* | | Issue #17615: Comparing two Unicode strings now uses wmemcmp() when possible | Victor Stinner | 2013-04-08 | 1 | -0/+22 |
| | | | | | | | | | | wmemcmp() is twice as fast as a dummy loop (342 usec vs 744 usec) on Fedora 18/x86_64, GCC 4.7.2. | ||||
* | | Issue #17615: Expand expensive PyUnicode_READ() macro in unicode_compare(): write specialized functions for each combination of Unicode kinds. | Victor Stinner | 2013-04-08 | 1 | -17/+77 |
| | | |||||
* | | fix unused variable | Victor Stinner | 2013-04-03 | 1 | -1/+0 |
| | | |||||
* | | Close #16757: Avoid calling the expensive _PyUnicode_FindMaxChar() function when possible | Victor Stinner | 2013-04-03 | 1 | -7/+10 |
| | | |||||
* | | Add _PyUnicodeWriter_WriteSubstring() function | Victor Stinner | 2013-04-02 | 1 | -9/+39 |
| | | | | | | | | | | | | | | | | | | Write a function to enable more optimizations: * If the substring is the whole string and overallocation is disabled, just keep a reference to the string, don't copy characters * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when possible | ||||
* | | merge | Raymond Hettinger | 2013-03-23 | 1 | -1/+4 |
|\ \ | |/ | |||||
| * | Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords. | Raymond Hettinger | 2013-03-23 | 1 | -1/+4 |
| | | |||||
* | | (Merge 3.3) _PyUnicode_Writer() now also reuses Unicode singletons: empty string and latin1 single character | Victor Stinner | 2013-03-06 | 1 | -1/+1 |
|\ \ | |/ | |||||
| * | _PyUnicode_Writer() now also reuses Unicode singletons: empty string and latin1 single character | Victor Stinner | 2013-03-06 | 1 | -1/+1 |
| | | |||||
* | | Backed out changeset b9f7b1bf36aa | Victor Stinner | 2013-03-06 | 1 | -12/+7 |
| | | |||||
* | | Issue #17223: Fix PyUnicode_FromUnicode() on Windows (16-bit wchar_t type) to reject invalid UTF-16 surrogate. | Victor Stinner | 2013-03-05 | 1 | -7/+12 |
| | | |||||
* | | (Merge 3.3) Issue #17223: Fix PyUnicode_FromUnicode() for string of 1 character outside the range U+0000-U+10ffff. | Victor Stinner | 2013-02-25 | 1 | -7/+7 |
|\ \ | |/ | |||||
| * | Issue #17223: Fix PyUnicode_FromUnicode() for string of 1 character outside the range U+0000-U+10ffff. | Victor Stinner | 2013-02-25 | 1 | -7/+7 |
| | | |||||
* | | (Merge 3.3) Issue #17137: When a Unicode string is resized, the internal wide character string (wstr) format is now cleared. | Victor Stinner | 2013-02-07 | 1 | -0/+4 |
|\ \ | |/ | |||||
| * | Issue #17137: When a Unicode string is resized, the internal wide character string (wstr) format is now cleared. | Victor Stinner | 2013-02-07 | 1 | -0/+4 |
| | | |||||
* | | Issue #17043: The unicode-internal decoder no longer reads past the end of input buffer. | Serhiy Storchaka | 2013-02-07 | 1 | -26/+22 |
|\ \ | |/ | |||||
| * | Issue #17043: The unicode-internal decoder no longer reads past the end of input buffer. | Serhiy Storchaka | 2013-02-07 | 1 | -26/+22 |
| |\ | |||||
| | * | Issue #17043: The unicode-internal decoder no longer reads past the end of input buffer. | Serhiy Storchaka | 2013-02-07 | 1 | -27/+24 |
| | | | |||||
* | | | Issue #16971: Fix a refleak in the charmap decoder. | Serhiy Storchaka | 2013-01-29 | 1 | -4/+12 |
|\ \ \ | |/ / | |||||
| * | | Issue #16971: Fix a refleak in the charmap decoder. | Serhiy Storchaka | 2013-01-29 | 1 | -4/+13 |
| | | | |||||
* | | | Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. | Serhiy Storchaka | 2013-01-29 | 1 | -51/+29 |
|\ \ \ | |/ / | |||||
| * | | Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. | Serhiy Storchaka | 2013-01-29 | 1 | -52/+30 |
| |\ \ | | |/ |