path: root/Objects/unicodeobject.c
Commit message [Author, Age, Files, Lines]
* Close #18780: %-formatting now prints value for int subclasses with %d, %i, and %u codes. [Ethan Furman, 2013-08-31, 1 file, -5/+3]
* Issue #18722: Remove uses of the "register" keyword in C code. [Antoine Pitrou, 2013-08-13, 1 file, -13/+13]
|
* merge [Raymond Hettinger, 2013-08-04, 1 file, -1/+1]
|\
| * Silence compiler warning about an uninitialized variable [Raymond Hettinger, 2013-08-04, 1 file, -1/+1]
| |
* | Check return value of PyType_Ready(&EncodingMapType) [Christian Heimes, 2013-07-20, 1 file, -1/+2]
|\ \
| |/
| |   CID 486654
| * Check return value of PyType_Ready(&EncodingMapType) [Christian Heimes, 2013-07-20, 1 file, -1/+2]
| |   CID 486654
* | Issue #18408: Don't check unicode consistency in _PyUnicode_HAS_UTF8_MEMORY() and _PyUnicode_HAS_WSTR_MEMORY() macros [Victor Stinner, 2013-07-15, 1 file, -4/+2]
| |   These macros are called in unicode_dealloc(), whereas the unicode object can be "inconsistent" if the creation of the object failed. For example, when unicode_subtype_new() fails on a memory allocation, _PyUnicode_CheckConsistency() fails with an assertion error because data is NULL.
* | Issue #18408: _PyUnicodeWriter_Finish() now clears its buffer attribute in all cases, so _PyUnicodeWriter_Dealloc() can be called after finish. [Victor Stinner, 2013-07-08, 1 file, -3/+6]
* | Issue #18408: Fix _PyUnicodeWriter_Finish(): clear writer->buffer, so _PyUnicodeWriter_Dealloc() can be called on the writer after finish. [Victor Stinner, 2013-07-08, 1 file, -2/+5]
* | Issue #18203: Fix _Py_DecodeUTF8_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar() [Victor Stinner, 2013-07-07, 1 file, -2/+2]
* | Issue #18203: Replace malloc() with PyMem_RawMalloc() at Python initialization [Victor Stinner, 2013-07-07, 1 file, -3/+3]
| |   * Replace malloc() with PyMem_RawMalloc()
| |   * Replace PyMem_Malloc() with PyMem_RawMalloc() where the GIL is not held.
| |   * _Py_char2wchar() now returns a buffer allocated by PyMem_RawMalloc(), instead of PyMem_Malloc()
* | Fix ref leak in error case of unicode find, count, formatlong [Christian Heimes, 2013-06-29, 1 file, -3/+11]
| |   CID 983315: Resource leak (RESOURCE_LEAK)
| |   CID 983316: Resource leak (RESOURCE_LEAK)
| |   CID 983317: Resource leak (RESOURCE_LEAK)
* | Fix ref leak in error case of unicode index [Christian Heimes, 2013-06-29, 1 file, -2/+6]
| |   CID 983319 (#1 of 2): Resource leak (RESOURCE_LEAK)
| |   leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Fix ref leak in error case of unicode rindex and rfind [Christian Heimes, 2013-06-29, 1 file, -4/+12]
| |   CID 983320: Resource leak (RESOURCE_LEAK)
| |   CID 983321: Resource leak (RESOURCE_LEAK)
| |   leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Fix memory leak in endswith [Christian Heimes, 2013-06-29, 1 file, -1/+1]
| |   CID 1040368 (#1 of 1): Resource leak (RESOURCE_LEAK)
| |   leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Issue #18184: PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise OverflowError when an argument of %c format is out of range. [Serhiy Storchaka, 2013-06-23, 1 file, -1/+1]
|\ \
| |/
| * Issue #18184: PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise OverflowError when an argument of %c format is out of range. [Serhiy Storchaka, 2013-06-23, 1 file, -2/+7]
* | merge 3.3 (#18183) [Benjamin Peterson, 2013-06-10, 1 file, -24/+19]
|\ \ | |/
| * remove MAX_MAXCHAR because it's unsafe for computing maximum codepoint value (see #18183) [Benjamin Peterson, 2013-06-10, 1 file, -31/+26]
* | Issue #9566: Fix compiler warning on Windows 64-bit [Victor Stinner, 2013-06-04, 1 file, -3/+5]
| |
* | Issue #17237: Fix crash in the ASCII decoder on m68k. [Antoine Pitrou, 2013-05-11, 1 file, -0/+9]
|\ \ | |/
| * Issue #17237: Fix crash in the ASCII decoder on m68k. [Antoine Pitrou, 2013-05-11, 1 file, -0/+9]
| |
* | Fix uninitialized value in charmap_decode_mapping() [Victor Stinner, 2013-05-06, 1 file, -1/+1]
| |
* | Issue #7330: Implement width and precision (ex: "%5.3s") for the format string of PyUnicode_FromFormat() function, original patch written by Ysj Ray. [Victor Stinner, 2013-05-06, 1 file, -46/+109]
* | Partial revert of changeset 9744b2df134c [Victor Stinner, 2013-04-18, 1 file, -5/+4]
| |   PyUnicode_Append() cannot call resize_compact() directly: I forgot that a string can be ready *and* not compact (a legacy string can also be ready).
* | Split PyUnicode_DecodeCharmap() into subfunctions for readability [Victor Stinner, 2013-04-17, 1 file, -178/+213]
| |
* | Fix bug in Unicode decoders related to _PyUnicodeWriter [Victor Stinner, 2013-04-17, 1 file, -6/+14]
| |   Bug introduced by changesets 7ed9993d53b4 and edf029fc9591.
* | Fix typo in unicode_decode_call_errorhandler_writer() [Victor Stinner, 2013-04-17, 1 file, -1/+1]
| |   Bug introduced by changeset 7ed9993d53b4.
* | Close #17694: Add minimum length to _PyUnicodeWriter [Victor Stinner, 2013-04-17, 1 file, -54/+57]
| |   * Also add a min_char attribute to the _PyUnicodeWriter structure (currently unused)
| |   * _PyUnicodeWriter_Init() no longer takes any argument except the writer itself: min_length and overallocate must be set explicitly
| |   * In error handlers, only enable overallocation if the replacement string is longer than 1 character
| |   * CJK decoders don't use overallocation anymore
| |   * Set min_length, instead of preallocating memory using _PyUnicodeWriter_Prepare(), in many decoders
| |   * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
* | Cleanup PyUnicode_Contains() [Victor Stinner, 2013-04-14, 1 file, -11/+6]
| |   * No need to double-check that strings are ready: test already done by PyUnicode_FromObject()
| |   * Remove useless kind variable (use kind1 instead)
* | Minor change: fix character in do_strip() for the ASCII case [Victor Stinner, 2013-04-14, 1 file, -2/+2]
| |
* | Cleanup PyUnicode_Append() [Victor Stinner, 2013-04-14, 1 file, -18/+14]
| |   * Also check that right is a Unicode object
| |   * Call resize_compact() directly instead of unicode_resize() for more explicit error handling, and to avoid testing some properties twice (ex: unicode_modifiable())
* | PyUnicode_Join(): move use_memcpy test out of the loop to cleanup and optimize the code [Victor Stinner, 2013-04-14, 1 file, -20/+28]
* | Optimize repr(str): use _PyUnicode_FastCopyCharacters() when no character is escaped [Victor Stinner, 2013-04-14, 1 file, -69/+78]
* | Optimize ascii(str): don't encode/decode repr if repr is already ASCII [Victor Stinner, 2013-04-14, 1 file, -1/+1]
| |
* | Add _PyUnicodeWriter_WriteCharInline() [Victor Stinner, 2013-04-14, 1 file, -71/+35]
| |
* | Issue #16061: Speed up str.replace() for replacing 1-character strings. [Serhiy Storchaka, 2013-04-13, 1 file, -26/+38]
| |
* | Close #17693: Rewrite CJK decoders to use the _PyUnicodeWriter API instead of the legacy Py_UNICODE API. [Victor Stinner, 2013-04-11, 1 file, -0/+10]
| |   Also add a new _PyUnicodeWriter_WriteChar() function.
* | Issue #17615: On Windows (VS2010), the performance of wmemcmp() for comparing Unicode strings is not convincing. [Victor Stinner, 2013-04-09, 1 file, -9/+0]
| |   For UCS2 (16-bit wchar_t type), use a dummy loop instead of wmemcmp(). The dummy loop is as fast, or a little bit faster. wchar_t is only 16 bits wide on Windows. wmemcmp() is still used for 32-bit wchar_t.
* | replace(): only call PyUnicode_DATA(u) once [Victor Stinner, 2013-04-09, 1 file, -3/+4]
| |
* | Write super-fast version of str.strip(), str.lstrip() and str.rstrip() for pure ASCII [Victor Stinner, 2013-04-09, 1 file, -19/+45]
* | Don't call macros in PyUnicode_WRITE() parameters [Victor Stinner, 2013-04-09, 1 file, -2/+10]
| |   PyUnicode_WRITE() expands some parameters twice or more.
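The hazard named in this commit message can be illustrated with a minimal sketch in plain C. DEMO_WRITE_CLAMPED, demo_read() and both helper functions are hypothetical names invented for this illustration; the macro is a simplified stand-in and not the real PyUnicode_WRITE(). It only shows what happens when a macro mentions a parameter more than once and the caller passes a call expression as that parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in, NOT the real PyUnicode_WRITE(): it names `value`
   twice, so the argument expression is evaluated twice at run time. */
#define DEMO_WRITE_CLAMPED(buf, index, value) \
    ((buf)[(index)] = ((value) > 0xFFu ? 0xFFu : (value)))

static int demo_read_calls = 0;

/* Counts its own calls so that repeated evaluation becomes visible. */
static unsigned int demo_read(const unsigned int *src, size_t i)
{
    assert(src != NULL);
    demo_read_calls++;
    return src[i];
}

/* Passes a call expression straight into the macro. */
static int demo_calls_when_nested(void)
{
    const unsigned int src[1] = {0x41u};
    unsigned int dst[1] = {0u};

    demo_read_calls = 0;
    DEMO_WRITE_CLAMPED(dst, 0, demo_read(src, 0));
    return demo_read_calls;
}

/* Reads into a local variable first, then passes the variable. */
static int demo_calls_when_hoisted(void)
{
    const unsigned int src[1] = {0x41u};
    unsigned int dst[1] = {0u};
    unsigned int ch;

    demo_read_calls = 0;
    ch = demo_read(src, 0);
    DEMO_WRITE_CLAMPED(dst, 0, ch);
    return demo_read_calls;
}
```

In this sketch the nested form calls demo_read() twice and the hoisted form calls it once, which is the kind of duplication that storing the value in a local variable before invoking the macro avoids.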
* | Fix do_strip(): don't call PyUnicode_READ() inside Py_UNICODE_ISSPACE(), to avoid calling it twice [Victor Stinner, 2013-04-09, 1 file, -3/+10]
* | Fix _PyUnicode_XStrip() [Victor Stinner, 2013-04-09, 1 file, -10/+18]
| |   Inline BLOOM_MEMBER() to call PyUnicode_READ() only once per loop iteration. Also store the length of the separator in a variable to avoid calls to PyUnicode_GET_LENGTH().
* | Optimize PyUnicode_DecodeCharmap() [Victor Stinner, 2013-04-09, 1 file, -7/+9]
| |   Avoid expensive PyUnicode_READ() and PyUnicode_WRITE(), manipulate pointers instead.
* | Optimize make_bloom_mask(), used by str.strip(), str.lstrip() and str.rstrip() [Victor Stinner, 2013-04-09, 1 file, -5/+27]
| |   Write specialized functions per Unicode kind to avoid the expensive PyUnicode_READ() macro.
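The idea of one specialized mask builder per character width can be sketched in standalone C as follows. Everything here is an assumption made for illustration: the DEMO_ names, the fixed 32-bit mask width, and the use of uint8_t, uint16_t and uint32_t in place of CPython's own character types. It is not the code from this commit, only an example of the general technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOOM_WIDTH 32  /* assumed mask width; must be a power of two */

/* Generates one mask builder per element type, so each loop reads through
   a typed pointer instead of dispatching on the character width. */
#define DEMO_DEFINE_BLOOM(name, type)                                   \
    static unsigned long name(const type *chars, size_t len)            \
    {                                                                   \
        unsigned long mask = 0;                                         \
        size_t i;                                                       \
        assert(chars != NULL || len == 0);                              \
        for (i = 0; i < len; i++)                                       \
            mask |= 1UL << (chars[i] & (DEMO_BLOOM_WIDTH - 1));         \
        return mask;                                                    \
    }

DEMO_DEFINE_BLOOM(demo_bloom_ucs1, uint8_t)
DEMO_DEFINE_BLOOM(demo_bloom_ucs2, uint16_t)
DEMO_DEFINE_BLOOM(demo_bloom_ucs4, uint32_t)

/* A clear bit proves the character is absent from the set; a set bit only
   means "possibly present", so callers still need an exact check. */
static int demo_bloom_maybe_contains(unsigned long mask, uint32_t ch)
{
    return (mask & (1UL << (ch & (DEMO_BLOOM_WIDTH - 1)))) != 0;
}
```

With a mask built from space and tab, a lookup for 'a' is rejected immediately, while '@' (64, which shares its low five bits with space) passes the filter as a false positive and would need the exact comparison.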
* | Use PyUnicode_READ() instead of PyUnicode_READ_CHAR() [Victor Stinner, 2013-04-09, 1 file, -6/+22]
| |   "PyUnicode_READ_CHAR() is less efficient than PyUnicode_READ() because it calls PyUnicode_KIND() and might call it twice." according to its documentation.
* | Add fast-path in PyUnicode_DecodeCharmap() for pure 8 bit encodings: cp037, cp500 and iso8859_1 codecs [Victor Stinner, 2013-04-09, 1 file, -1/+26]
* | Issue #17615: Comparing two Unicode strings now uses wmemcmp() when possible [Victor Stinner, 2013-04-08, 1 file, -0/+22]
| |   wmemcmp() is twice as fast as a dummy loop (342 usec vs 744 usec) on Fedora 18/x86_64, GCC 4.7.2.
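The two strategies compared in this commit message can be sketched side by side. The function names below are hypothetical and the code is a standalone illustration rather than the code from the commit: one function is an element-by-element loop, the other delegates to the standard wmemcmp() from <wchar.h>. Both normalize the result to -1, 0 or 1 so they can be checked against each other. The sketch makes no performance claim of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Reduces any comparison result to -1, 0 or 1. */
static int demo_sign(int c)
{
    return (c > 0) - (c < 0);
}

/* Plain loop: compares the first n wide characters one at a time. */
static int demo_compare_loop(const wchar_t *a, const wchar_t *b, size_t n)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return (a[i] < b[i]) ? -1 : 1;
    }
    return 0;
}

/* Library call: lets the C runtime compare the first n wide characters. */
static int demo_compare_wmemcmp(const wchar_t *a, const wchar_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return demo_sign(wmemcmp(a, b, n));
}
```

Both functions are expected to agree on every input; which one is faster depends on the platform's wchar_t width and C library, as the neighbouring log entries on Fedora and Windows suggest.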
* | Issue #17615: Expand expensive PyUnicode_READ() macro in unicode_compare(): write specialized functions for each combination of Unicode kinds. [Victor Stinner, 2013-04-08, 1 file, -17/+77]