path: root/Objects/unicodeobject.c
Commit message [Author, Age, Files, Lines]
* Reverted changeset b72c5573c5e7 (issue #15027). [Serhiy Storchaka, 2014-01-04, 1 file, -41/+61]
|
* Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster. [Serhiy Storchaka, 2014-01-04, 1 file, -61/+41]
|
* Remove deadcode (HASH macro is no more defined) [Victor Stinner, 2014-01-03, 1 file, -1/+0]
|
* Remove now unused variables [Victor Stinner, 2014-01-03, 1 file, -5/+0]
|
* unicode_char() uses get_latin1_char() to get latin1 singleton characters [Victor Stinner, 2014-01-03, 1 file, -0/+3]
|
* add unicode_char() in unicodeobject.c to factorize code [Victor Stinner, 2014-01-03, 1 file, -55/+31]
|
* Issue #19674: inspect.signature() now produces a correct signature [Larry Hastings, 2013-11-23, 1 file, -4/+7]
|     for some builtins.
* Issue #19730: Argument Clinic now supports all the existing PyArg [Larry Hastings, 2013-11-23, 1 file, -5/+5]
|     "format units" as legacy converters, as well as two new features:
|     "self converters" and the "version" directive.
* Issue #19619: Blacklist non-text codecs in method API [Nick Coghlan, 2013-11-22, 1 file, -2/+2]
|     str.encode, bytes.decode and bytearray.decode now use an
|     internal API to throw LookupError for known non-text encodings,
|     rather than attempting the encoding or decoding operation and
|     then throwing a TypeError for an unexpected output type.
|     The latter mechanism remains in place for third party non-text
|     encodings.
* Issue #19183: Implement PEP 456 'secure and interchangeable hash algorithm'. [Christian Heimes, 2013-11-20, 1 file, -33/+2]
|     Python now uses SipHash24 on all major platforms.
* Add _PyUnicodeWriter_WriteASCIIString() function [Victor Stinner, 2013-11-19, 1 file, -18/+72]
|
* Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. [Serhiy Storchaka, 2013-11-19, 1 file, -24/+221]
|     The utf-16* and utf-32* encoders no longer allow surrogate code points
|     (U+D800-U+DFFF) to be encoded.
|     The utf-32* decoders no longer decode byte sequences that correspond to
|     surrogate code points.
|     The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
|     Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
* Issue #19581: Change the overallocation factor of _PyUnicodeWriter on Windows [Victor Stinner, 2013-11-18, 1 file, -6/+17]
|     On Windows, a factor of 50% gives best performances.
* Argument Clinic: rename "self" to "module" for module-level functions. [Larry Hastings, 2013-11-18, 1 file, -1/+1]
|
* #17806: Added keyword-argument support for "tabsize" to str/bytes.expandtabs(). [Ezio Melotti, 2013-11-16, 1 file, -5/+9]
|
* Close #17828: better handling of codec errors [Nick Coghlan, 2013-11-13, 1 file, -9/+18]
|     - output type errors now redirect users to the type-neutral
|       convenience functions in the codecs module
|     - stateless errors that occur during encoding and decoding
|       will now be automatically wrapped in exceptions that give
|       the name of the codec involved
* _Py_normalize_encoding(): explain how the value 6 was computed [Victor Stinner, 2013-11-07, 1 file, -0/+1]
|
* Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" [Victor Stinner, 2013-11-07, 1 file, -0/+2]
|     if the input string is NULL
* Issue #19512: add _PyUnicode_CompareWithId() function [Victor Stinner, 2013-11-06, 1 file, -0/+9]
|     _PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
|     when both strings are equal and interned.
|     Add also _PyId_builtins identifier for "builtins" common string.
* Issue #19424: PyUnicode_CompareWithASCIIString() normalizes memcmp() result [Victor Stinner, 2013-11-04, 1 file, -2/+6]
|     to -1, 0, 1
* Issue #16286: remove duplicated identity check from unicode_compare() [Victor Stinner, 2013-11-04, 1 file, -4/+5]
|     Move the test to PyUnicode_Compare()
* Issue #16286: optimize PyUnicode_RichCompare() for identical strings (same [Victor Stinner, 2013-11-04, 1 file, -5/+19]
|     pointer) for any operator, not only Py_EQ and Py_NE.
|     Code of bytes_richcompare() and PyUnicode_RichCompare() is now closer.
* Issue #16286: write a new subfunction bytes_compare_eq() [Victor Stinner, 2013-11-04, 1 file, -5/+3]
|     * cleanup bytes_richcompare()
|     * PyUnicode_RichCompare(): replace a test with a XOR
* Issue #19424: Fix a compiler warning on comparing signed/unsigned size_t [Victor Stinner, 2013-11-03, 1 file, -1/+1]
|     Patch written by Zachary Ware.
* Issue #19424: Fix a compiler warning [Victor Stinner, 2013-10-30, 1 file, -1/+1]
|     memcmp() just takes raw pointers
* Issue #19424: Optimize PyUnicode_CompareWithASCIIString() [Victor Stinner, 2013-10-29, 1 file, -13/+30]
|     Use fast memcmp() instead of a loop using the slow PyUnicode_READ() macro.
|     strlen() is still necessary to check Unicode string containing null bytes.
* Issue #19437: Fix _PyUnicode_New() (constructor of legacy string), set all [Victor Stinner, 2013-10-29, 1 file, -11/+14]
|     attributes before checking for error. The destructor expects all attributes to
|     be set. It is now safe to call Py_DECREF(unicode) in the constructor.
* Issue #18609: Add a fast-path for "iso8859-1" encoding [Victor Stinner, 2013-10-29, 1 file, -2/+4]
|     On AIX, the locale encoding may be "iso8859-1", which was not a known syntax of
|     the legacy ISO 8859-1 encoding.
|     Using a C codec instead of a Python codec is faster but also avoids tricky
|     issues during Python startup or complex code.
* Issue #18408: Fix PyUnicode_AsUTF8AndSize(), raise MemoryError exception on [Victor Stinner, 2013-10-29, 1 file, -0/+1]
|     memory allocation failure
* Issue #1772673: The type of `char*` arguments now changed to `const char*`. [Serhiy Storchaka, 2013-10-19, 1 file, -1/+1]
|
* Issue #19279: UTF-7 decoder no more produces illegal strings. [Serhiy Storchaka, 2013-10-19, 1 file, -0/+2]
|\
| * Issue #19279: UTF-7 decoder no more produces illegal strings. [Serhiy Storchaka, 2013-10-19, 1 file, -0/+2]
| |
* | Issue #16612: Add "Argument Clinic", a compile-time preprocessor [Larry Hastings, 2013-10-19, 1 file, -17/+64]
| |     for C files to generate argument parsing code. (See PEP 436.)
* | Close #18780: %-formatting now prints value for int subclasses with %d, %i, and %u codes. [Ethan Furman, 2013-08-31, 1 file, -5/+3]
* | Issue #18722: Remove uses of the "register" keyword in C code. [Antoine Pitrou, 2013-08-13, 1 file, -13/+13]
| |
* | merge [Raymond Hettinger, 2013-08-04, 1 file, -1/+1]
|\ \
| |/
| * Silence compiler warning about an uninitialized variable [Raymond Hettinger, 2013-08-04, 1 file, -1/+1]
| |
* | Check return value of PyType_Ready(&EncodingMapType) [Christian Heimes, 2013-07-20, 1 file, -1/+2]
|\ \
| |/
| |     CID 486654
| * Check return value of PyType_Ready(&EncodingMapType) [Christian Heimes, 2013-07-20, 1 file, -1/+2]
| |     CID 486654
* | Issue #18408: Don't check unicode consistency in _PyUnicode_HAS_UTF8_MEMORY() [Victor Stinner, 2013-07-15, 1 file, -4/+2]
| |     and _PyUnicode_HAS_WSTR_MEMORY() macros
| |     These macros are called in unicode_dealloc(), whereas the unicode object can be
| |     "inconsistent" if the creation of the object failed.
| |     For example, when unicode_subtype_new() fails on a memory allocation,
| |     _PyUnicode_CheckConsistency() fails with an assertion error because data is
| |     NULL.
* | Issue #18408: _PyUnicodeWriter_Finish() now clears its buffer attribute in all [Victor Stinner, 2013-07-08, 1 file, -3/+6]
| |     cases, so _PyUnicodeWriter_Dealloc() can be called after finish.
* | Issue #18408: Fix _PyUnicodeWriter_Finish(): clear writer->buffer, [Victor Stinner, 2013-07-08, 1 file, -2/+5]
| |     so _PyUnicodeWriter_Dealloc() can be called on the writer after finish.
* | Issue #18203: Fix _Py_DecodeUTF8_surrogateescape(), use PyMem_RawMalloc() as _Py_char2wchar() [Victor Stinner, 2013-07-07, 1 file, -2/+2]
* | Issue #18203: Replace malloc() with PyMem_RawMalloc() at Python initialization [Victor Stinner, 2013-07-07, 1 file, -3/+3]
| |     * Replace malloc() with PyMem_RawMalloc()
| |     * Replace PyMem_Malloc() with PyMem_RawMalloc() where the GIL is not held.
| |     * _Py_char2wchar() now returns a buffer allocated by PyMem_RawMalloc(), instead
| |       of PyMem_Malloc()
* | Fix ref leak in error case of unicode find, count, formatlong [Christian Heimes, 2013-06-29, 1 file, -3/+11]
| |     CID 983315: Resource leak (RESOURCE_LEAK)
| |     CID 983316: Resource leak (RESOURCE_LEAK)
| |     CID 983317: Resource leak (RESOURCE_LEAK)
* | Fix ref leak in error case of unicode index [Christian Heimes, 2013-06-29, 1 file, -2/+6]
| |     CID 983319 (#1 of 2): Resource leak (RESOURCE_LEAK)
| |     leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Fix ref leak in error case of unicode rindex and rfind [Christian Heimes, 2013-06-29, 1 file, -4/+12]
| |     CID 983320: Resource leak (RESOURCE_LEAK)
| |     CID 983321: Resource leak (RESOURCE_LEAK)
| |     leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Fix memory leak in endswith [Christian Heimes, 2013-06-29, 1 file, -1/+1]
| |     CID 1040368 (#1 of 1): Resource leak (RESOURCE_LEAK)
| |     leaked_storage: Variable substring going out of scope leaks the storage it points to.
* | Issue #18184: PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise [Serhiy Storchaka, 2013-06-23, 1 file, -1/+1]
|\ \
| |/
| |     OverflowError when an argument of %c format is out of range.
| * Issue #18184: PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise [Serhiy Storchaka, 2013-06-23, 1 file, -2/+7]
| |     OverflowError when an argument of %c format is out of range.
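
The surrogate handling described in the Issue #12892 entry of this log can be observed from Python. The following is a minimal sketch, assuming an interpreter that already contains that change (CPython 3.4 or later is assumed here); the helper name `try_encode` is hypothetical and exists only for this illustration.

```python
def try_encode(text, encoding, errors="strict"):
    """Return the encoded bytes, or the exception class name if encoding fails."""
    try:
        return text.encode(encoding, errors)
    except UnicodeError as exc:
        return type(exc).__name__


# A lone (unpaired) high surrogate code point, U+D800.
LONE_SURROGATE = "\ud800"

# With the default "strict" handler, the utf-16* and utf-32* encoders refuse it.
strict_utf16 = try_encode(LONE_SURROGATE, "utf-16-le")
strict_utf32 = try_encode(LONE_SURROGATE, "utf-32-le")

# The "surrogatepass" handler lets the code point through unchanged.
passed_utf16 = try_encode(LONE_SURROGATE, "utf-16-le", "surrogatepass")
passed_utf32 = try_encode(LONE_SURROGATE, "utf-32-le", "surrogatepass")

# Decoding the utf-32 bytes back also needs "surrogatepass".
round_trip = passed_utf32.decode("utf-32-le", "surrogatepass")

print(strict_utf16, strict_utf32, passed_utf16, passed_utf32)
```

The explicit little-endian codec names are used so that no byte order mark is added to the output, which keeps the encoded bytes easy to read.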