path: root/Objects/unicodeobject.c
Commit message [Author, Age, Files, Lines]
* use PyMem_NEW to detect overflow (closes #23362) [Benjamin Peterson, 2015-03-02, 1, -1/+1]
|
* Issue #23055: Fixed a buffer overflow in PyUnicode_FromFormatV. Analysis and fix by Guido Vranken. [Serhiy Storchaka, 2015-01-27, 1, -0/+2]
|
* Fixed signed/unsigned comparison warning [Antoine Pitrou, 2014-10-15, 1, -1/+1]
|
* it suffices to check for PY_SSIZE_T_MAX overflow (#22643) [Benjamin Peterson, 2014-10-15, 1, -3/+2]
|
* make sure length is unsigned [Benjamin Peterson, 2014-10-15, 1, -1/+1]
|
* fix integer overflow in unicode case operations (closes #22643) [Benjamin Peterson, 2014-10-15, 1, -0/+5]
|
* prevent overflow in unicode_repr (closes #22520) [Benjamin Peterson, 2014-09-30, 1, -11/+17]
|
* cleanup overflowing handling in unicode_decode_call_errorhandler and unicode_encode_ucs1 (closes #22518) [Benjamin Peterson, 2014-09-29, 1, -18/+56]
|
* Make the various iterators' "setstate" silently and consistently clip the index. This avoids the possibility of setting an iterator to an invalid state. [Kristján Valur Jónsson, 2014-03-05, 1, -3/+7]
|
* Issue #19619: Blacklist non-text codecs in method API [Serhiy Storchaka, 2014-02-24, 1, -2/+2]
|   str.encode, bytes.decode and bytearray.decode now use an internal API to throw LookupError for known non-text encodings, rather than attempting the encoding or decoding operation and then throwing a TypeError for an unexpected output type.
|   The latter mechanism remains in place for third party non-text encodings. Backported changeset d68df99d7a57.
* give non-iterable TypeError a message (closes #20507) [Benjamin Peterson, 2014-02-15, 1, -1/+1]
|
* Issue #20437: Fixed 21 potential bugs when deleting object references. [Serhiy Storchaka, 2014-02-09, 1, -8/+4]
|
* Issue #20538: UTF-7 incremental decoder produced inconsistent string when input was truncated in BASE64 section. [Serhiy Storchaka, 2014-02-08, 1, -1/+9]
|
* Issue #19279: UTF-7 decoder no longer produces illegal strings. [Serhiy Storchaka, 2013-10-19, 1, -0/+2]
|
* Silence compiler warning about an uninitialized variable [Raymond Hettinger, 2013-08-04, 1, -1/+1]
|
* Check return value of PyType_Ready(&EncodingMapType) [Christian Heimes, 2013-07-20, 1, -1/+2]
|   CID 486654
* Issue #18184: PyUnicode_FromFormat() and PyUnicode_FromFormatV() now raise OverflowError when an argument of %c format is out of range. [Serhiy Storchaka, 2013-06-23, 1, -2/+7]
|
* remove MAX_MAXCHAR because it's unsafe for computing maximum codepoint value (see #18183) [Benjamin Peterson, 2013-06-10, 1, -31/+26]
|
* Issue #17237: Fix crash in the ASCII decoder on m68k. [Antoine Pitrou, 2013-05-11, 1, -0/+9]
|
* Issue 17447: Clarify that str.isidentifier doesn't check for reserved keywords. [Raymond Hettinger, 2013-03-23, 1, -1/+4]
|
* _PyUnicode_Writer() now also reuses Unicode singletons: empty string and latin1 single character [Victor Stinner, 2013-03-06, 1, -1/+1]
|
* Issue #17223: Fix PyUnicode_FromUnicode() for string of 1 character outside the range U+0000-U+10ffff. [Victor Stinner, 2013-02-25, 1, -7/+7]
|
* Issue #17137: When a Unicode string is resized, the internal wide character string (wstr) format is now cleared. [Victor Stinner, 2013-02-07, 1, -0/+4]
|
* Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer. [Serhiy Storchaka, 2013-02-07, 1, -26/+22]
|\
| * Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer. [Serhiy Storchaka, 2013-02-07, 1, -27/+24]
| |
* | Issue #16971: Fix a refleak in the charmap decoder. [Serhiy Storchaka, 2013-01-29, 1, -4/+13]
| |
* | Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. [Serhiy Storchaka, 2013-01-29, 1, -52/+30]
|\ \
| |/
| * Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. [Serhiy Storchaka, 2013-01-29, 1, -51/+28]
| |
* | Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed. [Serhiy Storchaka, 2013-01-26, 1, -90/+73]
|\ \
| |/
| * Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed. [Serhiy Storchaka, 2013-01-26, 1, -52/+45]
| |
* | Issue #16980: Fix processing of escaped non-ascii bytes in the unicode-escape-decode decoder. [Serhiy Storchaka, 2013-01-25, 1, -1/+1]
| |
* | Issue #16335: Fix integer overflow in unicode-escape decoder. [Serhiy Storchaka, 2013-01-21, 1, -1/+2]
|\ \
| |/
| * Issue #16335: Fix integer overflow in unicode-escape decoder. [Serhiy Storchaka, 2013-01-21, 1, -1/+2]
| |
* | Issue #15989: Fix several occurrences of integer overflow when the result of PyLong_AsLong() is narrowed to int without checks. [Serhiy Storchaka, 2013-01-19, 1, -2/+2]
|\ \
| |/
| |   This is a backport of changesets 13e2e44db99d and 525407d89277.
| * Issue #15989: Fix several occurrences of integer overflow when the result of PyLong_AsLong() is narrowed to int without checks. [Serhiy Storchaka, 2013-01-19, 1, -2/+2]
| |   This is a backport of changesets 13e2e44db99d and 525407d89277.
* | Issue #14850: Now a charmap decoder treats U+FFFE as "undefined mapping" in any mapping, not only in a Unicode string. [Serhiy Storchaka, 2013-01-15, 1, -19/+22]
|\ \
| |/
| * Issue #14850: Now a charmap decoder treats U+FFFE as "undefined mapping" in any mapping, not only in a Unicode string. [Serhiy Storchaka, 2013-01-15, 1, -21/+25]
| |
* | correct static string clearing loop (closes #16906) [Benjamin Peterson, 2013-01-09, 1, -6/+9]
| |
* | Issue #11461: Fix the incremental UTF-16 decoder. Original patch by Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP characters. [Serhiy Storchaka, 2013-01-08, 1, -1/+4]
|\ \
| |/
| * Issue #11461: Fix the incremental UTF-16 decoder. Original patch by Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP characters. [Serhiy Storchaka, 2013-01-08, 1, -1/+4]
| |
| * Fix out of bound read in UTF-32 decoder on "narrow Unicode" builds. [Serhiy Storchaka, 2013-01-08, 1, -1/+1]
| |
* | Issue #16856: Fix a segmentation fault from calling repr() on a dict with a key whose repr raises an exception. [Serhiy Storchaka, 2013-01-04, 1, -1/+1]
| |
* | (Merge 3.2) Issue #16455: On FreeBSD and Solaris, if the locale is C, the ASCII/surrogateescape codec is now used, instead of the locale encoding, to decode the command line arguments. [Victor Stinner, 2013-01-03, 1, -4/+4]
|\ \
| |/
| |   This change fixes inconsistencies with os.fsencode() and os.fsdecode() because these operating systems announce an ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
| * Issue #16455: On FreeBSD and Solaris, if the locale is C, the ASCII/surrogateescape codec is now used, instead of the locale encoding, to decode the command line arguments. [Victor Stinner, 2013-01-03, 1, -4/+4]
| |   This change fixes inconsistencies with os.fsencode() and os.fsdecode() because these operating systems announce an ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
* | Fix the internals of our hash functions to use unsigned values during hash computation as the overflow behavior of signed integers is undefined. [Gregory P. Smith, 2012-12-11, 1, -1/+1]
|\ \
| |/
| |   NOTE: This change is smaller compared to 3.2 as much of this cleanup had already been done. I added the comment that my change in 3.2 added so that the code would match up. Otherwise this just adds or synchronizes appropriate UL designations on some constants to be pedantic.
| |   In practice we require compiling everything with -fwrapv which forces overflow to be defined as two's complement but this keeps the code cleaner for checkers or in the case where someone has compiled it without -fwrapv or their compiler's equivalent.
| |   Found by Clang trunk's Undefined Behavior Sanitizer (UBSan). Cleanup only - no functionality or hash values change.
| * Fix the internals of our hash functions to use unsigned values during hash computation as the overflow behavior of signed integers is undefined. [Gregory P. Smith, 2012-12-11, 1, -1/+1]
| |   In practice we require compiling everything with -fwrapv which forces overflow to be defined as two's complement but this keeps the code cleaner for checkers or in the case where someone has compiled it without -fwrapv or their compiler's equivalent.
| |   Found by Clang trunk's Undefined Behavior Sanitizer (UBSan). Cleanup only - no functionality or hash values change.
* | (Merge 3.2) Issue #16416: On Mac OS X, operating system data are now always encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding (which may be ASCII if no locale environment variable is set), to avoid inconsistencies with os.fsencode() and os.fsdecode() functions which are already using UTF-8/surrogateescape. [Victor Stinner, 2012-12-03, 1, -4/+5]
|\ \
| |/
| * Issue #16416: On Mac OS X, operating system data are now always encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding (which may be ASCII if no locale environment variable is set), to avoid inconsistencies with os.fsencode() and os.fsdecode() functions which are already using UTF-8/surrogateescape. [Victor Stinner, 2012-12-03, 1, -4/+5]
| |
* | Issue #16215: Fix potential double memory free in str.replace(). Patch by Serhiy Storchaka. [Antoine Pitrou, 2012-11-17, 1, -0/+2]
| |
* | #8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti. [Ezio Melotti, 2012-11-04, 1, -6/+4]
| |