path: root/Objects/unicodeobject.c
Commit message [Author, Age, Files, Lines]
* Issue #17043: The unicode-internal decoder no longer reads past the end of the input buffer. [Serhiy Storchaka, 2013-02-07, 1 file, -27/+24]
* Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. [Serhiy Storchaka, 2013-01-29, 1 file, -51/+28]
|
* Issue #10156: In the interpreter's initialization phase, unicode globals are now initialized dynamically as needed. [Serhiy Storchaka, 2013-01-26, 1 file, -52/+45]
* Issue #16335: Fix integer overflow in unicode-escape decoder. [Serhiy Storchaka, 2013-01-21, 1 file, -1/+2]
|
* Issue #15989: Fix several occurrences of integer overflow when the result of PyLong_AsLong() is narrowed to int without checks. This is a backport of changesets 13e2e44db99d and 525407d89277. [Serhiy Storchaka, 2013-01-19, 1 file, -2/+2]
* Issue #14850: The charmap decoder now treats U+FFFE as "undefined mapping" in any mapping, not only in a Unicode string. [Serhiy Storchaka, 2013-01-15, 1 file, -21/+25]
* Issue #11461: Fix the incremental UTF-16 decoder. Original patch by Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP characters. [Serhiy Storchaka, 2013-01-08, 1 file, -1/+4]
* Fix out-of-bounds read in UTF-32 decoder on "narrow Unicode" builds. [Serhiy Storchaka, 2013-01-08, 1 file, -1/+1]
|
* Issue #16455: On FreeBSD and Solaris, if the locale is C, the ASCII/surrogateescape codec is now used, instead of the locale encoding, to decode the command line arguments. This change fixes inconsistencies with os.fsencode() and os.fsdecode() because these operating systems announce an ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice. [Victor Stinner, 2013-01-03, 1 file, -4/+4]
* Fix the internals of our hash functions to use unsigned values during hash computation, as the overflow behavior of signed integers is undefined. In practice we require compiling everything with -fwrapv, which forces overflow to be defined as two's complement, but this keeps the code cleaner for checkers or in the case where someone has compiled it without -fwrapv or their compiler's equivalent. Found by Clang trunk's Undefined Behavior Sanitizer (UBSan). Cleanup only - no functionality or hash values change. [Gregory P. Smith, 2012-12-11, 1 file, -1/+1]
* Issue #16416: On Mac OS X, operating system data are now always encoded/decoded to/from UTF-8/surrogateescape, instead of the locale encoding (which may be ASCII if no locale environment variable is set), to avoid inconsistencies with os.fsencode() and os.fsdecode() functions which are already using UTF-8/surrogateescape. [Victor Stinner, 2012-12-03, 1 file, -4/+5]
* initialize more global type objects (closes #16369) [Benjamin Peterson, 2012-10-31, 1 file, -0/+6]
|
* Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting. [Mark Dickinson, 2012-10-28, 1 file, -2/+2]
* Issue #14783: Improve int() docstring and also str(), range(), and slice(). This commit rewrites the docstring for int() to incorporate the documentation changes made in issue #16036. It also switches the docstrings for int(), str(), range(), and slice() to use multi-line signatures. [Chris Jerdonek, 2012-10-07, 1 file, -1/+2]
* Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings). Patch by Serhiy Storchaka. [Antoine Pitrou, 2012-09-23, 1 file, -2/+26]
* use the stricter PyMapping_Check (closes #15801) [Benjamin Peterson, 2012-08-28, 1 file, -2/+1]
|
* Fix str docstring [Nick Coghlan, 2012-08-16, 1 file, -4/+8]
|
* Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling. Patch by Serhiy Storchaka. [Antoine Pitrou, 2012-07-20, 1 file, -31/+21]
* merge 3.1 (#14509) [Benjamin Peterson, 2012-04-09, 1 file, -0/+2]
|\
| * fix build without Py_DEBUG and DNDEBUG (closes #14509) [Benjamin Peterson, 2012-04-09, 1 file, -0/+2]
| |
* | kill this terribly outdated comment [Benjamin Peterson, 2012-03-26, 1 file, -4/+0]
| |
* | merge 3.2 [Benjamin Peterson, 2012-02-21, 1 file, -0/+1]
|\ \
| |/
| * ensure no one tries to hash things before the random seed is found [Benjamin Peterson, 2012-02-21, 1 file, -0/+1]
| |
* | Merge from 3.1: Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime) in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated. The environment variable PYTHONHASHSEED and the new command line flag -R control this behavior. [Georg Brandl, 2012-02-20, 1 file, -1/+11]
|\ \
| |/
| * Issue #13703: add a way to randomize the hash values of basic types (str, bytes, datetime) in order to make algorithmic complexity attacks on (e.g.) web apps much more complicated. The environment variable PYTHONHASHSEED and the new command line flag -R control this behavior. [Georg Brandl, 2012-02-20, 1 file, -1/+11]
* | Issue #13913: normalize utf-8 codec name in UTF-8 decoder [Victor Stinner, 2012-02-14, 1 file, -1/+1]
| |
* | Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name. Patch by Hynek Schlawack. [Antoine Pitrou, 2012-01-29, 1 file, -0/+13]
* | Consolidate the occurrences of the prime used as the multiplier when hashing to a single #define instead of having several copies in several files. This excludes the Modules/ tree (datetime and expat both have a copy for their own purposes with no need for it to be the same). [Gregory P. Smith, 2012-01-14, 1 file, -1/+1]
* | fix possible if unlikely leak [Benjamin Peterson, 2011-12-20, 1 file, -1/+5]
| |
* | Issue #13093: Fix error handling on PyUnicode_EncodeDecimal(). Add tests for PyUnicode_EncodeDecimal() and PyUnicode_TransformDecimalToASCII(). Remove the unused "e" variable in replace(). [Victor Stinner, 2011-11-22, 1 file, -6/+4]
* | Issue #13333: The UTF-7 decoder now accepts lone surrogates (the encoder already accepts them). [Antoine Pitrou, 2011-11-15, 1 file, -9/+5]
* | Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character. Also fix the spelling of the null character. [Victor Stinner, 2011-09-06, 1 file, -5/+5]
* | #9200: The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds. [Ezio Melotti, 2011-08-22, 1 file, -41/+60]
* | the name of the character is actually NUL [Benjamin Peterson, 2011-08-18, 1 file, -1/+1]
| |
* | NUL -> NULL [Benjamin Peterson, 2011-08-18, 1 file, -1/+1]
| |
* | #12266: Fix str.capitalize() to correctly uppercase/lowercase titlecased and cased non-letter characters. [Ezio Melotti, 2011-08-15, 1 file, -2/+2]
* | in narrow builds, make sure to test codepoints as identifier characters (closes #12732). This fixes the use of Unicode identifiers outside the BMP in narrow builds. [Benjamin Peterson, 2011-08-13, 1 file, -8/+23]
* | Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearray/unicodeobject. [Senthil Kumaran, 2011-07-27, 1 file, -2/+2]
* | Fix closes issue12471 - wrong TypeError message when '%i' format spec was used. [Senthil Kumaran, 2011-07-04, 1 file, -3/+1]
| |
* | Issue #10914: Correctly initialize the filesystem codec when creating a new subinterpreter to fix a bootstrap issue with codecs implemented in Python, such as the ISO-8859-15 codec. Add fscodec_initialized attribute to the PyInterpreterState structure. [Victor Stinner, 2011-04-26, 1 file, -7/+22]
* | #6780: merge with 3.1. [Ezio Melotti, 2011-04-26, 1 file, -3/+10]
|\ \
| |/
| * #6780: fix starts/endswith error message to mention that tuples are accepted too. [Ezio Melotti, 2011-04-26, 1 file, -7/+14]
* | MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) [Jesus Cea, 2011-04-20, 1 file, -19/+16]
|\ \
| |/
| * startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) [Jesus Cea, 2011-04-20, 1 file, -19/+16]
| * Merged revisions 86277 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k [Eric Smith, 2010-11-06, 1 file, -2/+3]
| |     r86277 | eric.smith | 2010-11-06 15:27:37 -0400 (Sat, 06 Nov 2010) | 1 line
| |     Added more to docstrings for str.format, format_map, and __format__.
| * Merged revisions 81936 via svnmerge from svn+ssh://svn.python.org/python/branches/py3k [Georg Brandl, 2010-10-17, 1 file, -1/+2]
| |     r81936 | mark.dickinson | 2010-06-12 11:10:14 +0200 (Sat, 12 Jun 2010) | 2 lines
| |     Silence 'unused variable' gcc warning. Patch by Éric Araujo.
| * Merged revisions 84394 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k [Antoine Pitrou, 2010-09-01, 1 file, -27/+26]
| |     r84394 | antoine.pitrou | 2010-09-01 17:10:12 +0200 (Wed, 01 Sep 2010) | 4 lines
| |     Issue #7415: PyUnicode_FromEncodedObject() now uses the new buffer API properly. Patch by Stefan Behnel.
| * Merged revisions 83226-83227,83229-83232 via svnmerge from svn+ssh://svn.python.org/python/branches/py3k [Georg Brandl, 2010-08-01, 1 file, -4/+2]
| |     r83226 | georg.brandl | 2010-07-29 16:17:12 +0200 (Thu, 29 Jul 2010) | 1 line
| |     #1090076: explain the behavior of *vars* in get() better.
| |     r83227 | georg.brandl | 2010-07-29 16:23:06 +0200 (Thu, 29 Jul 2010) | 1 line
| |     Use Py_CLEAR().
| |     r83229 | georg.brandl | 2010-07-29 16:32:22 +0200 (Thu, 29 Jul 2010) | 1 line
| |     #9407: document configparser.Error.
| |     r83230 | georg.brandl | 2010-07-29 16:36:11 +0200 (Thu, 29 Jul 2010) | 1 line
| |     Use correct directive and name.
| |     r83231 | georg.brandl | 2010-07-29 16:46:07 +0200 (Thu, 29 Jul 2010) | 1 line
| |     #9397: remove mention of dbm.bsd which does not exist anymore.
| |     r83232 | georg.brandl | 2010-07-29 16:49:08 +0200 (Thu, 29 Jul 2010) | 1 line
| |     #9388: remove ERA_YEAR which is never defined in the source code.
| * Recorded merge of revisions 83444 via svnmerge from svn+ssh://svn.python.org/python/branches/py3k [Georg Brandl, 2010-08-01, 1 file, -2/+2]
| |     r83444 | georg.brandl | 2010-08-01 22:51:02 +0200 (Sun, 01 Aug 2010) | 1 line
| |     Revert r83395, it introduces test failures and is not necessary anyway since we now have to nul-terminate the string anyway.
| * Merged revisions 83395,83417 via svnmerge from svn+ssh://svn.python.org/python/branches/py3k [Georg Brandl, 2010-08-01, 1 file, -2/+2]
| |     r83395 | georg.brandl | 2010-08-01 10:49:18 +0200 (Sun, 01 Aug 2010) | 1 line
| |     #8821: do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.
| |     r83417 | georg.brandl | 2010-08-01 20:38:26 +0200 (Sun, 01 Aug 2010) | 1 line
| |     #5776: fix mistakes in python specfile. (Nobody probably uses it anyway.)