path: root/Objects/unicodeobject.c
Commit message | Author | Age | Files | Lines
* On c.l.py, Martin v. Löwis said that Py_UNICODE could be of a signed type, so fiddle Jeremy's fix to live with that. Also added more comments. Bugfix candidate (this bug is in all versions of Python, at least since 2.1).
  Tim Peters | 2003-09-16 | 1 | -137/+145
* Double-fix of crash in Unicode freelist handling.
  If a length-1 Unicode string was in the freelist and it was uninitialized or pointed to a very large (magnitude) negative number, the check unicode_latin1[unicode->str[0]] == unicode could cause a segmentation violation, e.g. unicode->str[0] is 0xcbcbcbcb.
  Fix this in two ways:
  1. Change guard before unicode_latin1[] to test against 256U. If I understand correctly, the unsigned long used to store UCS4 on my box was getting converted to a signed long to compare with the signed constant 256.
  2. Change _PyUnicode_New() to make sure the first element of str is always initialized to zero. There are several places in the code where the caller can exit with an error before initializing any of str, which would leave junk in str[0].
  Also, silence a compiler warning on pointer vs. int arithmetic. Bug fix candidate.
  Jeremy Hylton | 2003-09-16 | 1 | -2/+7
* Change checks of PyUnicode_Resize() return value for clarity.
  The unicode_resize() family only returns -1 or 0 so simply checking for != 0 is sufficient, but somewhat unclear. Many Python API functions return < 0 on error, reserving the right to return 0 or 1 on success. Change the call sites for consistency with these calls.
  Jeremy Hylton | 2003-09-16 | 1 | -18/+17
* SF bug #795506: Wrong handling of string format code for float values.
  Adding missing support for '%F'. Will backport to 2.3.1.
  Raymond Hettinger | 2003-08-27 | 1 | -0/+3
* Fix refcounting leak in charmaptranslate_lookup()
  Walter Dörwald | 2003-08-15 | 1 | -0/+1
* Fix another refcounting leak in PyUnicode_EncodeCharmap().
  Walter Dörwald | 2003-08-15 | 1 | -1/+3
* Fix another refcounting leak (in PyUnicode_DecodeUnicodeEscape()).
  Walter Dörwald | 2003-08-15 | 1 | -0/+2
* Fix refcount leak in PyUnicode_EncodeCharmap(). The bug surfaces when an encoding error occurs and the callback name is unknown, i.e. when the callback has to be called. The problem was that the fact that the callback has already been looked up was only recorded in a local variable in charmap_encoding_error(), because charmap_encoding_error() got its own copy of the errorHandler pointer instead of a pointer to the pointer in PyUnicode_EncodeCharmap().
  Walter Dörwald | 2003-08-14 | 1 | -3/+3
* Support 'mbcs' as a 'built-in' encoding, so the C API can use it without deferring to the encodings package. As described in [ 763111 ] mbcs encoding should skip encodings package
  Mark Hammond | 2003-07-01 | 1 | -0/+19
* SF patch 703666: Several objects don't decref tmp on failure in subtype_new
  Submitted By: Christopher A. Craig
  Fill in some missing decrefs.
  Raymond Hettinger | 2003-06-28 | 1 | -1/+4
* Consider \U-escapes in raw-unicode-escape. Fixes #444514.
  Martin v. Löwis | 2003-05-18 | 1 | -3/+42
* Attempt to make all the various string *strip methods the same.
    * Doc - add doc for when functions were added
    * UserString
    * string object methods
    * string module functions
  'chars' is used for the last parameter everywhere. These changes will be backported, since part of the changes have already been made, but they were inconsistent.
  Neal Norwitz | 2003-04-10 | 1 | -9/+9
* Reformat a few docstrings that caused line wraps in help() output.
  Guido van Rossum | 2003-04-09 | 1 | -6/+6
* Change formatchar(), so that u"%c" % 0xffffffff now raises an OverflowError instead of a TypeError to be consistent with "%c" % 256. See SF patch #710127.
  Walter Dörwald | 2003-04-02 | 1 | -2/+2
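The behaviour this entry describes can still be observed from Python code. The sketch below is only an illustration, assuming a current Python 3 interpreter (where every `str` is Unicode) rather than the 2.x build the commit targets; `format_char` is a hypothetical helper name, not something from unicodeobject.c.

```python
def format_char(code_point):
    """Format an integer with %c and report the exception type on failure."""
    try:
        return "%c" % code_point
    except (OverflowError, TypeError, ValueError) as exc:
        # Return the exception class name so the caller can see which one fired.
        return type(exc).__name__


in_range = format_char(0x41)            # a valid code point
too_large = format_char(0xFFFFFFFF)     # far beyond the Unicode range

print(in_range)
print(too_large)
```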
* Sf patch #700047: unicode object leaks refcount on resizing
  Contributed by Hye-Shik Chang.
  Raymond Hettinger | 2003-03-09 | 1 | -0/+1
* Add more missing PyErr_NoMemory() after failed memory allocs
  Neal Norwitz | 2003-02-11 | 1 | -1/+1
* Fix two refcounting bugs
  Walter Dörwald | 2003-02-09 | 1 | -2/+4
* Change the treatment of positions returned by PEP 293 error handlers in the Unicode codecs: Negative positions are treated as being relative to the end of the input and out of bounds positions result in an IndexError.
  Also update the PEP and include an explanation of this in the documentation for codecs.register_error.
  Fixes a small bug in iconv_codecs: if the position from the callback is negative *add* it to the size instead of subtracting it.
  From SF patch #677429.
  Walter Dörwald | 2003-01-31 | 1 | -9/+17
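As a rough illustration of the position rules in this entry, the sketch below registers two error handlers through `codecs.register_error`. It assumes a current Python 3 interpreter; the handler names `demo.skip_to_last` and `demo.jump_too_far` and both handler functions are invented for this example.

```python
import codecs


def skip_to_last(exc):
    """Hypothetical handler: emit '?' and resume at the last input character."""
    if isinstance(exc, UnicodeEncodeError):
        # A negative position is interpreted relative to the end of the input.
        return ("?", -1)
    raise exc


def jump_too_far(exc):
    """Hypothetical handler: return a position beyond the end of the input."""
    if isinstance(exc, UnicodeEncodeError):
        return ("?", len(exc.object) + 10)
    raise exc


codecs.register_error("demo.skip_to_last", skip_to_last)
codecs.register_error("demo.jump_too_far", jump_too_far)

# The error occurs at index 0; the handler asks to resume at index len - 1.
resumed = "\xe9abc".encode("ascii", "demo.skip_to_last")

try:
    "\xe9abc".encode("ascii", "demo.jump_too_far")
    out_of_bounds = None
except IndexError as err:
    out_of_bounds = err

print(resumed)
print(type(out_of_bounds).__name__)
```

Because the first handler skips ahead, the characters between the error and the resume position are dropped from the output.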
* Implement appropriate __getnewargs__ for all immutable subclassable builtin types. The special handling for these can now be removed from save_newobj(). Add some testing for this.
  Also add support for setting the 'fast' flag on the Python Pickler class, which suppresses use of the memo.
  Guido van Rossum | 2003-01-29 | 1 | -0/+9
* Fix charmapencode_lookup(), so that a None value in the mapping is treated as "character maps to <undefined>" and not as "character mapping must return integer, None or str".
  Walter Dörwald | 2003-01-08 | 1 | -0/+2
* Remove variable owned from PyUnicode_FromEncodedObject, which is unused (except for Py_DECREF calls) since the introduction of __unicode__.
  Walter Dörwald | 2003-01-08 | 1 | -7/+0
* Patch for bug #659709: bogus computation of float length
  Python 2.2.x backport candidate. (This bug has been around since Python 1.6.)
  Marc-André Lemburg | 2002-12-29 | 1 | -10/+21
* Add nb_remainder (i.e. __mod__) slot to unicode type. Fixes SF bug #615506.
  Neil Schemenauer | 2002-11-18 | 1 | -2/+21
* Fix SF # 635969, No error "not all arguments converted"
  When mwh added extended slicing, strings and unicode became mappings. Thus, dict was set which prevented an error when doing:
      newstr = 'format without a percent' % string_value
  This fix raises an exception again when there are no formats and % with a string value.
  Neal Norwitz | 2002-11-12 | 1 | -1/+2
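The restored error can be reproduced from Python code. This is a minimal sketch assuming a current Python 3 interpreter; `try_format` is a hypothetical helper. It also shows a real mapping on the right-hand side, which is still accepted when the template has no format codes.

```python
def try_format(template, value):
    """Apply %-formatting and return either the result or the error text."""
    try:
        return template % value
    except TypeError as exc:
        return "TypeError: %s" % exc


# A string argument with no format codes to consume it is an error.
with_string = try_format("format without a percent", "some string")

# A genuine mapping argument is allowed even when no key is looked up.
with_mapping = try_format("format without a percent", {"key": "value"})

print(with_string)
print(with_mapping)
```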
* Fix for bug #626172: crash using unicode latin1 single char
  Python 2.2.3 candidate.
  Marc-André Lemburg | 2002-10-23 | 1 | -3/+1
* Fix a nasty endcase reported by Armin Rigo in SF bug 618623: '%2147483647d' % -123 segfaults. This was because an integer overflow in a comparison caused the string resize to be skipped. After fixing the overflow, this could call _PyString_Resize() with a negative size, so I (1) test for that and raise MemoryError instead; (2) also added a test for negative newsize to _PyString_Resize(), raising SystemError as for all bad arguments.
  An identical bug existed in unicodeobject.c, of course.
  Will backport to 2.2.2.
  Guido van Rossum | 2002-10-11 | 1 | -2/+6
* Add cast to avoid compiler warning.
  Marc-André Lemburg | 2002-09-24 | 1 | -1/+1
* Fix part of SF bug # 544248 gcc warning in unicodeobject.c
  When --enable-unicode=ucs4, need to cast Py_UNICODE to a char
  Neal Norwitz | 2002-09-13 | 1 | -1/+1
* Fix warnings on 64-bit platforms about casts from pointers to ints.
  Two of these were real bugs.
  Guido van Rossum | 2002-09-12 | 1 | -1/+2
* Change the unicode.translate docstring to document that Unicode strings (with arbitrary length) are allowed as entries in the unicode.translate mapping.
  Add a test case for multicharacter replacements.
  (Multicharacter replacements were enabled by the PEP 293 patch)
  Walter Dörwald | 2002-09-04 | 1 | -2/+3
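A multicharacter replacement of the kind this entry documents might look like the sketch below. It assumes a current Python 3 interpreter, where `str.translate` plays the role of unicode.translate; the mapping contents are made up for illustration.

```python
# Keys are code point ordinals. A value may be a string of any length,
# and None deletes the character.
table = {
    ord("ß"): "ss",   # one character replaced by two
    ord("!"): None,   # character removed
}

translated = "Straße!".translate(table)
print(translated)
```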
* PEP 293 implementation (from SF patch http://www.python.org/sf/432401)
  Walter Dörwald | 2002-09-02 | 1 | -552/+1240
* Fix SF bug 599128, submitted by Inyeol Lee: .replace() would do the wrong thing for a unicode subclass when there were zero string replacements. The example given in the SF bug report was only one way to trigger this; replacing a string of length >= 2 that's not found is another. The code would actually write outside allocated memory if replacement string was longer than the search string. (I wonder how many more of these are lurking? The unicode code base is full of wonders.)
  Bugfix candidate; this same bug is present in 2.2.1.
  Guido van Rossum | 2002-08-23 | 1 | -3/+9
* Code by Inyeol Lee, submitted to SF bug 595350, to implement the string/unicode method .replace() with a zero-length first argument. Inyeol contributed tests for this too.
  Guido van Rossum | 2002-08-23 | 1 | -14/+20
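For readers unfamiliar with the zero-length case, the sketch below shows what `.replace()` does with an empty first argument. It assumes a current Python 3 interpreter and is not taken from the tests mentioned in the entry.

```python
# An empty search string matches before every character and at the end.
everywhere = "abc".replace("", "-")

# The optional count limits how many of those positions are used,
# starting from the left.
first_two = "abc".replace("", "-", 2)

print(everywhere)
print(first_two)
```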
* Fix some endcase bugs in unicode rfind()/rindex() and endswith(). These were reported and fixed by Inyeol Lee in SF bug 595350. The endswith() bug was already fixed in 2.3, but this adds some more test cases.
  Guido van Rossum | 2002-08-20 | 1 | -3/+3
* More changes of DeprecationWarning to FutureWarning.
  Guido van Rossum | 2002-08-14 | 1 | -1/+1
* Add C API PyUnicode_FromOrdinal() which exposes unichr() at C level.
  u'%c' will now raise a ValueError in case the argument is an integer outside the valid range of Unicode code point ordinals.
  Closes SF bug #593581.
  Marc-André Lemburg | 2002-08-11 | 1 | -1/+55
* Implement stage B0 of PEP 237: add warnings for operations that currently return inconsistent results for ints and longs; in particular: hex/oct/%u/%o/%x/%X of negative short ints, and x<<n that either loses bits or changes sign. (No warnings for repr() of a long, though that will also change to lose the trailing 'L' eventually.) This introduces some warnings in the test suite; I'll take care of those later.
  Guido van Rossum | 2002-08-11 | 1 | -0/+10
* Unicode replace() method with empty pattern argument should fail, like it does for 8-bit strings.
  Guido van Rossum | 2002-08-09 | 1 | -0/+5
* PyUnicode_Contains(): The memcmp() call didn't take into account the width of Py_UNICODE. Good catch, MAL.
  Barry Warsaw | 2002-08-06 | 1 | -1/+1
* Committing patch #591250 which provides "str1 in str2" when str1 is a string longer than 1 character.
  Barry Warsaw | 2002-08-06 | 1 | -17/+23
* tighten up the unicode object's docstring a tad
  Skip Montanaro | 2002-07-26 | 1 | -2/+2
* staticforward bites the dust.
  The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.)
  I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it.
  XXX I haven't updated the documentation.
  Jeremy Hylton | 2002-07-17 | 1 | -1/+1
* Patch #569753: Remove support for WIN16.
  Rename all occurrences of MS_WIN32 to MS_WINDOWS.
  Martin v. Löwis | 2002-06-30 | 1 | -3/+3
* Fix typo in exception message
  Neal Norwitz | 2002-06-13 | 1 | -1/+1
* Patch #568124: Add doc string macros.
  Martin v. Löwis | 2002-06-13 | 1 | -74/+74
* This is my nearly two year old patch
      [ 400998 ] experimental support for extended slicing on lists
  somewhat spruced up and better tested than it was when I wrote it.
  Includes docs & tests. The whatsnew section needs expanding, and arrays should support extended slices -- later.
  Michael W. Hudson | 2002-06-11 | 1 | -2/+54
* Fix a possible segfault. Found by Neal Norwitz.
  Marc-André Lemburg | 2002-05-29 | 1 | -1/+1
* Fix for bug [ 561796 ] string.find causes lazy error
  Marc-André Lemburg | 2002-05-29 | 1 | -2/+2
* - A new type object, 'string', is added. This is a common base type for 'str' and 'unicode', and can be used instead of types.StringTypes, e.g. to test whether something is "a string": isinstance(x, string) is True for Unicode and 8-bit strings. This is an abstract base class and cannot be instantiated directly.
  Guido van Rossum | 2002-05-24 | 1 | -1/+3
* Patch 549187. Improve string formatting error message.
  Raymond Hettinger | 2002-05-21 | 1 | -2/+2