path: root/Lib/test/test_unicode.py
Commit message [author, date, files changed, lines -/+]
* Merged revisions 78758 via svnmerge from  [Ezio Melotti, 2010-08-02, 1 file, -1/+3]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r78758 | florent.xicluna | 2010-03-07 14:18:33 +0200 (Sun, 07 Mar 2010) | 4 lines
|
| Issue #7849: The utility ``check_warnings`` now verifies that the expected warnings are actually raised. A new utility, ``check_py3k_warnings``, deals with py3k warnings.
| ........
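The idea behind the stricter ``check_warnings`` can be illustrated with the standard `warnings` module alone. This is a minimal sketch, not the test-suite helper itself: `legacy_api` and `call_and_check_warning` are hypothetical names, and the code assumes a current Python 3 interpreter.

```python
import warnings


def legacy_api():
    """Hypothetical function that emits a DeprecationWarning."""
    warnings.warn("legacy_api() is deprecated", DeprecationWarning)
    return 42


def call_and_check_warning(func, category):
    """Call func and fail if no warning of the given category was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        result = func()
    if not any(issubclass(w.category, category) for w in caught):
        raise AssertionError("expected %s was not raised" % category.__name__)
    return result


print(call_and_check_warning(legacy_api, DeprecationWarning))
```

The point of the check is that a test which merely silences a warning would keep passing after the warning disappears; failing when nothing was recorded makes that regression visible.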
* Merged revisions 82980 via svnmerge from  [Stefan Krah, 2010-07-19, 1 file, -0/+1]
| svn+ssh://pythondev@svn.python.org/python/branches/release27-maint
|
| ........
| r82980 | stefan.krah | 2010-07-19 20:06:46 +0200 (Mon, 19 Jul 2010) | 3 lines
|
| Sub-issue of #9036: Fix incorrect use of Py_CHARMASK.
| ........
* Merged revisions 81758-81759 via svnmerge from  [Ezio Melotti, 2010-07-03, 1 file, -0/+158]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r81758 | ezio.melotti | 2010-06-05 20:51:07 +0300 (Sat, 05 Jun 2010) | 15 lines
|
| Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629.
|
| 1) #8271: when a byte sequence is invalid, only the start byte and all the valid continuation bytes are now replaced by U+FFFD, instead of replacing the number of bytes specified by the start byte. See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
| 2) 5- and 6-bytes-long UTF-8 sequences are now considered invalid (no changes in behavior);
| 3) Add code and tests to reject surrogates (U+D800-U+DFFF) as defined in RFC 3629, but leave it commented out since it's not backward compatible;
| 4) Change the error messages "unexpected code byte" to "invalid start byte" and "invalid data" to "invalid continuation byte";
| 5) Add an extensive set of tests in test_unicode;
| 6) Fix test_codeccallbacks because it was failing after this change.
| ........
| r81759 | ezio.melotti | 2010-06-05 22:21:32 +0300 (Sat, 05 Jun 2010) | 1 line
|
| Add a NEWS entry for r81758 and clarify a comment.
| ........
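Points 1, 2 and 4 of r81758 can be observed from Python code. The sketch below assumes a current CPython 3 interpreter, whose UTF-8 decoder follows the same RFC 3629 rules; `decode_reason` is a hypothetical helper introduced only for illustration.

```python
def decode_reason(data):
    """Return the UnicodeDecodeError reason for strict UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xC3 announces a 2-byte sequence, but "A" is not a continuation byte:
# only the start byte is replaced by U+FFFD and "A" survives.
truncated = b"\xc3A"
print(ascii(truncated.decode("utf-8", "replace")))
print(decode_reason(truncated))

# 0xF8 started a 5-byte sequence under RFC 2279; RFC 3629 rejects it outright.
print(decode_reason(b"\xf8\x88\x80\x80\x80"))
```

Replacing only the bytes that were actually consumed keeps the following valid characters intact, which is the behaviour the Unicode standard recommends.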
* Merged revisions 81825 via svnmerge from  [Benjamin Peterson, 2010-06-07, 1 file, -3/+3]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r81825 | benjamin.peterson | 2010-06-07 17:33:09 -0500 (Mon, 07 Jun 2010) | 1 line
|
| use unicode literals
| ........
* Merged revisions 81820 via svnmerge from  [Benjamin Peterson, 2010-06-07, 1 file, -0/+3]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r81820 | benjamin.peterson | 2010-06-07 17:23:23 -0500 (Mon, 07 Jun 2010) | 1 line
|
| correctly overflow when indexes are too large
| ........
* Merged revisions 79278,79280 via svnmerge from  [Victor Stinner, 2010-03-22, 1 file, -0/+9]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r79278 | victor.stinner | 2010-03-22 13:24:37 +0100 (Mon, 22 Mar 2010) | 2 lines
|
| Issue #1583863: A unicode subclass can now override the __str__ method
| ........
| r79280 | victor.stinner | 2010-03-22 13:36:28 +0100 (Mon, 22 Mar 2010) | 5 lines
|
| Fix the NEWS about my last commit: a unicode subclass can now override the __unicode__ method (and not the __str__ method).
|
| Simplify also the testcase.
| ........
* Merged revisions 78392 via svnmerge from  [Victor Stinner, 2010-02-23, 1 file, -0/+13]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r78392 | victor.stinner | 2010-02-24 00:16:07 +0100 (Wed, 24 Feb 2010) | 4 lines
|
| Issue #7649: Fix u'%c' % char for character in range 0x80..0xFF => raise a UnicodeDecodeError. Patch written by Ezio Melotti.
| ........
* Merged revisions 72848 via svnmerge from  [Eric Smith, 2009-05-23, 1 file, -0/+4]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r72848 | eric.smith | 2009-05-23 09:56:13 -0400 (Sat, 23 May 2009) | 1 line
|
| Issue 6089: str.format raises SystemError.
| ........
* Merged revisions 70368 via svnmerge from  [Eric Smith, 2009-03-14, 1 file, -51/+51]
| svn+ssh://pythondev@svn.python.org/python/trunk
|
| ........
| r70368 | eric.smith | 2009-03-14 10:37:38 -0400 (Sat, 14 Mar 2009) | 1 line
|
| Unicode format tests weren't actually testing unicode. This was probably due to the original backport from py3k.
| ........
* #3601: test_unicode.test_raiseMemError fails in UCS4  [Antoine Pitrou, 2008-09-05, 1 file, -1/+4]
| Reviewed by Benjamin Peterson on IRC.
* #3556: test_raiseMemError consumes an insane amount of memory  [Antoine Pitrou, 2008-08-17, 1 file, -8/+3]
|
* Correct a crash when two successive unicode allocations fail with a MemoryError:  [Amaury Forgeot d'Arc, 2008-07-31, 1 file, -0/+14]
| the freelist contained half-initialized objects with freed pointers.
|
| The comment
| /* XXX UNREF/NEWREF interface should be more symmetrical */
| was copied from tupleobject.c, and appears in some other places.
| I sign the petition.
* #2242: utf7 decoding crashes on bogus input on some Windows/MSVC versions  [Antoine Pitrou, 2008-07-25, 1 file, -0/+3]
|
* #1477: ur'\U0010FFFF' raised in narrow unicode builds.  [Amaury Forgeot d'Arc, 2008-03-23, 1 file, -2/+15]
| Corrected the raw-unicode-escape codec to use UTF-16 surrogates in this case, just like the unicode-escape codec.
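On a narrow (UCS-2) build a code point above U+FFFF has to be stored as two UTF-16 surrogates, which is what the fix above makes the raw-unicode-escape codec do. The arithmetic can be sketched as follows; `to_surrogate_pair` is a hypothetical helper, not a function from the codec.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits select the low surrogate
    return high, low


high, low = to_surrogate_pair(0x10FFFF)
print(hex(high), hex(low))  # 0xdbff 0xdfff
```

The result matches what the UTF-16 encoder produces for the same character, so `"\U0010FFFF".encode("utf-16-be")` yields the bytes `DB FF DF FF`.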
* Patch #2167 from calvin: Remove unused imports  [Christian Heimes, 2008-02-23, 1 file, -1/+1]
|
* Added code to correct combining str and unicode in ''.format(). Added test case.  [Eric Smith, 2008-02-18, 1 file, -0/+9]
* Backport of PEP 3101, Advanced String Formatting, from py3k.  [Eric Smith, 2008-02-17, 1 file, -0/+262]
| Highlights:
| - Adding PyObject_Format.
| - Adding string.Format class.
| - Adding __format__ for str, unicode, int, long, float, datetime.
| - Adding builtin format.
| - Adding ''.format and u''.format.
| - str/unicode fixups for formatters.
|
| The files in Objects/stringlib that implement PEP 3101 (stringdefs.h, unicodedefs.h, formatter.h, string_format.h) are identical in trunk and py3k. Any changes from here on should be made to trunk, and changes will propagate to py3k.
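The pieces listed in the highlights fit together as shown in this small sketch: the `format()` builtin and `str.format()` both dispatch to an object's `__format__` method. `Temperature` and its `"F"` format spec are hypothetical, chosen only to illustrate the protocol, and the code assumes Python 3 syntax.

```python
class Temperature:
    """Hypothetical type that defines its own format spec via __format__."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # The custom spec "F" selects Fahrenheit; any other spec is
        # forwarded to float formatting of the Celsius value.
        if spec == "F":
            return format(self.celsius * 9 / 5 + 32, ".1f") + " F"
        return format(self.celsius, spec or ".1f") + " C"


t = Temperature(21.5)
print(format(t, ""))                           # 21.5 C
print("{0:F}".format(t))                       # 70.7 F
print("{0:>6}|{1:.2f}".format("ab", 3.14159))  # "    ab|3.14"
```

Everything after the colon in a replacement field is handed to `__format__` unchanged, which is why a type can invent its own mini-language.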
* Fix failing unicode test caused by change to ast.c at r56441  [Kurt B. Kaiser, 2007-07-18, 1 file, -3/+3]
|
* Prevent these tests from running on Win64 since they don't apply there either  [Neal Norwitz, 2007-06-11, 1 file, -2/+2]
|
* Prevent expandtabs() on string and unicode objects from causing a segfault when  [Neal Norwitz, 2007-06-09, 1 file, -2/+7]
| a large width is passed on 32-bit platforms. Found by Google.
|
| It would be good for people to review this especially carefully and verify I don't have an off by one error and there is no other way to cause overflow.
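The behaviour this commit protects can be sketched from Python code: an oversized tab width must end in a Python exception, never a crash. The example assumes a current CPython 3 interpreter; `safe_expand` is a hypothetical wrapper.

```python
import sys


def safe_expand(text, tabsize):
    """Expand tabs, returning None if the result would be too large to build."""
    try:
        return text.expandtabs(tabsize)
    except (OverflowError, MemoryError):
        return None


# Normal use: each tab advances to the next multiple of the tab size.
print(repr(safe_expand("a\tb", 4)))

# Two tabs at the largest possible width cannot fit in any string.
print(safe_expand("\t\t", sys.maxsize))
```

The interesting case is the second call: the required length exceeds the maximum string size, so the length computation itself has to detect the overflow before any buffer is allocated.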
* Standardize on test.test_support.run_unittest() (as opposed to a mix of run_unittest() and run_suite()).  [Collin Winter, 2007-04-25, 1 file, -1/+1]
| Also, add functionality to run_unittest() that admits usage of unittest.TestLoader.loadTestsFromModule().
* Patch #1541585: fix buffer overrun when performing repr() on  [Neal Norwitz, 2006-08-21, 1 file, -0/+4]
| a unicode string in a build with wide unicode (UCS-4) support.
|
| This code could be improved, so add an XXX comment.
* Whitespace normalization.  [Tim Peters, 2006-05-03, 1 file, -1/+1]
|
* Bug #1473625: stop cPickle making float dumps locale dependent in protocol 0.  [Georg Brandl, 2006-04-30, 1 file, -13/+4]
| On the way, add a decorator to test_support to facilitate running single test functions in different locales with automatic cleanup.
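A decorator of the kind described above might look like the following. This is a minimal sketch under stated assumptions, not the actual test_support helper: `with_locale` is a hypothetical name, and it takes a `locale` category constant followed by candidate locale names.

```python
import functools
import locale


def with_locale(category, *candidates):
    """Run the decorated function under the first available candidate locale."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            saved = locale.setlocale(category)  # query the current setting
            for name in candidates:
                try:
                    locale.setlocale(category, name)
                    break
                except locale.Error:
                    continue  # locale not installed, try the next one
            try:
                return func(*args, **kwargs)
            finally:
                locale.setlocale(category, saved)  # automatic cleanup
        return wrapper
    return decorator


@with_locale(locale.LC_NUMERIC, "C")
def current_numeric_locale():
    return locale.setlocale(locale.LC_NUMERIC)


print(current_numeric_locale())
```

Restoring the saved locale in a `finally` clause is what keeps a failing test from leaking its locale into the tests that run after it.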
* Fixed bug #1459029 - unicode reprs were double-escaped.  [Anthony Baxter, 2006-03-30, 1 file, -0/+16]
|
* Checkin the test of patch #1400181.  [Georg Brandl, 2006-01-20, 1 file, -0/+14]
|
* Bug #1379994: Fix *unicode_escape codecs to encode r'\' as r'\\'  [Hye-Shik Chang, 2005-12-17, 1 file, -10/+14]
| just like string codecs.
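The escaping rule can be checked with a round trip. The sketch below assumes a current Python 3 interpreter and covers only the `unicode_escape` codec, where a single backslash in the text is written as two backslashes in the encoded bytes.

```python
text = "a\\b"                           # three characters: a, backslash, b
encoded = text.encode("unicode_escape")

print(len(text), len(encoded))          # 3 4
print(encoded == b"a\\\\b")             # True: the backslash was doubled
print(encoded.decode("unicode_escape") == text)
```

Without the doubling, the encoded form of a literal backslash followed by `n` would decode as a newline, so the codec could not round-trip arbitrary text.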
* Move registration of the codec search function to the module scope  [Neal Norwitz, 2005-11-24, 1 file, -17/+18]
| so it is only executed once. Otherwise the same search function is repeatedly added to the codec search path when regrtest is run with -R and leaks are reported.
* Change the %s format specifier for str objects so that it returns a  [Neil Schemenauer, 2005-08-12, 1 file, -0/+4]
| unicode instance if the argument is not an instance of basestring and calling __str__ on the argument returns a unicode instance.
* Make subclasses of int, long, complex, float, and unicode perform type  [Brett Cannon, 2005-04-26, 1 file, -1/+63]
| conversion using the proper magic slot (e.g., __int__()). Also move conversion code out of PyNumber_*() functions in the C API into the nb_* function.
|
| Applied patch #1109424. Thanks Walter Dörwald.
* Move test_bug1001011() to string_tests.MixinStrUnicodeTest so that  [Walter Dörwald, 2004-08-26, 1 file, -1/+2]
| it can be used for str and unicode. Drop the test for "".join([s]) is s because this is an implementation detail (and doesn't work for unicode)
* SF #989185: Drop unicode.iswide() and unicode.width() and add  [Hye-Shik Chang, 2004-08-04, 1 file, -2/+1]
| unicodedata.east_asian_width(). You can still implement your own simple width() function using it like this:
|
|     import unicodedata
|
|     def width(u):
|         w = 0
|         for c in unicodedata.normalize('NFC', u):
|             cwidth = unicodedata.east_asian_width(c)
|             if cwidth in ('W', 'F'):
|                 w += 2
|             else:
|                 w += 1
|         return w
* Let u'%s' % obj try obj.__unicode__() first and fallback to obj.__str__().  [Marc-André Lemburg, 2004-07-23, 1 file, -0/+8]
|
* Reuse width/iswide tests from strings_test. (Suggested by Walter Dörwald)  [Hye-Shik Chang, 2004-06-04, 1 file, -21/+2]
|
* Fix typo.  [Hye-Shik Chang, 2004-06-04, 1 file, -1/+1]
|
* - SF #962502: Add two more methods for unicode type; width() and  [Hye-Shik Chang, 2004-06-02, 1 file, -0/+20]
|   iswide() for east asian width manipulation. (Inspired by David Goodger, Reviewed by Martin v. Loewis)
| - Move _PyUnicode_TypeRecord.flags to the end of the struct so that no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
* Fix reallocation bug in unicode.translate(): The code was comparing  [Walter Dörwald, 2004-02-05, 1 file, -0/+1]
| characters instead of character pointers to determine space requirements.
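Reallocation matters in `translate()` because a mapping may replace one character with several, so the output can outgrow the input. A short illustration, assuming Python 3 where the method lives on `str`:

```python
# One-to-many mapping: "a" grows to three characters, "c" is deleted.
table = {ord("a"): "xyz", ord("c"): None}

result = "abcabc".translate(table)
print(result)  # xyzbxyzb
```

Here a six-character input produces eight characters, which is exactly the situation in which the output buffer has to be resized.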
* Fix for SF bug [ 817156 ] invalid \U escape gives 0=length unistr.  [Jeremy Hylton, 2003-10-06, 1 file, -0/+7]
|
* Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3.  [Martin v. Löwis, 2003-08-05, 1 file, -0/+4]
|
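Assuming the change above refers to the `idna` codec (the log entry does not name it), the trailing-dot behaviour can be illustrated like this on a current Python 3 interpreter:

```python
# A fully qualified name ends with a dot for the root label; the codec
# keeps that dot instead of rejecting the empty label.
print("python.org.".encode("idna"))
print(b"python.org.".decode("idna"))

# Non-ASCII labels are converted to Punycode, and the dot still survives.
print("pyth\xf6n.org.".encode("idna"))
```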
* Consider \U-escapes in raw-unicode-escape. Fixes #444514.  [Martin v. Löwis, 2003-05-18, 1 file, -0/+7]
|
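A short illustration of what the raw-unicode-escape codec does with `\U` escapes, assuming a current Python 3 interpreter: only `\uXXXX` and `\UXXXXXXXX` are interpreted, while every other backslash sequence is left alone.

```python
# An eight-digit \U escape decodes to a single supplementary character.
decoded = b"\\U00010000".decode("raw_unicode_escape")
print(len(decoded), hex(ord(decoded)))

# Encoding goes the other way for characters above U+FFFF.
print("\U00010000".encode("raw_unicode_escape"))

# Other escapes such as \n stay as two literal characters.
print(len(b"\\n".decode("raw_unicode_escape")))
```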
* Combine the functionality of test_support.run_unittest()  [Walter Dörwald, 2003-05-01, 1 file, -3/+1]
| and test_support.run_classtests() into run_unittest() and use it wherever possible.
|
| Also don't use "from test.test_support import ...", but "from test import test_support" in a few spots.
|
| From SF patch #662807.
* Change formatchar(), so that u"%c" % 0xffffffff now raises  [Walter Dörwald, 2003-04-02, 1 file, -1/+1]
| an OverflowError instead of a TypeError to be consistent with "%c" % 256. See SF patch #710127.
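The `%c` behaviour described above can still be observed. The sketch assumes a current Python 3 interpreter, where `str` plays the role of the old `unicode` type; `format_char` is a hypothetical helper.

```python
def format_char(value):
    """Format value with %c, returning the exception name on failure."""
    try:
        return "%c" % value
    except (OverflowError, TypeError) as exc:
        return type(exc).__name__


print(format_char(65))          # A
print(format_char("a"))         # a
print(format_char(0xFFFFFFFF))  # OverflowError
```

An out-of-range integer is a value problem rather than a type problem, which is why `OverflowError` is the consistent choice.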
* Remove duplicate test.  [Walter Dörwald, 2003-03-31, 1 file, -2/+2]
|
* Fix PyString_Format() so that '%c' % u'a' returns u'a'  [Walter Dörwald, 2003-03-31, 1 file, -0/+3]
| instead of raising a TypeError. (From SF patch #710127)
|
| Add tests to verify this is fixed.
|
| Add various tests for '%c' % int.
* Port all string tests to PyUnit and share as many tests  [Walter Dörwald, 2003-02-21, 1 file, -492/+132]
| between str, unicode, UserString and the string module as possible. This increases code coverage in stringobject.c from 83% to 86% and should help keep the string classes in sync in the future. From SF patch #662807
* Add a few tests to test_count() to increase coverage in  [Walter Dörwald, 2003-02-10, 1 file, -0/+6]
| Objects/unicodeobject.c::unicode_count().
* Fix copy&paste error: call title instead of count  [Walter Dörwald, 2003-02-10, 1 file, -1/+1]
|
* Port test_unicode.py to PyUnit and add tests for error  [Walter Dörwald, 2003-01-19, 1 file, -851/+1039]
| cases and a few methods. This increases code coverage in Objects/unicodeobject.c from 81% to 85%. (From SF patch #662807)
* Add a test that exercises the error handling part of  [Walter Dörwald, 2003-01-08, 1 file, -0/+6]
| PyUnicode_EncodeDecimal().
* Patch for bug #659709: bogus computation of float length  [Marc-André Lemburg, 2002-12-29, 1 file, -0/+25]
| Python 2.2.x backport candidate. (This bug has been around since Python 1.6.)