| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Corrected the raw-unicode-escape codec to use UTF-16 surrogates in
this case, just like the unicode-escape codec.
|
| |
|
|
|
|
| |
case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Highlights:
- Adding PyObject_Format.
- Adding string.Format class.
- Adding __format__ for str, unicode, int, long, float, datetime.
- Adding builtin format.
- Adding ''.format and u''.format.
- str/unicode fixups for formatters.
The files in Objects/stringlib that implement PEP 3101 (stringdefs.h,
unicodedefs.h, formatter.h, string_format.h) are identical in trunk
and py3k. Any changes from here on should be made to trunk, and
changes will propagate to py3k.
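A minimal sketch of the PEP 3101 pieces listed above, written for current Python 3 rather than the 2.6 trunk this commit targeted. The `Celsius` class is a hypothetical example type, not part of the commit; it only shows how the builtin `format()`, `str.format()` and a user-defined `__format__` fit together.

```python
class Celsius:
    """Hypothetical value type used only to illustrate __format__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # Delegate to float formatting; fall back to one decimal place
        # when the caller supplies an empty format spec.
        return format(self.degrees, spec or ".1f") + " C"


# Builtin format() calls type(value).__format__(value, spec).
pi_text = format(3.14159, ".2f")

# str.format() parses the replacement fields and applies each spec.
joined = "{0} and {1:>5}".format("a", 42)

# A user-defined __format__ receives the spec after the colon.
temperature = "{:.2f}".format(Celsius(21.456))

print(pi_text)
print(joined)
print(temperature)
```

The same format spec mini-language is shared by `format()` and `str.format()`, which is why the commit could keep a single implementation in Objects/stringlib.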
|
| |
|
| |
|
|
|
|
|
|
|
| |
a large width is passed on 32-bit platforms. Found by Google.
It would be good for people to review this especially carefully and verify
I don't have an off by one error and there is no other way to cause overflow.
|
|
|
|
| |
run_unittest() and run_suite()). Also, add functionality to run_unittest() that admits usage of unittest.TestLoader.loadTestsFromModule().
|
|
|
|
|
|
| |
a unicode string in a build with wide unicode (UCS-4) support.
This code could be improved, so add an XXX comment.
|
| |
|
|
|
|
|
| |
On the way, add a decorator to test_support to facilitate running single
test functions in different locales with automatic cleanup.
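A rough sketch of what such a locale decorator can look like. This is not the helper the commit added to test_support; the name `with_locale` and its signature are assumptions made for illustration. It tries each candidate locale in turn, runs the wrapped function, and always restores the previous setting.

```python
import functools
import locale


def with_locale(category, *candidates):
    """Hypothetical decorator: run a function under the first usable locale."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Calling setlocale() without a locale queries the current setting.
            saved = locale.setlocale(category)
            try:
                for name in candidates:
                    try:
                        locale.setlocale(category, name)
                        break
                    except locale.Error:
                        # Locale not installed on this machine; try the next.
                        continue
                return func(*args, **kwargs)
            finally:
                # Automatic cleanup: restore whatever was active before.
                locale.setlocale(category, saved)
        return wrapper
    return decorator


@with_locale(locale.LC_NUMERIC, "de_DE.UTF-8", "C")
def decimal_point():
    return locale.localeconv()["decimal_point"]


before = locale.setlocale(locale.LC_NUMERIC)
point = decimal_point()
after = locale.setlocale(locale.LC_NUMERIC)
print(point, before == after)
```

If none of the candidate locales is available, this sketch simply runs the function under the current locale.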
|
| |
|
| |
|
|
|
|
| |
just like string codecs.
|
|
|
|
|
|
| |
so it is only executed once. Otherwise the same search function is
repeatedly added to the codec search path when regrtest is run with -R
and leaks are reported.
|
|
|
|
|
| |
unicode instance if the argument is not an instance of basestring and
calling __str__ on the argument returns a unicode instance.
|
|
|
|
|
|
|
| |
conversion using the proper magic slot (e.g., __int__()). Also move conversion
code out of PyNumber_*() functions in the C API into the nb_* function.
Applied patch #1109424. Thanks Walter Dörwald.
|
|
|
|
|
|
| |
it can be used for str and unicode. Drop the test for
"".join([s]) is s
because this is an implementation detail (and doesn't work for unicode)
|
|
|
|
|
|
|
|
|
|
|
|
| |
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
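For reference, a self-contained Python 3 version of the same idea with a few sample inputs. The sample strings are illustrative choices, not taken from the commit.

```python
import unicodedata


def width(text):
    """Count display columns: wide and fullwidth characters take two."""
    total = 0
    for ch in unicodedata.normalize("NFC", text):
        # 'W' (wide) and 'F' (fullwidth) occupy two columns in East Asian
        # contexts; every other category is counted as one column here.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            total += 2
        else:
            total += 1
    return total


ascii_width = width("abc")
cjk_width = width("\u65e5\u672c\u8a9e")   # three CJK ideographs
fullwidth_a = width("\uff21")             # FULLWIDTH LATIN CAPITAL LETTER A
print(ascii_width, cjk_width, fullwidth_a)
```

Like the original, this counts combining marks that survive NFC normalization as one column each, so it is only an approximation of terminal width.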
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
|
|
|
|
| |
characters instead of character pointers to determine space requirements.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
and test_support.run_classtests() into run_unittest()
and use it wherever possible.
Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.
From SF patch #662807.
|
|
|
|
|
| |
an OverflowError instead of a TypeError to be consistent
with "%c" % 256. See SF patch #710127.
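As a hedged illustration of the behaviour described here, the sketch below uses current Python 3, where `"%c"` on a `str` accepts any valid code point and raises `OverflowError` beyond `sys.maxunicode`. The commit itself targeted Python 2, so the exact cases it touched may differ; `format_char` is a hypothetical helper.

```python
import sys


def format_char(code_point):
    """Hypothetical helper: format an integer with %c, reporting overflow."""
    try:
        return "%c" % code_point
    except OverflowError as exc:
        return "OverflowError: %s" % exc


ok = format_char(0x20AC)                   # EURO SIGN, a valid code point
too_big = format_char(sys.maxunicode + 1)  # one past the last code point
print(ok)
print(too_big)
```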
|
| |
|
|
|
|
|
|
|
|
| |
instead of raising a TypeError. (From SF patch #710127)
Add tests to verify this is fixed.
Add various tests for '%c' % int.
|
|
|
|
|
|
|
| |
between str, unicode, UserString and the string module
as possible. This increases code coverage in stringobject.c
from 83% to 86% and should help keep the string classes
in sync in the future. From SF patch #662807
|
|
|
|
| |
Objects/unicodeobject.c::unicode_count().
|
| |
|
|
|
|
|
|
| |
cases and a few methods. This increases code coverage
in Objects/unicodeobject.c from 81% to 85%.
(From SF patch #662807)
|
|
|
|
| |
PyUnicode_EncodeDecimal().
|
|
|
|
|
| |
Python 2.2.x backport candidate. (This bug has been around since
Python 1.6.)
|
| |
|
|
|
|
| |
Python 2.2.3 candidate.
|
| |
|
|
|
|
| |
2.2.2 candidate.
|
|
|
|
|
|
|
|
|
|
| |
Unicode strings (with arbitrary length) are allowed
as entries in the unicode.translate mapping.
Add a test case for multicharacter replacements.
(Multicharacter replacements were enabled by the
PEP 293 patch)
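A short sketch of multicharacter replacements in a translate mapping, shown with Python 3's `str.translate`, which keeps the same mapping semantics (ordinal keys; values may be a string of any length, an ordinal, or `None` to delete). The table contents are made up for illustration.

```python
# Keys are code point ordinals; values may be strings of arbitrary length,
# a single ordinal, or None (which deletes the character).
table = {
    ord("a"): "xyz",     # one character replaced by three
    ord("b"): None,      # character removed
    ord("c"): ord("C"),  # one-to-one replacement by ordinal
    ord("d"): "",        # empty string also removes the character
}

result = "abcdabcd".translate(table)
print(result)
```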
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
wrong thing for a unicode subclass when there were zero string
replacements. The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another. The code would actually write outside allocated memory if
the replacement string was longer than the search string.
(I wonder how many more of these are lurking? The unicode code base
is full of wonders.)
Bugfix candidate; this same bug is present in 2.2.1.
|
|
|
|
|
| |
the string/unicode method .replace() with a zero-length first argument.
Inyeol contributed tests for this too.
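To make the zero-length case concrete, here is how `.replace()` with an empty first argument behaves in current Python 3. This sketch shows today's semantics and is not a reproduction of the tests Inyeol contributed.

```python
# An empty search string matches before every character and at the end.
everywhere = "abc".replace("", "-")

# The optional count limits how many of those matches are replaced.
limited = "abc".replace("", "-", 2)

# An empty subject string still contains one empty match.
empty_subject = "".replace("", "x")

print(everywhere, limited, empty_subject)
```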
|
|
|
|
|
|
| |
These were reported and fixed by Inyeol Lee in SF bug 595350. The
endswith() bug was already fixed in 2.3, but this adds some more test
cases.
|
|
|
|
|
|
|
| |
u'%c' will now raise a ValueError in case the argument is an
integer outside the valid range of Unicode code point ordinals.
Closes SF bug #593581.
|
|
|
|
| |
it does for 8-bit strings.
|
| |
|
|
|
|
| |
Py_UNICODE.
|
|
|
|
| |
string of longer than 1 character.
|