| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
just like string codecs.
|
|
|
|
|
|
| |
so it is only executed once. Otherwise the same search function is
repeated added to the codec search path when regrtest is run with -R
and leaks are reported.
|
|
|
|
|
| |
unicode instance if the argument is not an instance of basestring and
calling __str__ on the argument returns a unicode instance.
|
|
|
|
|
|
|
| |
conversion using the proper magic slot (e.g., __int__()). Also move conversion
code out of PyNumber_*() functions in the C API into the nb_* function.
Applied patch #1109424. Thanks Walter Doewald.
|
|
|
|
|
|
| |
it can be used for str and unicode. Drop the test for
"".join([s]) is s
because this is an implementation detail (and doesn't work for unicode)
|
|
|
|
|
|
|
|
|
|
|
|
| |
unicodedata.east_asian_width(). You can still implement your own
simple width() function using it like this:
def width(u):
w = 0
for c in unicodedata.normalize('NFC', u):
cwidth = unicodedata.east_asian_width(c)
if cwidth in ('W', 'F'): w += 2
else: w += 1
return w
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
iswide() for east asian width manipulation. (Inspired by David
Goodger, Reviewed by Martin v. Loewis)
- Move _PyUnicode_TypeRecord.flags to the end of the struct so that
no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
|
|
|
|
| |
characters instead of character pointers to determine space requirements.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
and test_support.run_classtests() into run_unittest()
and use it wherever possible.
Also don't use "from test.test_support import ...", but
"from test import test_support" in a few spots.
From SF patch #662807.
|
|
|
|
|
| |
an OverflowError instead of a TypeError to be consistent
with "%c" % 256. See SF patch #710127.
|
| |
|
|
|
|
|
|
|
|
| |
instead of raising a TypeError. (From SF patch #710127)
Add tests to verify this is fixed.
Add various tests for '%c' % int.
|
|
|
|
|
|
|
| |
between str, unicode, UserString and the string module
as possible. This increases code coverage in stringobject.c
from 83% to 86% and should help keep the string classes
in sync in the future. From SF patch #662807
|
|
|
|
| |
Object/unicodeobject.c::unicode_count().
|
| |
|
|
|
|
|
|
| |
cases and a few methods. This increases code coverage
in Objects/unicodeobject.c from 81% to 85%.
(From SF patch #662807)
|
|
|
|
| |
PyUnicode_EncodeDecimal().
|
|
|
|
|
| |
Python 2.2.x backport candidate. (This bug has been around since
Python 1.6.)
|
| |
|
|
|
|
| |
Python 2.2.3 candidate.
|
| |
|
|
|
|
| |
2.2.2 candidate.
|
|
|
|
|
|
|
|
|
|
| |
Unicode strings (with arbitrary length) are allowed
as entries in the unicode.translate mapping.
Add a test case for multicharacter replacements.
(Multicharacter replacements were enabled by the
PEP 293 patch)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
wrong thing for a unicode subclass when there were zero string
replacements. The example given in the SF bug report was only one way
to trigger this; replacing a string of length >= 2 that's not found is
another. The code would actually write outside allocated memory if
replacement string was longer than the search string.
(I wonder how many more of these are lurking? The unicode code base
is full of wonders.)
Bugfix candidate; this same bug is present in 2.2.1.
|
|
|
|
|
| |
the string/unicode method .replace() with a zero-lengt first argument.
Inyeol contributed tests for this too.
|
|
|
|
|
|
| |
These were reported and fixed by Inyeol Lee in SF bug 595350. The
endswith() bug was already fixed in 2.3, but this adds some more test
cases.
|
|
|
|
|
|
|
| |
u'%c' will now raise a ValueError in case the argument is an
integer outside the valid range of Unicode code point ordinals.
Closes SF bug #593581.
|
|
|
|
| |
it does for 8-bit strings.
|
| |
|
|
|
|
| |
Py_UNICODE.
|
|
|
|
| |
string of longer than 1 character.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
imports e.g. test_support must do so using an absolute package name
such as "import test.test_support" or "from test import test_support".
This also updates the README in Lib/test, and gets rid of the
duplicate data dirctory in Lib/test/data (replaced by
Lib/email/test/data).
Now Tim and Jack can have at it. :)
|
| |
|
|
|
|
|
|
|
|
|
| |
http://www.python.org/sf/444708
This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
|
|
|
|
|
|
|
|
|
| |
If a str or unicode method returns the original object,
make sure that for str and unicode subclasses the original
will not be returned.
This should prevent SF bug http://www.python.org/sf/460020
from reappearing.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Add a method zfill to str, unicode and UserString and change
Lib/string.py accordingly.
This activates the zfill version in unicodeobject.c that was
commented out and implements the same in stringobject.c. It also
adds the test for unicode support in Lib/string.py back in and
uses repr() instead() of str() (as it was before Lib/string.py 1.62)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
|
| |
|
|
|
|
| |
is "ignore". Fixes #529104.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix for the UTF-8 decoder: it will now accept isolated surrogates
(previously it raised an exception which causes round-trips to
fail).
Added new tests for UTF-8 round-trip safety (we rely on UTF-8 for
marshalling Unicode objects, so we better make sure it works for
all Unicode code points, including isolated surrogates).
Bumped the PYC magic in a non-standard way -- please review. This
was needed because the old PYC format used illegal UTF-8 sequences
for isolated high surrogates which now raise an exception.
|