| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
| |
Fix a nasty endcase reported by Armin Rigo in SF bug 618623:
'%2147483647d' % -123 segfaults. This was because an integer overflow
in a comparison caused the string resize to be skipped. After fixing
the overflow, this could call _PyString_Resize() with a negative size,
so I (1) test for that and raise MemoryError instead; (2) also added a
test for negative newsize to _PyString_Resize(), raising SystemError
as for all bad arguments.
An identical bug existed in unicodeobject.c, of course.
|
| |
|
|
|
|
|
|
|
|
|
| |
The string formatting code has a test to switch to Unicode when %s
sees a Unicode argument. Unfortunately this test was also executed
for %r, because %s and %r share almost all of their code. This meant
that, if u is a unicode object while repr(u) is an 8-bit string
containing ASCII characters, '%r' % u is a *unicode* string containing
only ASCII characters!
Fixed by executing the test only for %s.
|
| |
|
|
|
|
|
|
| |
unicodeobject.c 2.169
stringobject.c 2.189
Fix warnings on 64-bit platforms about casts from pointers to ints.
Two of these were real bugs.
|
| |
|
|
|
|
| |
These were reported and fixed by Inyeol Lee in SF bug 595350. The
endswith() bug is already fixed in 2.3; I'll fix the others in
2.3 next.
|
| |
|
|
| |
will trigger splitting on any whitespace.
|
| | |
|
| |
|
|
|
|
|
| |
Add #ifdef PY_USING_UNICODE sections, so that
stringobject.c compiles again with --disable-unicode.
Fixes SF bug http://www.python.org/sf/554912
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Repair widespread misuse of _PyString_Resize. Since it's clear people
don't understand how this function works, also beefed up the docs. The
most common usage error is of this form (often spread out across gotos):
if (_PyString_Resize(&s, n) < 0) {
Py_DECREF(s);
s = NULL;
goto outtahere;
}
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
if (_PyString_Resize(&s, n) < 0)
goto outtahere;
Bugfix candidate.
Original patch(es):
python/dist/src/Objects/fileobject.c:2.161
python/dist/src/Objects/stringobject.c:2.161
python/dist/src/Objects/unicodeobject.c:2.147
|
| |
|
|
|
|
|
|
|
|
| |
Apply patch diff.txt from SF feature request
http://www.python.org/sf/444708
This adds the optional argument for str.strip
to unicode.strip too and makes it possible
to call str.strip with a unicode argument
and unicode.strip with a str argument.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Misc/NEWS 1.387->1.388
Lib/test/string_tests.py 1.10->1.11, 1.12->1.14,
Lib/test/test_unicode.py 1.50->1.51, 1.53->1.54, 1.55->1.56
Lib/test/test_string.py 1.15->1.16
Lib/string.py 1.61->1.63
Lib/test/test_userstring.py 1.5->1.6, 1.11, 1.12
Objects/stringobject.c 2.156->2.159
Objects/unicodeobject.c 2.137->2.139
Doc/lib/libstdtypes.tec 1.87->1.88
Add a method zfill to str, unicode and UserString
and change Lib/string.py accordingly
(see SF patch http://www.python.org/sf/536241)
This also adds Guido's fix to test_userstring.py
and the subinstance checks in test_string.py
and test_unicode.py.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Partially implement SF feature request 444708.
Add optional arg to string methods strip(), lstrip(), rstrip().
The optional arg specifies characters to delete.
Also for UserString.
Still to do:
- Misc/NEWS
- LaTeX docs (I did the docstrings though)
- Unicode methods, and Unicode support in the string methods.
Original patches were:
python/dist/src/Objects/stringobject.c:2.156
|
| |
|
|
|
|
| |
PyString_FromString():
Since the length of the string is already being stored in size,
changed the strcpy() to a memcpy() for a small speed improvement.
|
| |
|
|
| |
check it. Added an assert() to that effect.
|
| |
|
|
|
|
|
|
|
|
|
| |
Add a missing DECREF in an obscure corner. If the str() or repr() of
an object passed to a string interpolation -- e.g. "%s" % obj --
returns a non-string, the returned object was leaked.
Repair an indentation glitch.
Replace a bunch of PyString_AsString() calls (and their ilk) with
macros.
|
| | |
|
| |
|
|
| |
instead of PyOS_snprintf; add some relevant comments and asserts.
|
| | |
|
| |
|
|
|
| |
Also changed <>-style #includes to ""-style in some places where the
former didn't make sense.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
response to a message by Laura Creighton on c.l.py. E.g.
>>> 0+''
TypeError: unsupported operand types for +: 'int' and 'str'
(previously this did not mention the operand types)
>>> ''+0
TypeError: cannot concatenate 'str' and 'int' objects
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
object.c, PyObject_Str: Don't try to optimize anything except exact
string objects here; in particular, let str subclasses go thru tp_str,
same as non-str objects. This allows overrides of tp_str to take
effect.
stringobject.c:
+ string_print (str's tp_print): If the argument isn't an exact string
object, get one from PyObject_Str.
+ string_str (str's tp_str): Make a genuine-string copy of the object if
it's of a proper str subclass type. str() applied to a str subclass
that doesn't override __str__ ends up here.
test_descr.py: New str_of_str_subclass() test.
|
| |
|
|
| |
Reported by Neal Norwitz.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
many types were subclassable but had a xxx_dealloc function that
called PyObject_DEL(self) directly instead of deferring to
self->ob_type->tp_free(self). It is permissible to set tp_free in the
type object directly to _PyObject_Del, for non-GC types, or to
_PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster,
so I'm fearing that our pystone rating is going down again. I'm not
sure if doing something like
void xxx_dealloc(PyObject *self)
{
if (PyXxxCheckExact(self))
PyObject_DEL(self);
else
self->ob_type->tp_free(self);
}
is any faster than always calling the else branch, so I haven't
attempted that -- however those types whose own dealloc is fancier
(int, float, unicode) do use this pattern.
|
| |
|
|
|
|
|
| |
Unknown whether this fixes it.
- stringobject.c, PyString_FromFormatV: don't assume that va_list is of
a type that can be copied via an initializer.
- errors.c, PyErr_Format: add a va_end() to balance the va_start().
|
| | |
|
| |
|
|
|
| |
arguments are subclasses of str, as long as they don't override rich
comparison.
|
| |
|
|
|
|
| |
with the same value instead. This ensures that a string (or string
subclass) object's ob_sinterned pointer is always a str (or NULL), and
that the dict of interned strings only has strs as keys.
|
| |
|
|
|
|
|
|
| |
+ These were leaving the hash fields at 0, which all string and unicode
routines believe is a legitimate hash code. As a result, hash() applied
to str and unicode subclass instances always returned 0, which in turn
confused dict operations, etc.
+ Changed local names "new"; no point to antagonizing C++ compilers.
|
| |
|
|
|
|
|
|
|
| |
subclasses, all "the usual" ones (slicing etc), plus replace, translate,
ljust, rjust, center and strip. I don't know how to be sure they've all
been caught.
Question: Should we complain if someone tries to intern an instance of
a string subclass? I hate to slow any code on those paths.
|
| |
|
|
|
| |
Repaired str(i) to return a genuine string when i is an instance of a str
subclass. New PyString_CheckExact() macro.
|
| |
|
|
| |
xxx_subtype_new() functions are OK, but I goofed up in this one. :-( )
|
| | |
|
| |
|
|
|
|
|
|
|
| |
PyString_FromFormatV(): In the final resize at the end, we can use
PyString_AS_STRING() since we know the object is a string and can
avoid the typechecking.
PyString_FromFormat(): GS sez: "For safety/propriety, you should call
va_end() on the vargs variable."
|
| |
|
|
|
|
|
| |
at least in the first two characters. %p is ill-defined, and people will
forever commit bad tests otherwise ("bad" in the sense that they fall
over (at least on Windows) for lack of a leading '0x'; 5 of the 7 tests
in test_repr.py failed on Windows for that reason this time around).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PyErr_Format() these new C API methods can be used instead of
sprintf()'s into hardcoded char* buffers. This allows us to fix
many situation where long package, module, or class names get
truncated in reprs.
PyString_FromFormat() is the varargs variety.
PyString_FromFormatV() is the va_list variety
Original PyErr_Format() code was modified to allow %p and %ld
expansions.
Many reprs were converted to this, checkins coming soo. Not
changed: complex_repr(), float_repr(), float_print(), float_str(),
int_repr(). There may be other candidates not yet converted.
Closes patch #454743.
|
| |
|
|
|
|
|
|
| |
- Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
- check for Py_USING_UNICODE in all places that use Unicode functions
- disables unicode literals, and the builtin functions
- add the types.StringTypes list
- remove Unicode literals from most tests.
|
| | |
|
| | |
|
| |
|
|
|
|
|
| |
And remove all the extern decls in the middle of .c files.
Apparently, it was excluded from the header file because it is
intended for internal use by the interpreter. It's still intended for
internal use and documented as such in the header file.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Gave Python linear-time repr() implementations for dicts, lists, strings.
This means, e.g., that repr(range(50000)) is no longer 50x slower than
pprint.pprint() in 2.2 <wink>.
I don't consider this a bugfix candidate, as it's a performance boost.
Added _PyString_Join() to the internal string API. If we want that in the
public API, fine, but then it requires runtime error checks instead of
asserts.
|
| | |
|
| |
|
|
| |
Use new _PyString_Eq in lookdict_string.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding which is normally
set to ASCII rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As demonstration of the new feature, the patch includes a few new
codecs which allow string to string encoding and decoding (rot13,
hex, zip, uu, base64).
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
|
| |
|
|
|
|
| |
out of synch than I realized, and I managed to break replace's "count"
argument when it was 0. All is well again. Maybe.
Bugfix candidate.
|
| |
|
|
|
|
| |
mymemXXX stuff, and they were already out of synch. Fix the remaining
bugs in both and get them back in synch.
Bugfix release candidate.
|
| |
|
|
| |
Thanks to Mark Favas.
|
| |
|
|
| |
restore correct semantics.
|
| | |
|
| | |
|
| |
|
|
|
| |
interned string created by "string"[i]. Since they're immortal anyway,
this was hard to notice, but it was still wrong <wink>.
|