Commit message

... as usual with slicing (both with str and unicode strings). This
fixes issue 1259.
For str, only stringobject.c was modified. But for unicode, I would
have needed to repeat a lot of code in the four functions, so I created
a new function that does part of the job for them (and placed it in
find.h, following a suggestion of Barry's).
Also added tests for this behaviour.
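
The slicing-style index handling in question, as a quick sketch:

    s = "hello world"

    # start/end behave like slice bounds: negative values count from
    # the end, and out-of-range values are clamped rather than raising
    assert s.find("o", -5) == 7        # searches only s[-5:]
    assert s.find("o", 0, 100) == 4    # end is clamped to len(s)
    assert s.find("x", 2, 1) == -1     # empty range, no match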

... also hex escapes) -- this was reaching beyond the end of the input
string buffer, even though it is not supposed to be \0-terminated.
This has no visible effect but is clearly the correct thing to do.
(In 3.0 it had a visible effect after removing ob_sstate from PyString.)

... PyObject_Print().
Closes issue #1164.

...
- Specialcase extended slices that amount to a shallow copy the same way
  as is done for simple slices, in the tuple, string and unicode cases.
- Specialcase step-1 extended slices to optimize the common case for all
  involved types.
- For lists, allow extended slice assignment of differing lengths as long
  as the step is 1. (Previously, 'l[:2:1] = []' failed even though
  'l[:2] = []' and 'l[:2:None] = []' did not.)
- Implement extended slicing for buffer, array, structseq, mmap and
  UserString.UserString.
- Implement slice-object support (but not non-step-1 slice assignment) for
  UserString.MutableString.
- Add tests for all new functionality.
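
A quick behavioural sketch of the cases above (the identity result for
a full slice of an immutable object is a CPython implementation detail,
not a language guarantee):

    t = (1, 2, 3)
    s = "abc"
    # a full extended slice of an immutable object returns the object
    # itself in CPython, same as the simple-slice specialcase
    assert t[::] is t
    assert s[::] is s

    l = [1, 2, 3, 4]
    # step-1 extended slices behave like the corresponding simple slices
    assert l[0:2:1] == l[0:2]
    # length-changing assignment is allowed when the step is 1
    l[:2:1] = []
    assert l == [3, 4]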

... (backport)

... backwards compatibility. Add Py_Refcnt, Py_Type, Py_Size, and
PyVarObject_HEAD_INIT.

... with %G.

... This also catches another condition that can overflow.
Will backport.

... a large width is passed on 32-bit platforms. Found by Google.
It would be good for people to review this especially carefully and
verify that I don't have an off-by-one error and that there is no other
way to cause overflow.

... of some of the common builtin types.
Use a bit in tp_flags for each common builtin type. Check the bit
to determine if any instance is a subclass of these common types.
The check avoids a function call and an O(n) search of the base
classes. The check is done in the various Py*_Check macros rather than
by calling PyType_IsSubtype().
All the bits are set in tp_flags when the type is declared in the
Objects/*object.c files, because PyType_Ready() is not called for all
the types. Should PyType_Ready() be called for all types? If so, and
the change is made, the changes to the Objects/*object.c files can be
reverted (remove the setting of tp_flags), and Objects/typeobject.c
would also have to be modified to add conditions for Py*_CheckExact()
in addition to each PyType_IsSubtype() check.
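
The effect is observable from Python via type.__flags__; the bit value
below is Py_TPFLAGS_LONG_SUBCLASS as defined in current CPython's
object.h, and is an implementation detail:

    PY_TPFLAGS_LONG_SUBCLASS = 1 << 24  # CPython implementation detail

    class MyInt(int):
        pass

    # the flag is set on the builtin type and inherited by subclasses,
    # so a Py*_Check can be a single bit test instead of a base walk
    assert int.__flags__ & PY_TPFLAGS_LONG_SUBCLASS
    assert MyInt.__flags__ & PY_TPFLAGS_LONG_SUBCLASS
    assert not float.__flags__ & PY_TPFLAGS_LONG_SUBCLASS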

...
* Unified the way intobject, longobject and mystrtoul handle
  values around -sys.maxint-1.
* In general, trying to entirely avoid overflows in any computation
  involving signed ints or longs is extremely involved. Fixed a few
  simple cases where a compiler might be too clever (but that's all
  guesswork).
* More overflow checks against bad data in marshal.c.
* 2.5-specific: fixed a number of places that were still confusing int
  and Py_ssize_t. Some of them could potentially have caused
  "real-world" breakage.
* list.pop(x): fixing overflow issues on x was messy. I just reverted
  to PyArg_ParseTuple("n"), which does the right thing. (An obscure
  test was trying to give a Decimal to list.pop()... doesn't make
  sense any more IMHO.)
* Trying to write a few tests...
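
For the first point, a small Python 2 boundary check (a behaviour
sketch, not taken from the patch; on a 32-bit build sys.maxint is
2**31 - 1):

    import sys

    # the most negative machine int must come back from int() as an
    # int, while anything below it has to be promoted to a long
    assert type(int(str(-sys.maxint - 1))) is int
    assert type(int(str(-sys.maxint - 2))) is long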

... __oct__, __hex__ don't return a string.
Klocwork 308
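
Presumably the failure mode being guarded against, as a minimal
Python 2 reproduction (hex() and oct() delegate to these hooks):

    class Bad(object):
        def __hex__(self):
            return 42          # not a string

    try:
        hex(Bad())             # now raises TypeError instead of
    except TypeError:          # using a bogus result
        pass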

... I modified this patch a bit: fixed some style issues, added some
error checking, and added XXX comments. This patch requires review, and
some changes are to be expected. I'm checking in now to get the
greatest possible review and to establish a baseline for moving
forward. I don't want this to hold up the release if possible.

... Pass the char* and size around rather than PyObjects.

... Bottom factor out some common code.

... Also improve the error message on overflow.

... first argument.

... about extra semi-colons. It may have been the HP C compiler.
This file will trigger a bunch of those warnings now.

... and use it for string copy operations. This gives a 20% speedup on
some string benchmarks.

... where appropriate

... patterns in a string, only the number needed by the max limit.
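
Assuming this refers to the counting pass behind methods that take a
maximum count, such as replace(), the visible behaviour is:

    s = "spam " * 1000
    # with a count of 2, only the first two occurrences matter, so the
    # scan can stop as soon as two matches have been found
    assert s.replace("spam", "eggs", 2).startswith("eggs eggs spam")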

... find helpers; updated unicodeobject to use stringlib_count

... (If compiled without FAST search support, changed the pre-memcmp
test to check the last character as well as the first. This gave a 25%
speedup for my test case.)
Rewrote the split algorithms so they stop when maxsplit gets to 0.
Previously they did a string match first and then checked whether
maxsplit had been reached. The new way prevents a needless string
search.
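
The maxsplit behaviour in question (the rewrite just stops searching at
this point instead of matching first):

    # once maxsplit is exhausted, the remainder is returned verbatim,
    # so no further searching is needed
    assert "a b c d".split(" ", 2) == ["a", "b", "c d"]
    assert "a b c d".split(" ", 0) == ["a b c d"]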

... broken, someone would have noticed by now ;-)

... the algorithm.

... results list.
Originally it allocated 0 items and used the list growth during append.
Now it preallocates 12 items, so the first few appends don't need list
reallocs.

    ("Here are some words ."*2).split(None, 1) is 7% faster
    ("Here are some words ."*2).split() is 15% faster

(Your mileage may vary, see dealership for details.)
File parsing like this:

    for line in f:
        count += len(line.split())

is also about 15% faster. There is a slowdown of about 3% for large
strings because of the additional overhead of checking whether the
append is to a preallocated region of the list or not. This will be the
rare case. It could be improved with special-case code, but we decided
it was not useful enough.
There is a cost of 12*sizeof(PyObject *) bytes per list. For the normal
case of file parsing this is not a problem, because the lists have a
short lifetime. We have not come up with cases where this is a problem
in real life.
I chose 12 because human text averages about 11 words per line in
books, one of my data sets averages 6.2 words with a final peak at 11
words per line, and I work with a tab-delimited data set with 8 tabs
per line (or 9 words per line). 12 encompasses all of these.
Also changed the last rstrip code to append then reverse, rather than
doing insert(0). The strip() and rstrip() times are now comparable.
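
The two measurements above can be re-run with timeit (numbers will of
course vary by machine and Python version):

    import timeit

    setup = 's = "Here are some words ." * 2'
    print(timeit.timeit("s.split()", setup=setup))
    print(timeit.timeit("s.split(None, 1)", setup=setup))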

... length (thanks, Neal!). And yes, I've verified that this doesn't
slow things down ;-)

... ~15% faster for the current tests (which is noticeably faster than
a corresponding find call). Thanks to Neal-who-never-sleeps for the
tip.

... feel free to improve the documentation and the docstrings.

... this is on par with a corresponding find, and nearly twice as fast
as split(sep, 1).
Full tests, a unicode version, and documentation will follow tomorrow.
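
The comparison to split(sep, 1) suggests this is str.partition (an
assumption; the subject line is truncated): split on the first
occurrence of sep, always yielding a 3-tuple:

    head, sep, tail = "key=value=more".partition("=")
    assert (head, sep, tail) == ("key", "=", "value=more")

    # unlike split(sep, 1), the result shape is fixed even when sep
    # does not occur at all
    assert "nosep".partition("=") == ("nosep", "", "")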