| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Simplified version of Neal Norwitz's patch which adds gotos for
opcodes that set "why". This skips a number of tests where the
outcome of the tests are known in advance.
|
| |
|
| |
|
|
|
|
|
| |
SF patch 825639
http://mail.python.org/pipermail/python-dev/2003-October/039445.html
|
|
|
|
|
|
| |
no-cyclic-comparison patch at the same time as errors.c.
Reverting ceval.c to the previous revision.
|
| |
|
|
|
|
|
|
| |
* Py_BuildValue("(OOO)",a,b,c) --> PyTuple_Pack(3,a,b,c)
* Py_BuildValue("()",a) --> PyTuple_New(0)
* Py_BuildValue("O", a) --> Py_INCREF(a)
|
|
|
|
| |
Will backport.
|
|
|
|
|
|
|
|
|
| |
A new API (only accessible from C) to interrupt a thread by sending it
an exception. This is not always effective, but might help some people.
Requested by Just van Rossum and Alex Martelli. It is intentional
that you have to write your own C extension to call it from Python.
Docs will have to wait.
|
|
|
|
| |
gives a small speedup).
|
|
|
|
|
| |
The fast_function() inlining optimization only
applies when there are zero keyword arguments.
|
|
|
|
| |
reported by Kurt B. Kaiser.
|
|
|
|
|
|
| |
[ 729622 ] line tracing hook errors
with massaging from me to integrate test into test suite.
|
|
|
|
| |
The additional code complexity and new NOP opcode were not worth it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Can now test for basic blocks.
* Optimize inverted comparisions.
* Optimize unary_not followed by a conditional jump.
* Added a new opcode, NOP, to keep code size constant.
* Applied NOP to previous transformations where appropriate.
Note, the NOP would not be necessary if other functions were
added to re-target jump addresses and update the co_lnotab mapping.
That would yield slightly faster and cleaner bytecode at the
expense of optimizer simplicity and of keeping it decoupled
from the line-numbering structure.
|
| |
|
|
|
|
|
|
| |
recursively.
- pdb has a new command, "debug", which lets you step through
arbitrary code from the debugger's (pdb) prompt.
|
|
|
|
|
|
|
|
| |
Added two predictions:
GET_ITER --> FOR_ITER
FOR_ITER --> STORE_FAST or UNPACK_SEQUENCE
Improves timings on pybench and timeit.py. Pystone results are neutral.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Applied to common cases:
COMPARE_OP is often followed by a JUMP_IF.
JUMP_IF is usually followed by POP_TOP.
Shows improved timings on PyStone, PyBench, and specific tests
using timeit.py:
python timeit.py -s "x=1" "if x==1: pass"
python timeit.py -s "x=1" "if x==2: pass"
python timeit.py -s "x=1" "if x: pass"
python timeit.py -s "x=100" "while x!=1: x-=1"
Potential future candidates:
GET_ITER predicts FOR_ITER
FOR_ITER predicts STORE_FAST or UNPACK_SEQUENCE
Also, applied missing goto fast_next_opcode to DUP_TOPX.
|
|
|
|
|
|
|
|
| |
My previous patches should have used fast_next_opcode
in a few places instead of continue.
Also, applied one PyInt_AS_LONG macro in a place where
the type had already been checked.
|
| |
|
|
|
|
|
|
|
| |
change _PyEval_SliceIndex to round massively negative longs up to
-INT_MAX, instead of 0 but botched it. Get it right.
Thx to Armin for the report.
|
|
|
|
|
| |
* List/Tuple checkexact is faster for the common case.
* Testing for Py_True and Py_False can be inlined for faster looping.
|
|
|
|
| |
instead of a plain PyObject *. (SF patch #686601 by Ben Laurie.)
|
|
|
|
| |
Incorporated nnorwitz's comment re. Py__USING_UNICODE.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-DCALL_PROFILE: Count the number of function calls executed.
When this symbol is defined, the ceval mainloop and helper functions
count the number of function calls made. It keeps detailed statistics
about what kind of object was called and whether the call hit any of
the special fast paths in the code.
Optimization:
When we take the fast_function() path, which seems to be taken for
most function calls, and there is minimal frame setup to do, avoid
call PyEval_EvalCodeEx(). The eval code ex function does a lot of
work to handle keywords args and star args, free variables,
generators, etc. The inlined version simply allocates the frame and
copies the arguments values into the frame.
The optimization gets a little help from compile.c which adds a
CO_NOFREE flag to code objects that don't have free variables or cell
variables. This change allows fast_function() to get into the fast
path with fewer tests.
I measure a couple of percent speedup in pystone with this change, but
there's surely more that can be done.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make the code slightly shorter, faster, and easier to
read.
* Eliminate unused DUP_TOPX code for x==1.
compile.c always generates DUP_TOP instead.
* Since only two cases remain for DUP_TOPX, replace
the switch-case with if-elseif.
* The in-lined integer compare does a CheckExact on
both arguments. Since the second is a little more
likely to fail, test it first.
* The switch-case for IS/IS_NOT and IN/NOT_IN can
separate the regular and inverted cases with no
additional work. For all four paths, saves a test and
jump.
|
|
|
|
|
|
| |
The two are semantically equivalent, but the first triggered a compiler
warning about an unused variable. Note, the preceding steps had already
accessed and decreffed the variable so the reference counts were fine.
|
|
|
|
|
|
|
|
| |
parameter being either four or five. Currently, compile.c does not
generate calls with a parameter higher than three.
May have to be reverted if the second alpha or beta shakes out some
other tool generating this op code with a parameter of four or five.
|
|
|
|
|
| |
when a string exception is raised. Note that raising string exceptions
is deprecated in an exception message.
|
|
|
|
|
|
|
|
|
|
|
| |
Replaced groups of pushes and pops with indexed access to the stack and
a single adjustment (if needed) to the stacklevel.
Avoids scores of unnecessary increments and decrements to the stackpointer.
Removes unnecessary sequential dependencies so that the compiler has more
freedom for optimizations. Frees the processor for more parallel and
pipelined execution by using mostly read-only access and having few pointer
adjustments just prior to a read or write.
|
| |
|
|
|
|
|
|
|
| |
[ 631276 ] Exceptions raised by line trace function
It conflicted with the patches from Armin I just checked it, so I had
to so some bits by hand.
|
|
|
|
|
|
|
|
| |
[ 617309 ] getframe hook (Psyco #1)
[ 617311 ] Tiny profiling info (Psyco #2)
[ 617312 ] debugger-controlled jumps (Psyco #3)
These are forward ports from 2.2.2.
|
|
|
|
| |
Fixes a test failure on 64 bit platforms (I hope).
|
|
|
|
|
|
|
| |
all along. Before instr_lb tended to be too high.
I don't think this actually makes any difference, given what the compiler
produces, but it makes me a bit happier.
|
|
|
|
|
|
|
| |
patch #617312, both on the trunk and the 22-maint branch.
Also added a test case, and ported the test_trace I wrote for HEAD
to 2.2.2 (with all those horrible extra 'line' events ;-).
|
|
|
|
|
| |
This makes things a touch more like 2.2. Read the comments in
Python/ceval.c for more details.
|
|
|
|
|
|
| |
than when this interval was first established. Checking too frequently just
adds needless overhead because most of the time there is nothing to do and
no other threads ready to run.
|
|
|
|
|
|
|
|
|
|
| |
globals, _Py_Ticker and _Py_CheckInterval. This also implements Jeremy's
shortcut in Py_AddPendingCall that zeroes out _Py_Ticker. This allows the
test in the main loop to only test a single value.
The gory details are at
http://python.org/sf/602191
|
|
|
|
|
|
|
|
|
|
| |
Use a slightly different strategy to determine when not to call the line
trace function. This removes the need for the RETURN_NONE opcode, so
that's gone again. Update docs and comments to match.
Thanks to Neal and Armin!
Also add a test suite. This should have come with the original patch...
|
|
|
|
|
|
| |
required number of args is 0 or 1 -- were reversed. Also change "1"
into "exactly one", the same words as used elsewhere for this
condition.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in LOAD_GLOBAL. Besides saving a C function call, it saves checks
whether f_globals and f_builtins are dicts, and extracting and testing
the string object's hash code is done only once. We bail out of the
inlining if the name is not exactly a string, or when its hash is -1;
because of interning, neither should ever happen. I believe interning
guarantees that the hash code is set, and I believe that the 'names'
tuple of a code object always contains interned strings, but I'm not
assuming that -- I'm simply testing hash != -1.
On my home machine, this makes a pystone variant with new-style
classes and slots run at the same speed as classic pystone! (With
new-style classes but without slots, it is still a lot slower.)
|
|
|
|
|
|
|
|
|
| |
Also, don't handle METH_OLDARGS on the fast path. All the interesting
builtins have been converted to use METH_NOARGS, METH_O, or
METH_VARARGS.
Result is another 1-2% speedup. If I can cobble together 10 of these,
it might make a difference.
|
|
|
|
|
|
|
| |
This makes the code much easier to ready, because it is at a sane
indentation level. On my box this shows a 1-2% speedup, which means
nothing, except that I'm not going to worry about the performance
effects of the change.
|
|
|
|
|
|
|
| |
nothing special done if keyword arguments were present, so test for
that earlier and fall through to the normal case if there are any.
This ought to slow down CFunction calls with keyword args, but I don't
care; it's a tiny (1%) improvement for pystone.
|