path: root/Python/optimizer.c
Commit message | Author | Age | Files | Lines
* gh-105481: remove regen-opcode. Generated _PyOpcode_Caches in regen-cases. (#108367) | Irit Katriel | 2023-08-23 | 1 | -1/+0
* gh-106581: Project through calls (#108067) | Guido van Rossum | 2023-08-17 | 1 | -1/+89
  This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
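  The projection through calls described above can be pictured with a toy model. This is only a sketch in Python, not the C code in optimizer.c: the `project` function, the string instructions, and the `functions` table are all hypothetical, and the real check for a "short, simple function" is left out.

```python
def project(code, functions, max_len=16):
    """Toy superblock projection that follows calls.

    `code` is a list of instructions. ("CALL", name) enters functions[name];
    "RETURN" leaves the current function. Everything here is illustrative.
    """
    trace = []
    callers = []  # saved (code, resume_index) pairs, one per pushed frame
    i = 0
    while len(trace) < max_len and i < len(code):
        instr = code[i]
        if isinstance(instr, tuple) and instr[0] == "CALL":
            callee = functions.get(instr[1])
            if callee is None:
                break  # unsupported call: stop appending to the trace
            trace.append("_PUSH_FRAME")
            callers.append((code, i + 1))
            code, i = callee, 0
            continue
        if instr == "RETURN":
            if not callers:
                break  # returning from the outermost frame ends the trace
            trace.append("_POP_FRAME")
            code, i = callers.pop()
            continue
        trace.append(instr)
        i += 1
    return trace
```

  With a small `max_len`, the loop stops in the middle of the called function, which mirrors the "running out of space" case mentioned in the commit message.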
* gh-106581: Split `CALL_PY_EXACT_ARGS` into uops (#107760) | Guido van Rossum | 2023-08-16 | 1 | -0/+9
  This is only the first step for doing `CALL` in Tier 2. The next step involves tracing into the called code object and back. After that we'll have to do the remaining `CALL` specialization. Finally we'll have to deal with `KW_NAMES`.
  Note: this moves setting `frame->return_offset` directly in front of `DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.
* gh-107557: Setup abstract interpretation (#107847) | Ken Jin | 2023-08-15 | 1 | -4/+13
  Co-authored-by: Guido van Rossum <gvanrossum@users.noreply.github.com>
  Co-authored-by: Jules <57632293+juliapoo@users.noreply.github.com>
* gh-107758: Improvements to lltrace feature (#107757) | Guido van Rossum | 2023-08-08 | 1 | -1/+1
  - The `dump_stack()` method could call a `__repr__` method implemented in Python, causing (infinite) recursion. I rewrote it to only print out the values for some fundamental types (`int`, `str`, etc.); for everything else it just prints `<type_name @ 0xdeadbeef>`.
  - The lltrace-like feature for uops wrote to `stderr`, while the one in `ceval.c` writes to `stdout`; I changed the uops to write to `stdout` as well.
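  The `dump_stack()` change above can be illustrated with a small Python analogue. This is a sketch only: the real code is C, the helper name `safe_repr` is hypothetical, and the exact set of "fundamental" types below is an assumption (the commit message names only `int` and `str`).

```python
# Assumed set of fundamental types; the commit only says "`int`, `str`, etc."
_FUNDAMENTAL_TYPES = (int, float, str, bool, type(None))

def safe_repr(value):
    """Describe a stack value without ever running a user-defined __repr__."""
    # Exact type check, so subclasses with a custom __repr__ are excluded.
    if type(value) in _FUNDAMENTAL_TYPES:
        return repr(value)
    return f"<{type(value).__name__} @ {id(value):#x}>"
```

  Because the fallback branch only reads the type name and the object address, an object whose `__repr__` recurses or raises can still be printed safely.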
* gh-106608: make uop trace variable length (#107531) | Ivin Lee | 2023-08-05 | 1 | -13/+4
  Executors are now more like tuples.
* GH-104584: Fix incorrect uoperands (GH-107513) | Brandt Bucher | 2023-07-31 | 1 | -1/+2
* GH-104584: Miscellaneous fixes for -Xuops (GH-106908) | Brandt Bucher | 2023-07-20 | 1 | -0/+1
* gh-106603: Make uop struct a triple (opcode, oparg, operand) (#106794) | Guido van Rossum | 2023-07-17 | 1 | -40/+51
* gh-106529: Generate uops for POP_JUMP_IF_[NOT_]NONE (#106796) | Guido van Rossum | 2023-07-17 | 1 | -0/+17
  These aren't automatically translated because (ironically) they are macros deferring to POP_JUMP_IF_{TRUE,FALSE}, which are not viable uops (being manually translated).
  The hack is that we emit IS_NONE and then set opcode and jump to the POP_JUMP_IF_{TRUE,FALSE} translation code.
* gh-106529: Split FOR_ITER_{LIST,TUPLE} into uops (#106696) | Guido van Rossum | 2023-07-14 | 1 | -23/+64
  Also rename `_ITER_EXHAUSTED_XXX` to `_IS_ITER_EXHAUSTED_XXX` to make it clear this is a test.
* gh-106529: Split FOR_ITER_RANGE into uops (#106638) | Guido van Rossum | 2023-07-12 | 1 | -1/+23
  For an example of what this does for Tier 1 and Tier 2, see https://github.com/python/cpython/issues/106529#issuecomment-1631649920
* gh-105481: move Python/opcode_metadata.h to Include/internal/pycore_opcode_metadata.h (#106673) | Irit Katriel | 2023-07-12 | 1 | -1/+1
* gh-106529: Implement JUMP_FORWARD in uops (with test) (#106651) | Guido van Rossum | 2023-07-11 | 1 | -0/+7
  Note that this may generate two SAVE_IP uops in a row. Removing unneeded SAVE_IP uops is the optimizer's job.
* gh-104584: readability improvements in optimizer.c (#106641) | Irit Katriel | 2023-07-11 | 1 | -18/+19
* gh-106529: Support JUMP_BACKWARD in Tier 2 (uops) (#106543) | Guido van Rossum | 2023-07-11 | 1 | -2/+13
  During superblock generation, a JUMP_BACKWARD instruction is translated to either a JUMP_TO_TOP micro-op (when the target of the jump is exactly the beginning of the superblock, closing the loop), or a SAVE_IP + EXIT_TRACE pair (when the jump goes elsewhere).
  The new JUMP_TO_TOP instruction includes a CHECK_EVAL_BREAKER() call, so a closed loop can still be interrupted.
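  The two-way translation of JUMP_BACKWARD can be written down as a small decision function. This is a Python sketch of the rule as stated in the commit message, not the C implementation: the function name, the `(name, oparg)` tuple layout, and the use of the jump target as the SAVE_IP operand are assumptions made for illustration.

```python
def translate_jump_backward(target, trace_start):
    """Return the uops a JUMP_BACKWARD to `target` would be translated to.

    `trace_start` is the instruction offset at which the superblock begins.
    Uops are modelled as hypothetical (name, oparg) tuples.
    """
    if target == trace_start:
        # The jump closes the loop: stay inside the superblock.
        return [("JUMP_TO_TOP", 0)]
    # The jump goes elsewhere: record where to resume, then leave the trace.
    return [("SAVE_IP", target), ("EXIT_TRACE", 0)]
```

  Only the first branch produces a loop that never leaves the superblock, which is why the commit pairs JUMP_TO_TOP with an eval-breaker check.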
* gh-106529: Silence compiler warning in jump target patching (#106613) | Guido van Rossum | 2023-07-11 | 1 | -2/+2
  (gh-106551 caused a compiler warning on Windows.)
* gh-106529: Implement POP_JUMP_IF_XXX uops (#106551) | Guido van Rossum | 2023-07-10 | 1 | -27/+84
  - Hand-written uops JUMP_IF_{TRUE,FALSE}. These peek at the top of the stack. The jump target (in superblock space) is absolute.
  - Hand-written translation for POP_JUMP_IF_{TRUE,FALSE}, assuming the jump is unlikely. Once we implement jump-likelihood profiling, we can implement the jump-likely case (in another PR).
  - Tests (including some test cleanup).
  - Improvements to len(ex) and ex[i] to expose the whole trace.
* gh-104584: Replace ENTER_EXECUTOR with the original in trace projection (#106526) | Guido van Rossum | 2023-07-07 | 1 | -0/+6
* gh-104584: Move super-instruction special-casing to generator (#106500) | Guido van Rossum | 2023-07-07 | 1 | -37/+16
  Instead of special-casing specific instructions, we add a few more special values to the 'size' field of expansions, so in the future we can automatically handle additional super-instructions in the generator.
* gh-104584: Handle EXTENDED_ARG in superblock creation (#106489) | Guido van Rossum | 2023-07-06 | 1 | -0/+16
  With test.
* gh-104584: Clean up and fix uops tests and fix crash (#106492) | Guido van Rossum | 2023-07-06 | 1 | -2/+8
  The uops test wasn't testing anything by default, and was failing when run with -Xuops.
  Made the two executor-related context managers global, so TestUops can use them (notably `with temporary_optimizer(opt)`).
  Made clear_executor() a little more thorough.
  Fixed a crash upon finalizing a uop optimizer, by adding a `tp_dealloc` handler.
* gh-104584: Fix error handling from backedge optimization (#106484) | Guido van Rossum | 2023-07-06 | 1 | -7/+11
  When `_PyOptimizer_BackEdge` returns `NULL`, we should restore `next_instr` (and `stack_pointer`). To accomplish this we should jump to `resume_with_error` instead of just `error`.
  The problem this causes is subtle -- the only repro I have is in PR gh-106393, at commit d7df54b139bcc47f5ea094bfaa9824f79bc45adc. But the fix is real (as shown later in that PR).
  While we're at it, also improve the debug output: the offsets at which traces are identified are now measured in bytes, and always show the start offset. This makes it easier to correlate executor calls with optimizer calls, and either with `dis` output.
  Issue: gh-104584
* GH-106360: Support very basic superblock introspection (#106422) | Mark Shannon | 2023-07-04 | 1 | -0/+70
  * Add len() and indexing support to uop superblocks.
* gh-106290: Fix edge cases around uops (#106319) | Guido van Rossum | 2023-07-03 | 1 | -60/+72
  - Tweak uops debugging output
  - Fix the bug from gh-106290
  - Rename `SET_IP` to `SAVE_IP` (per https://github.com/faster-cpython/ideas/issues/558)
  - Add a `SAVE_IP` uop at the start of the trace (ditto)
  - Allow `unbound_local_error`; this gives us uops for `LOAD_FAST_CHECK`, `LOAD_CLOSURE`, and `DELETE_FAST`
  - Longer traces
  - Support `STORE_FAST_LOAD_FAST`, `STORE_FAST_STORE_FAST`
  - Add deps on pycore_uops.h to Makefile(.pre.in)
* gh-106320: Use _PyInterpreterState_GET() (#106336) | Victor Stinner | 2023-07-02 | 1 | -5/+4
  Replace PyInterpreterState_Get() with inlined _PyInterpreterState_GET().
* gh-104584: Emit macro expansions to opcode_metadata.h (#106163) | Guido van Rossum | 2023-06-28 | 1 | -3/+5
  This produces longer traces (superblocks?).
  Also improved debug output (uop names are now printed instead of numeric opcodes). This would be simpler if the numeric opcode values were generated by generate_cases.py, but that's another project.
  Refactored some code in generate_cases.py so the essential algorithm for cache effects is only run once. (Deciding which effects are used and what the total cache size is, regardless of what's used.)
* gh-104584: Baby steps towards generating and executing traces (#105924) | Guido van Rossum | 2023-06-27 | 1 | -0/+199
  Added a new, experimental, tracing optimizer and interpreter (a.k.a. "tier 2"). This currently pessimizes, so don't use yet -- this is infrastructure so we can experiment with optimizing passes.
  To enable it, pass ``-Xuops`` or set ``PYTHONUOPS=1``. To get debug output, set ``PYTHONUOPSDEBUG=N`` where ``N`` is a debug level (0-4, where 0 is no debug output and 4 is excessively verbose).
  All of this code is likely to change dramatically before the 3.13 feature freeze. But this is a first step.
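  The switches named above can be combined into a command line and environment as follows. This is a sketch under assumptions: the helper `uops_invocation` and the script name passed to it are hypothetical, and the flags only have an effect on a CPython build from this development period that contains the tier 2 code.

```python
import os
import sys

def uops_invocation(script, debug_level=0, use_env=False):
    """Build (argv, env) for running `script` with the uops optimizer enabled.

    Uses only the switches named in the commit message: -Xuops or
    PYTHONUOPS=1 to enable, PYTHONUOPSDEBUG=N (0-4) for debug output.
    """
    if not 0 <= debug_level <= 4:
        raise ValueError("debug_level must be between 0 and 4")
    env = dict(os.environ)
    argv = [sys.executable]
    if use_env:
        env["PYTHONUOPS"] = "1"
    else:
        argv.append("-Xuops")
    if debug_level:
        env["PYTHONUOPSDEBUG"] = str(debug_level)
    argv.append(script)
    return argv, env
```

  The returned pair can be handed to `subprocess.run(argv, env=env)`; the helper itself does not start a process.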
* GH-104584: Assorted fixes for the optimizer API. (GH-105683) | Mark Shannon | 2023-06-19 | 1 | -28/+45
  * Add test for long loops
  * Clear ENTER_EXECUTOR when deopting code objects.
* GH-100987: Allow objects other than code objects as the "executable" of an internal frame. (GH-105727) | Mark Shannon | 2023-06-14 | 1 | -3/+5
  * Add table describing possible executable classes for out-of-process debuggers.
  * Remove shim code object creation code as it is no longer needed.
  * Make lltrace a bit more robust w.r.t. non-standard frames.
* GH-104584: Allow optimizers to opt out of optimizing. (GH-105244) | Mark Shannon | 2023-06-05 | 1 | -11/+18
* GH-104584: Plugin optimizer API (GH-105100) | Mark Shannon | 2023-06-02 | 1 | -0/+254