path: root/Python/optimizer.c
Commit message  Author  Age  Files  Lines
* gh-111848: Clean up RESERVE() macro (#112274) Guido van Rossum 2023-11-20 1 -14/+9
| Also avoid compiler warnings about unused 'reserved' variable.
* gh-106529: Make FOR_ITER a viable uop (#112134) Guido van Rossum 2023-11-20 1 -0/+6
| This uses the new mechanism whereby certain uops are replaced by others during translation, using the `_PyUop_Replacements` table.
| We further special-case the `_FOR_ITER_TIER_TWO` uop to update the deoptimization target to point just past the corresponding `END_FOR` opcode.
| Two tiny code cleanups are also part of this PR.
* Various small improvements to uop debug output (#112218) Guido van Rossum 2023-11-17 1 -9/+9
| - Show uop name in Error/DEOPT messages
| - Add target to some messages
| - Expose uop_name() as _PyUopName()
* gh-106529: Cleanups split off gh-112134 (#112214) Guido van Rossum 2023-11-17 1 -1/+20
| - Double max trace size to 256
| - Add a dependency on executor_cases.c.h for ceval.o
| - Mark `_SPECIALIZE_UNPACK_SEQUENCE` as `TIER_ONE_ONLY`
| - Add debug output back showing the optimized trace
| - Bunch of cleanups to Tools/cases_generator/
* GH-111848: Set the IP when de-optimizing (GH-112065) Mark Shannon 2023-11-15 1 -26/+22
| * Replace jumps with deopts in tier 2
| * Fewer special cases of uop names
| * Add target field to uop IR
| * Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops
| * Extend whitelist of non-escaping API functions.
* GH-111848: Convert remaining jumps to deopts into tier 2 code. (GH-112045) Mark Shannon 2023-11-14 1 -39/+26
|
* GH-111843: Tier 2 exponential backoff (GH-111850) Mark Shannon 2023-11-09 1 -4/+5
|
* GH-109369: Exit tier 2 if executor is invalid (GH-111657) Mark Shannon 2023-11-09 1 -3/+4
|
* GH-111848: Tidy up tier 2 handling of FOR_ITER specialization by using DEOPT_IF instead of jumps. (GH-111849) Mark Shannon 2023-11-08 1 -40/+11
|
* GH-111646: Simplify optimizer, by compacting uops when making executor. (GH-111647) Mark Shannon 2023-11-06 1 -114/+87
|
* gh-111520: Integrate the Tier 2 interpreter in the Tier 1 interpreter (#111428) Guido van Rossum 2023-11-01 1 -6/+15
| - There is no longer a separate Python/executor.c file.
| - Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`, you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
| - The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
| - In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
| - On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
| - **Beware!** This changes the env vars to enable uops and their debugging to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
* GH-111339: Fix initialization and finalization of static optimizer types (GH-111430) Savannah Ostrowski 2023-10-29 1 -14/+10
|
* gh-109094: replace frame->prev_instr by frame->instr_ptr (#109095) Irit Katriel 2023-10-26 1 -6/+4
|
* GH-111339: Change `valid` property of executors to `is_valid()` method (GH-111350) Mark Shannon 2023-10-26 1 -13/+13
|
* GH-109214: _SET_IP before _PUSH_FRAME (but not _POP_FRAME) (GH-111001) Brandt Bucher 2023-10-24 1 -8/+6
|
* GH-109369: Add machinery for deoptimizing tier2 executors, both individually and globally. (GH-110384) Mark Shannon 2023-10-23 1 -2/+233
|
* GH-109214: Convert _SAVE_CURRENT_IP to _SET_IP in tier 2 trace creation. (GH-110755) Mark Shannon 2023-10-12 1 -3/+5
|
* gh-109329: Add stat for "trace too short" (GH-110402) Michael Droettboom 2023-10-05 1 -0/+1
|
* GH-109329: Add tier 2 stats (GH-109913) Michael Droettboom 2023-10-04 1 -2/+11
|
* GH-104584: Don't call executors from JUMP_BACKWARD (GH-109347) Brandt Bucher 2023-09-13 1 -15/+6
|
* gh-109214: Rename SAVE_IP to _SET_IP, and similar (#109285) Guido van Rossum 2023-09-11 1 -30/+30
| * Rename SAVE_IP to _SET_IP
| * Rename EXIT_TRACE to _EXIT_TRACE
| * Rename SAVE_CURRENT_IP to _SAVE_CURRENT_IP
| * Rename INSERT to _INSERT (This is for Ken Jin's abstract interpreter)
| * Rename IS_NONE to _IS_NONE
| * Rename JUMP_TO_TOP to _JUMP_TO_TOP
* gh-109039: Branch prediction for Tier 2 interpreter (#109038) Guido van Rossum 2023-09-11 1 -7/+21
| This adds a 16-bit inline cache entry to the conditional branch instructions POP_JUMP_IF_{FALSE,TRUE,NONE,NOT_NONE} and their instrumented variants, which is used to keep track of the branch direction. Each time we encounter these instructions we shift the cache entry left by one and set the bottom bit to whether we jumped.
|
| Then when it's time to translate such a branch to Tier 2 uops, we use the bit count from the cache entry to decide whether to continue translating the "didn't jump" branch or the "jumped" branch.
|
| The counter is initialized to a pattern of alternating ones and zeros to avoid bias.
|
| The .pyc file magic number is updated.
|
| There's a new test, some fixes for existing tests, and a few miscellaneous cleanups.
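The branch-history scheme described in the entry above can be sketched in a few lines of Python. This is only an illustration: the names `record_branch`, `jump_is_likely`, and `INITIAL_HISTORY` are hypothetical, the value `0x5555` is one possible "alternating ones and zeros" pattern, and the majority threshold is an assumption (the commit message says only that the bit count is used).

```python
CACHE_BITS = 16
CACHE_MASK = (1 << CACHE_BITS) - 1

# Assumed starting pattern: alternating ones and zeros (8 of 16 bits set).
INITIAL_HISTORY = 0x5555


def record_branch(history: int, jumped: bool) -> int:
    """Shift the history left by one and store the new outcome in bit 0."""
    return ((history << 1) | int(jumped)) & CACHE_MASK


def jump_is_likely(history: int) -> bool:
    """Assumed rule: predict 'jumped' when more than half the bits are set."""
    return bin(history).count("1") > CACHE_BITS // 2
```

With this rule the unbiased starting pattern predicts "didn't jump", and sixteen consecutive taken branches saturate the history to all ones.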
* GH-104584: Restore frame->stacktop on optimizer error (GH-108953) Brandt Bucher 2023-09-06 1 -0/+1
|
* gh-108765: Cleanup #include in Python/*.c files (#108977) Victor Stinner 2023-09-06 1 -3/+3
| Mention one symbol imported by each #include.
* gh-108727: Fix segfault due to missing tp_dealloc definition for CounterOptimizer_Type (GH-108734) Irit Katriel 2023-09-01 1 -0/+1
|
* gh-107557: Remove unnecessary SAVE_IP instructions (#108583) Guido van Rossum 2023-08-29 1 -24/+116
| Also remove NOP instructions.
|
| The "stubs" are not optimized in this fashion (their SAVE_IP should always be preserved since it's where to jump next, and they don't contain NOPs by their nature).
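As a rough illustration of this kind of cleanup pass, the sketch below drops NOPs and collapses runs of SAVE_IP uops so that only the last one survives. The rule used here is a deliberate simplification and an assumption: the commit message does not spell out when a SAVE_IP counts as unnecessary, and the `(name, operand)` tuple representation and the function name `strip_redundant` are hypothetical.

```python
def strip_redundant(trace):
    """Return a copy of trace without NOPs or superseded SAVE_IP uops.

    Simplified, assumed rule: a SAVE_IP is redundant when the next
    non-NOP uop is another SAVE_IP, because the later one overwrites it.
    """
    kept = []
    for name, operand in trace:
        if name == "NOP":
            continue  # NOPs carry no effect and are always dropped
        if name == "SAVE_IP" and kept and kept[-1][0] == "SAVE_IP":
            kept[-1] = (name, operand)  # later SAVE_IP replaces the earlier one
            continue
        kept.append((name, operand))
    return kept
```

Stubs would be left out of such a pass, since the entry above notes that their SAVE_IP must always be preserved.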
* gh-105481: remove regen-opcode. Generated _PyOpcode_Caches in regen-cases. (#108367) Irit Katriel 2023-08-23 1 -1/+0
|
* gh-106581: Project through calls (#108067) Guido van Rossum 2023-08-17 1 -1/+89
| This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
* gh-106581: Split `CALL_PY_EXACT_ARGS` into uops (#107760) Guido van Rossum 2023-08-16 1 -0/+9
| * Split `CALL_PY_EXACT_ARGS` into uops
|
| This is only the first step for doing `CALL` in Tier 2. The next step involves tracing into the called code object and back. After that we'll have to do the remaining `CALL` specialization. Finally we'll have to deal with `KW_NAMES`.
|
| Note: this moves setting `frame->return_offset` directly in front of `DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.
* gh-107557: Setup abstract interpretation (#107847) Ken Jin 2023-08-15 1 -4/+13
| Co-authored-by: Guido van Rossum <gvanrossum@users.noreply.github.com>
| Co-authored-by: Jules <57632293+juliapoo@users.noreply.github.com>
* gh-107758: Improvements to lltrace feature (#107757) Guido van Rossum 2023-08-08 1 -1/+1
| - The `dump_stack()` method could call a `__repr__` method implemented in Python, causing (infinite) recursion. I rewrote it to only print out the values for some fundamental types (`int`, `str`, etc.); for everything else it just prints `<type_name @ 0xdeadbeef>`.
| - The lltrace-like feature for uops wrote to `stderr`, while the one in `ceval.c` writes to `stdout`; I changed the uops to write to stdout as well.
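The idea behind the `dump_stack()` change can be illustrated in Python, although the real function is C code in the interpreter. The sketch below is an assumption-laden analogy: `safe_describe` is a hypothetical name, and the exact set of "fundamental" types is guessed from the "`int`, `str`, etc." wording.

```python
# Assumed set of types whose repr() is known not to run user-defined code.
FUNDAMENTAL_TYPES = (int, float, str, bool, type(None))


def safe_describe(value):
    """Describe a value without ever invoking a user-defined __repr__."""
    # Exact type check: subclasses may override __repr__, so they are excluded.
    if type(value) in FUNDAMENTAL_TYPES:
        return repr(value)
    return f"<{type(value).__name__} @ {id(value):#x}>"
```

Because the check is on the exact type, an object whose `__repr__` raises or recurses is still printed safely as a type name and address.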
* gh-106608: make uop trace variable length (#107531) Ivin Lee 2023-08-05 1 -13/+4
| Executors are now more like tuples.
* GH-104584: Fix incorrect uoperands (GH-107513) Brandt Bucher 2023-07-31 1 -1/+2
|
* GH-104584: Miscellaneous fixes for -Xuops (GH-106908) Brandt Bucher 2023-07-20 1 -0/+1
|
* gh-106603: Make uop struct a triple (opcode, oparg, operand) (#106794) Guido van Rossum 2023-07-17 1 -40/+51
|
* gh-106529: Generate uops for POP_JUMP_IF_[NOT_]NONE (#106796) Guido van Rossum 2023-07-17 1 -0/+17
| These aren't automatically translated because (ironically) they are macros deferring to POP_JUMP_IF_{TRUE,FALSE}, which are not viable uops (being manually translated).
|
| The hack is that we emit IS_NONE and then set opcode and jump to the POP_JUMP_IF_{TRUE,FALSE} translation code.
* gh-106529: Split FOR_ITER_{LIST,TUPLE} into uops (#106696) Guido van Rossum 2023-07-14 1 -23/+64
| Also rename `_ITER_EXHAUSTED_XXX` to `_IS_ITER_EXHAUSTED_XXX` to make it clear this is a test.
* gh-106529: Split FOR_ITER_RANGE into uops (#106638) Guido van Rossum 2023-07-12 1 -1/+23
| For an example of what this does for Tier 1 and Tier 2, see https://github.com/python/cpython/issues/106529#issuecomment-1631649920
* gh-105481: move Python/opcode_metadata.h to Include/internal/pycore_opcode_metadata.h (#106673) Irit Katriel 2023-07-12 1 -1/+1
|
* gh-106529: Implement JUMP_FORWARD in uops (with test) (#106651) Guido van Rossum 2023-07-11 1 -0/+7
| Note that this may generate two SAVE_IP uops in a row. Removing unneeded SAVE_IP uops is the optimizer's job.
* gh-104584: readability improvements in optimizer.c (#106641) Irit Katriel 2023-07-11 1 -18/+19
|
* gh-106529: Support JUMP_BACKWARD in Tier 2 (uops) (#106543) Guido van Rossum 2023-07-11 1 -2/+13
| During superblock generation, a JUMP_BACKWARD instruction is translated to either a JUMP_TO_TOP micro-op (when the target of the jump is exactly the beginning of the superblock, closing the loop), or a SAVE_IP + EXIT_TRACE pair, when the jump goes elsewhere.
|
| The new JUMP_TO_TOP instruction includes a CHECK_EVAL_BREAKER() call, so a closed loop can still be interrupted.
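The two-way translation rule for JUMP_BACKWARD described above can be modelled as a small function. This is a sketch, not the C code from optimizer.c: the function name, the `(uop_name, operand)` tuples, and the use of plain integers for bytecode offsets are all assumptions made for illustration.

```python
def translate_jump_backward(target: int, trace_start: int) -> list[tuple[str, int]]:
    """Model how a JUMP_BACKWARD is turned into uops during projection.

    target and trace_start are hypothetical integer bytecode offsets.
    """
    if target == trace_start:
        # The jump returns to the start of the superblock: close the loop.
        return [("JUMP_TO_TOP", 0)]
    # The jump goes elsewhere: record where to resume, then leave the trace.
    return [("SAVE_IP", target), ("EXIT_TRACE", 0)]
```

Only the first case produces a closed loop; in the second, control goes back to the Tier 1 interpreter at the saved instruction pointer.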
* gh-106529: Silence compiler warning in jump target patching (#106613) Guido van Rossum 2023-07-11 1 -2/+2
| (gh-106551 caused a compiler warning on Windows.)
* gh-106529: Implement POP_JUMP_IF_XXX uops (#106551) Guido van Rossum 2023-07-10 1 -27/+84
| - Hand-written uops JUMP_IF_{TRUE,FALSE}. These peek at the top of the stack. The jump target (in superblock space) is absolute.
| - Hand-written translation for POP_JUMP_IF_{TRUE,FALSE}, assuming the jump is unlikely. Once we implement jump-likelihood profiling, we can implement the jump-unlikely case (in another PR).
| - Tests (including some test cleanup).
| - Improvements to len(ex) and ex[i] to expose the whole trace.
* gh-104584: Replace ENTER_EXECUTOR with the original in trace projection (#106526) Guido van Rossum 2023-07-07 1 -0/+6
|
* gh-104584: Move super-instruction special-casing to generator (#106500) Guido van Rossum 2023-07-07 1 -37/+16
| Instead of special-casing specific instructions, we add a few more special values to the 'size' field of expansions, so in the future we can automatically handle additional super-instructions in the generator.
* gh-104584: Handle EXTENDED_ARG in superblock creation (#106489) Guido van Rossum 2023-07-06 1 -0/+16
| With test.
* gh-104584: Clean up and fix uops tests and fix crash (#106492) Guido van Rossum 2023-07-06 1 -2/+8
| The uops test wasn't testing anything by default, and was failing when run with -Xuops.
|
| Made the two executor-related context managers global, so TestUops can use them (notably `with temporary_optimizer(opt)`).
|
| Made clear_executor() a little more thorough.
|
| Fixed a crash upon finalizing a uop optimizer, by adding a `tp_dealloc` handler.
* gh-104584: Fix error handling from backedge optimization (#106484) Guido van Rossum 2023-07-06 1 -7/+11
| When `_PyOptimizer_BackEdge` returns `NULL`, we should restore `next_instr` (and `stack_pointer`). To accomplish this we should jump to `resume_with_error` instead of just `error`.
|
| The problem this causes is subtle -- the only repro I have is in PR gh-106393, at commit d7df54b139bcc47f5ea094bfaa9824f79bc45adc. But the fix is real (as shown later in that PR).
|
| While we're at it, also improve the debug output: the offsets at which traces are identified are now measured in bytes, and always show the start offset. This makes it easier to correlate executor calls with optimizer calls, and either with `dis` output.
|
| * Issue: gh-104584
* GH-106360: Support very basic superblock introspection (#106422) Mark Shannon 2023-07-04 1 -0/+70
| * Add len() and indexing support to uop superblocks.