| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
support of calls. (GH-118322)
* Add CALL_PY_GENERAL, CALL_BOUND_METHOD_GENERAL and call CALL_NON_PY_GENERAL specializations.
* Remove CALL_PY_WITH_DEFAULTS specialization
* Use CALL_NON_PY_GENERAL in more cases when otherwise failing to specialize
|
|
|
|
|
|
| |
* Check tracing in RESUME_CHECK
* Only change to RESUME_CHECK if not tracing
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code for Tier 2 is now only compiled when configured
with `--enable-experimental-jit[=yes|interpreter]`.
We drop support for `PYTHON_UOPS` and -`Xuops`,
but you can disable the interpreter or JIT
at runtime by setting `PYTHON_JIT=0`.
You can also build it without enabling it by default
using `--enable-experimental-jit=yes-off`;
enable with `PYTHON_JIT=1`.
On Windows, the `build.bat` script supports
`--experimental-jit`, `--experimental-jit-off`,
`--experimental-interpreter`.
In the C code, `_Py_JIT` is defined as before
when the JIT is enabled; the new variable
`_Py_TIER2` is defined when the JIT *or* the
interpreter is enabled. It is actually a bitmask:
1: JIT; 2: default-off; 4: interpreter.
|
|
|
| |
Small TSAN fixups for instrumentation
|
|
|
|
| |
(GH-118279)
|
|
|
|
|
|
|
| |
(#116775)
Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe.
Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version. There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
|
|
|
|
|
|
|
| |
We were under-counting calls in `_PyEvalFramePushAndInit`
because the `CALL_STAT_INC` macro was redefined to a no-op
for the Tier 2 interpreter. The fix is not to `#undef` it at all.
This results in ~37% more "Frames pushed" reported
under "Call stats".
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
|
| |
|
|
|
| |
Use critical sections to lock around accesses to cell contents. The critical sections are no-ops in the default (with GIL) build.
|
| |
|
|
|
|
|
| |
Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.
|
| |
|
|
|
| |
Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.
|
|
|
|
|
|
|
| |
(GH-114986)" (GH-116178)
Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)"
This reverts commit 0a61e237009bf6b833e13ac635299ee063377699.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
|
|
|
| |
This fixes level 3 or higher lltrace debug output `--with-pydebug` runs.
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
|
| |
|
|
|
|
| |
(GH-114142)
|
|
|
|
|
|
|
| |
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).
See Tools/jit/README.md for more information on how to install the required build-time tooling.
|
| |
|
|
|
|
| |
classes. (GH-113680)
|
| |
|
| |
|
|
|
|
|
|
| |
It was raised in two cases:
* in the import statement when looking up __import__
* in pickling some builtin type when looking up built-ins iter, getattr, etc.
|
|
|
|
|
|
| |
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
|
|
|
|
|
| |
Previously arbitrary errors could be cleared during formatting error
messages for ImportError or AttributeError for modules. Now all
unexpected errors are reported.
|
|
|
|
| |
* Rename _PyUopExecute to _PyUOpExecute (uppercase O) for consistency
* Also rename _PyUopName and _PyUOp_Replacements, and some output strings
|
|
|
|
| |
(#112216)
|
|
|
|
| |
This makes Windows about 3% faster on pyperformance benchmarks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.
This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.
The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.
For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)
For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
|
|
|
|
|
|
|
|
|
|
| |
This uses the new mechanism whereby certain uops
are replaced by others during translation,
using the `_PyUop_Replacements` table.
We further special-case the `_FOR_ITER_TIER_TWO` uop
to update the deoptimization target to point
just past the corresponding `END_FOR` opcode.
Two tiny code cleanups are also part of this PR.
|
|
|
| |
Rename Py_NOGIL to Py_GIL_DISABLED
|
|
|
|
|
| |
- Show uop name in Error/DEOPT messages
- Add target to some messages
- Expose uop_name() as _PyUopName()
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Replace jumps with deopts in tier 2
* Fewer special cases of uop names
* Add target field to uop IR
* Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops
* Extend whitelist of non-escaping API functions.
|
| |
|
| |
|
|
|
|
|
|
|
| |
(#111794)
In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
|
| |
|
| |
|
|
|
|
| |
(#111459)
|
|
|
|
|
|
| |
Replace most of calls of _PyErr_WriteUnraisableMsg() and some
calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable().
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
|
|
|
|
|
| |
- There is no longer a separate Python/executor.c file.
- Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`,
you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
- The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
- In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
- On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
- **Beware!** This changes the env vars to enable uops and their debugging
to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
|