| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a unified 16-bit backoff counter type (``_Py_BackoffCounter``),
shared between the Tier 1 adaptive specializer and the Tier 2 optimizer. The
API used for adaptive specialization counters is changed but the behavior is
(supposed to be) identical.
The behavior of the Tier 2 counters is changed:
- There are no longer dynamic thresholds (we never varied these).
- All counters now use the same exponential backoff.
- The counter for ``JUMP_BACKWARD`` starts counting down from 16.
- The ``temperature`` in side exits starts counting down from 64.
|
| |
|
|
|
| |
Use critical sections to lock around accesses to cell contents. The critical sections are no-ops in the default (with GIL) build.
|
| |
|
|
|
|
|
| |
Splits the "cold" path, deopts and exits, from the "hot" path, reducing the size of most jitted instructions, at the cost of slower exits.
|
| |
|
|
|
| |
Various tweaks, including a slight refactor of the special cases for `_PUSH_FRAME`/`_POP_FRAME` to show the actual operand emitted.
|
|
|
|
|
|
|
| |
(GH-114986)" (GH-116178)
Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)"
This reverts commit 0a61e237009bf6b833e13ac635299ee063377699.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
|
|
|
| |
This fixes level 3 or higher lltrace debug output `--with-pydebug` runs.
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
|
| |
|
|
|
|
| |
(GH-114142)
|
|
|
|
|
|
|
| |
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).
See Tools/jit/README.md for more information on how to install the required build-time tooling.
|
| |
|
|
|
|
| |
classes. (GH-113680)
|
| |
|
| |
|
|
|
|
|
|
| |
It was raised in two cases:
* in the import statement when looking up __import__
* in pickling some builtin type when looking up built-ins iter, getattr, etc.
|
|
|
|
|
|
| |
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
|
|
|
|
|
| |
Previously arbitrary errors could be cleared during formatting error
messages for ImportError or AttributeError for modules. Now all
unexpected errors are reported.
|
|
|
|
| |
* Rename _PyUopExecute to _PyUOpExecute (uppercase O) for consistency
* Also rename _PyUopName and _PyUOp_Replacements, and some output strings
|
|
|
|
| |
(#112216)
|
|
|
|
| |
This makes Windows about 3% faster on pyperformance benchmarks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.
This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.
The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.
For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)
For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
|
|
|
|
|
|
|
|
|
|
| |
This uses the new mechanism whereby certain uops
are replaced by others during translation,
using the `_PyUop_Replacements` table.
We further special-case the `_FOR_ITER_TIER_TWO` uop
to update the deoptimization target to point
just past the corresponding `END_FOR` opcode.
Two tiny code cleanups are also part of this PR.
|
|
|
| |
Rename Py_NOGIL to Py_GIL_DISABLED
|
|
|
|
|
| |
- Show uop name in Error/DEOPT messages
- Add target to some messages
- Expose uop_name() as _PyUopName()
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Replace jumps with deopts in tier 2
* Fewer special cases of uop names
* Add target field to uop IR
* Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops
* Extend whitelist of non-escaping API functions.
|
| |
|
| |
|
|
|
|
|
|
|
| |
(#111794)
In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
|
| |
|
| |
|
|
|
|
| |
(#111459)
|
|
|
|
|
|
| |
Replace most of calls of _PyErr_WriteUnraisableMsg() and some
calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable().
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
|
|
|
|
|
| |
- There is no longer a separate Python/executor.c file.
- Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`,
you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
- The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
- In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
- On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
- **Beware!** This changes the env vars to enable uops and their debugging
to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
|
|
|
|
| |
instruction. (GH-111486)
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
`sys.monitoring.set_local_events()` (GH-108420)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove <ctype.h> in C files which don't use it; only sre.c and
_decimal.c still use it.
Remove _PY_PORT_CTYPE_UTF8_ISSUE code from pyport.h:
* Code added by commit b5047fd01948ab108edcc1b3c2c901d915814cfd
in 2004 for MacOSX and FreeBSD.
* Test removed by commit 52ddaefb6bab1a74ecffe8519c02735794ebfbe1
in 2007, since Python str type now uses locale independent
functions like Py_ISALPHA() and Py_TOLOWER() and the Unicode
database.
Modules/_sre/sre.c replaces _PY_PORT_CTYPE_UTF8_ISSUE with new
functions: sre_isalnum(), sre_tolower(), sre_toupper().
Remove unused includes:
* _localemodule.c: remove <stdio.h>.
* getargs.c: remove <float.h>.
* dynload_win.c: remove <direct.h>, it no longer calls _getcwd()
since commit fb1f68ed7cc1536482d1debd70a53c5442135fe2 (in 2001).
|
|
|
|
| |
`_POP_FRAME` op. (GH-108685)
|