| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
(GH-114986)" (GH-116178)
Revert "gh-107674: Improve performance of `sys.settrace` (GH-114986)"
This reverts commit 0a61e237009bf6b833e13ac635299ee063377699.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
* Rename `_testinternalcapi.get_{uop,counter}_optimizer` to `new_*_optimizer`
* Use `_PyUOpName()` instead of` _PyOpcode_uop_name[]`
* Add `target` to executor iterator items -- `list(ex)` now returns `(opcode, oparg, target, operand)` quadruples
* Add executor methods `get_opcode()` and `get_oparg()` to get `vmdata.opcode`, `vmdata.oparg`
* Define a helper for printing uops, and unify various places where they are printed
* Add a hack to summarize_stats.py to fix legacy uop names (e.g. `POP_TOP` -> `_POP_TOP`)
* Define helpers in `test_opt.py` for accessing the set or list of opnames of an executor
|
|
|
| |
This fixes level 3 or higher lltrace debug output `--with-pydebug` runs.
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds an `eval_breaker` field to `PyThreadState`. The primary
motivation is for performance in free-threaded builds: with thread-local eval
breakers, we can stop a specific thread (e.g., for an async exception) without
interrupting other threads.
The source of truth for the global instrumentation version is stored in the
`instrumentation_version` field in PyInterpreterState. Threads usually read the
version from their local `eval_breaker`, where it continues to be colocated
with the eval breaker bits.
|
| |
|
|
|
|
| |
(GH-114142)
|
|
|
|
|
|
|
| |
Add an option (--enable-experimental-jit for configure-based builds
or --experimental-jit for PCbuild-based ones) to build an
*experimental* just-in-time compiler, based on copy-and-patch (https://fredrikbk.com/publications/copy-and-patch.pdf).
See Tools/jit/README.md for more information on how to install the required build-time tooling.
|
| |
|
|
|
|
| |
classes. (GH-113680)
|
| |
|
| |
|
|
|
|
|
|
| |
It was raised in two cases:
* in the import statement when looking up __import__
* in pickling some builtin type when looking up built-ins iter, getattr, etc.
|
|
|
|
|
|
| |
* Include destination T1 opcode in Error debug message
* Include destination T1 opcode in DEOPT debug message
* Remove obsolete comment from remove_unneeded_uops
* Change lltrace_instruction() to print caller's opcode/oparg
|
|
|
|
|
| |
Previously arbitrary errors could be cleared during formatting error
messages for ImportError or AttributeError for modules. Now all
unexpected errors are reported.
|
|
|
|
| |
* Rename _PyUopExecute to _PyUOpExecute (uppercase O) for consistency
* Also rename _PyUopName and _PyUOp_Replacements, and some output strings
|
|
|
|
| |
(#112216)
|
|
|
|
| |
This makes Windows about 3% faster on pyperformance benchmarks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the Tier 2 interpreter a little faster.
I calculated by about 3%,
though I hesitate to claim an exact number.
This starts by doubling the trace size limit (to 512),
making it more likely that loops fit in a trace.
The rest of the approach is to only load
`oparg` and `operand` in cases that use them.
The code generator know when these are used.
For `oparg`, it will conditionally emit
```
oparg = CURRENT_OPARG();
```
at the top of the case block.
(The `oparg` variable may be referenced multiple times
by the instructions code block, so it must be in a variable.)
For `operand`, it will use `CURRENT_OPERAND()` directly
instead of referencing the `operand` variable,
which no longer exists.
(There is only one place where this will be used.)
|
|
|
|
|
|
|
|
|
|
| |
This uses the new mechanism whereby certain uops
are replaced by others during translation,
using the `_PyUop_Replacements` table.
We further special-case the `_FOR_ITER_TIER_TWO` uop
to update the deoptimization target to point
just past the corresponding `END_FOR` opcode.
Two tiny code cleanups are also part of this PR.
|
|
|
| |
Rename Py_NOGIL to Py_GIL_DISABLED
|
|
|
|
|
| |
- Show uop name in Error/DEOPT messages
- Add target to some messages
- Expose uop_name() as _PyUopName()
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Replace jumps with deopts in tier 2
* Fewer special cases of uop names
* Add target field to uop IR
* Remove more redundant SET_IP and _CHECK_VALIDITY micro-ops
* Extend whitelist of non-escaping API functions.
|
| |
|
| |
|
|
|
|
|
|
|
| |
(#111794)
In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
|
| |
|
| |
|
|
|
|
| |
(#111459)
|
|
|
|
|
|
| |
Replace most of calls of _PyErr_WriteUnraisableMsg() and some
calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable().
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
|
|
|
|
|
| |
- There is no longer a separate Python/executor.c file.
- Conventions in Python/bytecodes.c are slightly different -- don't use `goto error`,
you must use `GOTO_ERROR(error)` (same for others like `unused_local_error`).
- The `TIER_ONE` and `TIER_TWO` symbols are only valid in the generated (.c.h) files.
- In Lib/test/support/__init__.py, `Py_C_RECURSION_LIMIT` is imported from `_testcapi`.
- On Windows, in debug mode, stack allocation grows from 8MiB to 12MiB.
- **Beware!** This changes the env vars to enable uops and their debugging
to `PYTHON_UOPS` and `PYTHON_LLTRACE`.
|
|
|
|
| |
instruction. (GH-111486)
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
`sys.monitoring.set_local_events()` (GH-108420)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove <ctype.h> in C files which don't use it; only sre.c and
_decimal.c still use it.
Remove _PY_PORT_CTYPE_UTF8_ISSUE code from pyport.h:
* Code added by commit b5047fd01948ab108edcc1b3c2c901d915814cfd
in 2004 for MacOSX and FreeBSD.
* Test removed by commit 52ddaefb6bab1a74ecffe8519c02735794ebfbe1
in 2007, since Python str type now uses locale independent
functions like Py_ISALPHA() and Py_TOLOWER() and the Unicode
database.
Modules/_sre/sre.c replaces _PY_PORT_CTYPE_UTF8_ISSUE with new
functions: sre_isalnum(), sre_tolower(), sre_toupper().
Remove unused includes:
* _localemodule.c: remove <stdio.h>.
* getargs.c: remove <float.h>.
* dynload_win.c: remove <direct.h>, it no longer calls _getcwd()
since commit fb1f68ed7cc1536482d1debd70a53c5442135fe2 (in 2001).
|
|
|
|
| |
`_POP_FRAME` op. (GH-108685)
|
|
|
|
|
|
| |
Change generated by the command:
sed -i -e 's!_PyLong_AsInt!PyLong_AsInt!g' \
$(find -name "*.c" -o -name "*.h")
|
|
|
|
| |
(#108367)
|
|
|
|
| |
arguments (#107969)
|
|
|
|
| |
This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
|
|
|
|
| |
performance. (GH-108036)
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Split `CALL_PY_EXACT_ARGS` into uops
This is only the first step for doing `CALL` in Tier 2.
The next step involves tracing into the called code object and back.
After that we'll have to do the remaining `CALL` specialization.
Finally we'll have to deal with `KW_NAMES`.
Note: this moves setting `frame->return_offset` directly in front of
`DISPATCH_INLINED()`, to make it easier to move it into `_PUSH_FRAME`.
|
| |
|
|
|
|
|
|
|
|
|
| |
- The `dump_stack()` method could call a `__repr__` method implemented in Python,
causing (infinite) recursion.
I rewrote it to only print out the values for some fundamental types (`int`, `str`, etc.);
for everything else it just prints `<type_name @ 0xdeadbeef>`.
- The lltrace-like feature for uops wrote to `stderr`, while the one in `ceval.c` writes to `stdout`;
I changed the uops to write to stdout as well.
|
| |
|