path: root/Objects/funcobject.c
Commit message | Author | Age | Files | Lines
* gh-139109: A new tracing JIT compiler frontend for CPython (GH-140310) | Ken Jin | 2025-11-13 | 1 | -1/+5
    This PR changes the current JIT model from trace projection to trace recording.

    Benchmarking: better pyperformance (about 1.7% overall) geomean versus current https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251108-3.15.0a1%2B-7e2bc1d-JIT/bm-20251108-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-7e2bc1d-vs-base.svg, 100% faster Richards on the most improved benchmark versus the current JIT. Slowdown of about 10-15% on the worst benchmark versus the current JIT. **Note: the fastest version isn't the one merged, as it relies on fixing bugs in the specializing interpreter, which is left to another PR**. The speedup in the merged version is about 1.1%. https://raw.githubusercontent.com/facebookexperimental/free-threading-benchmarking/refs/heads/main/results/bm-20251112-3.15.0a1%2B-f8a764a-JIT/bm-20251112-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-f8a764a-vs-base.svg

    Stats: 50% more uops executed, 30% more traces entered the last time we ran them. It also suggests our trace lengths for a real trace recording JIT are too short, as there are a lot of "trace too long" aborts https://github.com/facebookexperimental/free-threading-benchmarking/blob/main/results/bm-20251023-3.15.0a1%2B-eb73378-CLANG%2CJIT/bm-20251023-vultr-x86_64-Fidget%252dSpinner-tracing_jit-3.15.0a1%2B-eb73378-pystats-vs-base.md .

    This new JIT frontend is already able to record/execute significantly more instructions than the previous JIT frontend. In this PR, we are now able to record through custom dunders, simple object creation, generators, etc. None of these were done by the old JIT frontend. Some custom dunder uops were discovered to be broken as part of this work (gh-140277).

    The optimizer stack space check is disabled, as it's no longer valid to deal with underflow.

    Pros:
    * Ignoring the generated tracer code as it's automatically created, this is only an additional 1k lines of code. The maintenance burden is handled by the DSL and code generator.
    * `optimizer.c` is now significantly simpler, as we don't have to do strange things to recover the bytecode from a trace.
    * The new JIT frontend is able to handle a lot more control-flow than the old one.
    * Tracing is very low overhead. We use the tail calling interpreter/computed goto interpreter to switch between tracing mode and non-tracing mode. I call this mechanism dual dispatch, as we have two dispatch tables dispatching to each other. Specialization is still enabled while tracing.
    * Better handling of polymorphism. We leverage the specializing interpreter for this.

    Cons:
    * (For now) requires tail calling interpreter or computed gotos. This means no Windows JIT for now :(. Not to fret, tail calling is coming soon to Windows though https://github.com/python/cpython/pull/139962

    Design:
    * After each instruction, the `record_previous_inst` function/label is executed. This does as the name suggests.
    * The tracing interpreter lowers bytecode to uops directly so that it can obtain "fresh" values at the point of lowering.
    * The tracing version behaves nearly identically to the normal interpreter; in fact, it even has specialization! This allows it to run without much of a slowdown when tracing. The actual cost of tracing is only a function call and writes to memory.
    * The tracing interpreter uses the specializing interpreter's deopt to naturally form the side exit chains. This allows it to side exit chain effectively, without repeating much code. We force re-specialization when tracing a deopt.
    * The tracing interpreter can even handle goto errors/exceptions, but I chose to disable them for now as it's not tested.
    * Because we do not share interpreter dispatch, there should be no significant slowdown to the original specializing interpreter on tailcall and computed gotos with JIT disabled. With JIT enabled, there might be a slowdown in the form of the JIT trying to trace.
    * Things that could have dynamic instruction pointer effects are guarded on. The guard deopts to a new instruction --- `_DYNAMIC_EXIT`.
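The "dual dispatch" idea described in the Design notes above can be illustrated with a toy model. This is only a sketch, not CPython's implementation: the three-opcode stack machine, the `run` function, and the `trace_from` and `max_trace` parameters are all hypothetical names invented for the illustration. It shows two handler tables, where the tracing table does the normal work and then records the instruction that just ran, and where the interpreter switches back to the normal table once the trace is full.

```python
def run(program, trace_from=None, max_trace=8):
    """Run a tiny stack program, optionally recording a trace.

    Toy model only: opcode names and parameters are made up for this sketch.
    """
    stack, trace = [], []

    def op_push(arg):
        stack.append(arg)

    def op_add(arg):
        stack.append(stack.pop() + stack.pop())

    def op_mul(arg):
        stack.append(stack.pop() * stack.pop())

    # Dispatch table used when not tracing.
    normal = {"PUSH": op_push, "ADD": op_add, "MUL": op_mul}

    def recording(name, handler):
        def wrapper(arg):
            handler(arg)               # same work as the normal handler
            trace.append((name, arg))  # then record the previous instruction
        return wrapper

    # Second dispatch table: same behavior plus recording.
    tracing = {name: recording(name, h) for name, h in normal.items()}

    table = normal
    for pc, (name, arg) in enumerate(program):
        if pc == trace_from:
            table = tracing            # enter tracing mode
        table[name](arg)
        if table is tracing and len(trace) >= max_trace:
            table = normal             # trace is full: leave tracing mode
    return stack, trace
```

Running the same program with and without `trace_from` gives the same final stack; only the recorded trace differs, which is the property the commit message relies on when it says the tracing interpreter behaves like the normal one.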
* gh-139924: Add PyFunction_PYFUNC_EVENT_MODIFY_QUALNAME event for function watchers (#139925) | Dino Viehland | 2025-10-10 | 1 | -0/+2
    Add PyFunction_PYFUNC_EVENT_MODIFY_QUALNAME event for function watchers
* gh-130821: Add type information to error messages for invalid return type (GH-130835) | Semyon Moroz | 2025-08-14 | 1 | -2/+3
* gh-135607: remove null checking of weakref list in dealloc of extension modules and objects (#135614) | Xuanteng Huang | 2025-06-30 | 1 | -3/+2
    Co-authored-by: Kumar Aditya <kumaraditya@python.org>
    Co-authored-by: Victor Stinner <vstinner@python.org>
* gh-135755: Move `PyFunction_GET_BUILTINS` to the private API (GH-135938) | Peter Bierma | 2025-06-26 | 1 | -1/+1
* gh-132775: Fix _PyFunction_VerifyStateless() (#134900) | Eric Snow | 2025-05-29 | 1 | -14/+20
    The problem we're fixing here is that we were using PyDict_Size() on "defaults", which is actually a tuple. We're also adding some explicit type checks.

    This is a follow-up to gh-133221/gh-133528.
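The tuple-versus-dict distinction behind this fix is visible from Python as well. The sketch below is only an illustration at the Python level, not the C code that was changed; `greet` and `count_defaults` are hypothetical names. It shows that positional defaults are stored in a tuple and keyword-only defaults in a dict, so each needs the size operation that matches its container.

```python
def greet(name, greeting="hello", *, punctuation="!"):
    return f"{greeting}, {name}{punctuation}"


def count_defaults(func):
    """Return (number of positional defaults, number of keyword-only defaults)."""
    positional = func.__defaults__ or ()      # a tuple, or None when absent
    keyword_only = func.__kwdefaults__ or {}  # a dict, or None when absent
    return len(positional), len(keyword_only)
```

Either attribute may be `None` when the function declares no defaults of that kind, which is why the helper substitutes an empty container before measuring.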
* gh-132775: Unrevert "Add _PyCode_VerifyStateless()" (gh-133528) | Eric Snow | 2025-05-08 | 1 | -0/+54
    This reverts commit 3c73cf5 (gh-133497), which itself reverted the original commit d270bb5 (gh-133221).

    We reverted the original change due to failing android tests. The checks in _PyCode_CheckNoInternalState() were too strict, so we've relaxed them.
* gh-132775: Revert "gh-132775: Add _PyCode_VerifyStateless() (gh-133221)" (#133497) | Petr Viktorin | 2025-05-06 | 1 | -54/+0
* gh-132775: Add _PyCode_VerifyStateless() (gh-133221) | Eric Snow | 2025-05-05 | 1 | -0/+54
    "Stateless" code is a function or code object which does not rely on external state or internal state. It may rely on arguments and builtins, but not globals or a closure. I've left a comment in pycore_code.h that provides more detail.

    We also add _PyFunction_VerifyStateless(). The new functions will be used in several later changes that facilitate "sharing" functions and code objects between interpreters.
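The definition of "stateless" above can be approximated in pure Python. The following is a rough sketch under stated assumptions, not the logic of `_PyCode_VerifyStateless()` or `_PyFunction_VerifyStateless()`: `looks_stateless` is a hypothetical helper, it inspects only the function's own bytecode (not nested code objects), and it ignores the "internal state" checks the commit mentions. It rejects closures and any global access that does not resolve to a builtin name.

```python
import builtins
import dis


def looks_stateless(func):
    """Approximate check: no closure, and globals limited to builtin names."""
    code = func.__code__
    if code.co_freevars or func.__closure__:
        return False  # relies on a closure
    for instr in dis.get_instructions(code):
        if instr.opname in ("STORE_GLOBAL", "DELETE_GLOBAL"):
            return False  # mutates module-level state
        if instr.opname == "LOAD_GLOBAL" and not hasattr(builtins, instr.argval):
            return False  # reads a non-builtin global
    return True
```

A function such as `lambda x: len(x) + 1` passes, because `len` is a builtin, while a function that reads a module-level variable or captures an enclosing local does not.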
* gh-132457: make staticmethod and classmethod generic (#132460) | Ivan Kirpichnikov | 2025-05-04 | 1 | -2/+12
    Co-authored-by: sobolevn <mail@sobolevn.me>
* gh-131238: Remove pycore_object_deferred.h from pycore_object.h (#131549) | Victor Stinner | 2025-03-21 | 1 | -6/+6
    Remove also pycore_function.h from pycore_typeobject.h.
* GH-131238: More refactoring of core header files (GH-131351) | Mark Shannon | 2025-03-17 | 1 | -0/+1
    Adds new pycore_stats.h header file to help break dependencies involving the pycore_code.h header.
* gh-128714: Fix function object races in `__annotate__`, `__annotations__` and `__type_params__` in free-threading build (#129016) | Xuanteng Huang | 2025-02-06 | 1 | -44/+87
* gh-127274: Defer nested methods (#128012) | mpage | 2024-12-19 | 1 | -1/+5
    Methods (functions defined in class scope) are likely to be cleaned up by the GC anyway.

    Add a new code flag, `CO_METHOD`, that is set for functions defined in a class scope. Use that when deciding to defer functions.
* gh-127582: Make object resurrection thread-safe for free threading. (GH-127612) | Sam Gross | 2024-12-05 | 1 | -5/+2
    Objects may be temporarily "resurrected" in destructors when calling finalizers or watcher callbacks. We previously undid the resurrection by decrementing the reference count using `Py_SET_REFCNT`. This was not thread-safe because other threads might be accessing the object (modifying its reference count) if it was exposed by the finalizer, watcher callback, or temporarily accessed by a racy dictionary or list access.

    This adds internal-only thread-safe functions for temporary object resurrection during destructors.
* gh-115999: Specialize `LOAD_GLOBAL` in free-threaded builds (#126607) | mpage | 2024-11-21 | 1 | -0/+2
    Enable specialization of LOAD_GLOBAL in free-threaded builds.

    Thread-safety of specialization in free-threaded builds is provided by the following:
    - A critical section is held on both the globals and builtins objects during specialization. This ensures we get an atomic view of both builtins and globals during specialization.
    - Generation of new keys versions is made atomic in free-threaded builds.
    - Existing helpers are used to atomically modify the opcode.

    Thread-safety of specialized instructions in free-threaded builds is provided by the following:
    - Relaxed atomics are used when loading and storing dict keys versions. This avoids potential data races as the dict keys versions are read without holding the dictionary's per-object lock in version guards.
    - Dicts keys objects are passed from keys version guards to the downstream uops. This ensures that we are loading from the correct offset in the keys object. Once a unicode key has been stored in a keys object for a combined dictionary in free-threaded builds, the offset that it is stored in will never be reused for a different key. Once the version guard passes, we know that we are reading from the correct offset.
    - The dictionary read fast-path is used to read values from the dictionary once we know the correct offset.
* gh-126072: do not add `None` to `co_consts` if there is no docstring (GH-126101) | Xuanteng Huang | 2024-10-30 | 1 | -1/+2
* gh-124218: Avoid refcount contention on builtins module (GH-125847) | Sam Gross | 2024-10-24 | 1 | -22/+3
    This replaces `_PyEval_BuiltinsFromGlobals` with `_PyDict_LoadBuiltinsFromGlobals`, which returns a new reference instead of a borrowed reference. Internally, the new function uses per-thread reference counting when possible to avoid contention on the refcount fields on the builtins module.
* gh-124218: Use per-thread reference counting for globals and builtins (#125713) | Sam Gross | 2024-10-21 | 1 | -6/+32
    Use per-thread refcounting for the reference from function objects to the globals and builtins dictionaries.
* gh-125017: Fix refleak from GH-125636 (GH-125664) | Zachary Ware | 2024-10-17 | 1 | -0/+1
* gh-125017: Fix crash on premature access to classmethod/staticmethod annotations (#125636) | Jelle Zijlstra | 2024-10-17 | 1 | -14/+27
* gh-124218: Use per-thread refcounts for code objects (#125216) | Sam Gross | 2024-10-15 | 1 | -3/+5
    Use per-thread refcounting for the reference from function objects to their corresponding code object. This can be a source of contention when frequently creating nested functions. Deferred refcounting alone isn't a great fit here because these references are on the heap and may be modified by other libraries.
* gh-115999: Stop the world when invalidating function versions (#124997) | mpage | 2024-10-08 | 1 | -33/+56
    Stop the world when invalidating function versions

    The tier1 interpreter specializes `CALL` instructions based on the values of certain function attributes (e.g. `__code__`, `__defaults__`). The tier1 interpreter uses function versions to verify that the attributes of a function during execution of a specialization match those seen during specialization. A function's version is initialized in `MAKE_FUNCTION` and is invalidated when any of the critical function attributes are changed. The tier1 interpreter stores the function version in the inline cache during specialization. A guard is used by the specialized instruction to verify that the version of the function on the operand stack matches the cached version (and therefore has all of the expected attributes). It is assumed that once the guard passes, all attributes will remain unchanged while executing the rest of the specialized instruction.

    Stopping the world when invalidating function versions ensures that all critical function attributes will remain unchanged after the function version guard passes in free-threaded builds. It's important to note that this is only true if the remainder of the specialized instruction does not enter and exit a stop-the-world point.

    We will stop the world the first time any of the following function attributes are mutated:
    - defaults
    - vectorcall
    - kwdefaults
    - closure
    - code

    This should happen rarely and only happens once per function, so the performance impact on the majority of code should be minimal.

    Additionally, refactor the API for manipulating function versions to more clearly match the stated semantics.
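The version-guard scheme described above can be modeled in a few lines of Python. This is a toy, single-threaded sketch, not the interpreter's implementation: `VersionedFunction` and `CallSite` are hypothetical classes, only the `code` attribute is treated as critical, and the stop-the-world step that makes invalidation safe in free-threaded builds is not modeled at all.

```python
class VersionedFunction:
    """Toy stand-in for a function object that carries a version tag."""

    _next_version = 1

    def __init__(self, code):
        self._code = code
        self.version = VersionedFunction._next_version
        VersionedFunction._next_version += 1

    @property
    def code(self):
        return self._code

    @code.setter
    def code(self, new_code):
        self._code = new_code
        self.version = 0  # mutating a critical attribute invalidates the version


class CallSite:
    """Toy call instruction with a one-entry inline cache."""

    def __init__(self):
        self.cached_version = 0
        self.specialized_hits = 0
        self.generic_calls = 0

    def call(self, func, *args):
        # Guard: a nonzero version equal to the cached one means the
        # attributes seen at specialization time are still in place.
        if func.version != 0 and func.version == self.cached_version:
            self.specialized_hits += 1
        else:
            self.generic_calls += 1
            self.cached_version = func.version  # version 0 never matches again
        return func.code(*args)
```

After `func.code` is reassigned, every later call at that site falls back to the generic path, which mirrors the statement that invalidation happens once per function.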
* gh-111178: Fix function signatures in funcobject.c (#124908) | Victor Stinner | 2024-10-02 | 1 | -78/+116
* gh-122229: Add missing `Py_DECREF` in `func_get_annotation_dict` (#122230) | sobolevn | 2024-07-24 | 1 | -0/+1
* gh-119180: Lazily wrap annotations on classmethod and staticmethod (#119864) | Jelle Zijlstra | 2024-05-31 | 1 | -1/+99
* gh-119180: PEP 649: Add __annotate__ attributes (#119209) | Jelle Zijlstra | 2024-05-22 | 1 | -3/+61
* gh-117657: Disable the function/code cache in free-threaded builds (#118301) | mpage | 2024-05-03 | 1 | -0/+10
    This is only used by the specializing interpreter and the tier 2 optimizer, both of which are disabled in free-threaded builds.
* gh-117376: Partial implementation of deferred reference counting (#117696) | Sam Gross | 2024-04-12 | 1 | -0/+9
    This marks objects as using deferred reference counting using the `ob_gc_bits` field in the free-threaded build and collects those objects during GC.
* gh-117045: Add code object to function version cache (#117028) | Guido van Rossum | 2024-03-21 | 1 | -53/+90
    Changes to the function version cache:
    - In addition to the function object, also store the code object, and allow the latter to be retrieved even if the function has been evicted.
    - Stop assigning new function versions after a critical attribute (e.g. `__code__`) has been modified; the version is permanently reset to zero in this case.
    - Changes to `__annotations__` are no longer considered critical. (This fixes gh-109998.)

    Changes to the Tier 2 optimization machinery:
    - If we cannot map a function version to a function, but it is still mapped to a code object, we continue projecting the trace. The operand of the `_PUSH_FRAME` and `_POP_FRAME` opcodes can be either NULL, a function object, or a code object with the lowest bit set.

    This allows us to trace through code that calls an ephemeral function, i.e., a function that may not be alive when we are constructing the executor, e.g. a generator expression or certain nested functions. We will lose globals removal inside such functions, but we can still do other peephole operations (and even possibly [call inlining](https://github.com/python/cpython/pull/116290), if we decide to do it), which only need the code object. As before, if we cannot retrieve the code object from the cache, we stop projecting.
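The cache behavior described above, where the code object stays retrievable after the function itself is gone, can be sketched as follows. This is a toy model under stated assumptions, not CPython's data structure: `FunctionVersionCache`, its slot layout, and its use of `weakref` to stand in for eviction of a dead function are all hypothetical choices made for the illustration.

```python
import weakref


class FunctionVersionCache:
    """Toy cache mapping a function version to (function, code object)."""

    def __init__(self, size=8):
        self._slots = [None] * size

    def set(self, version, func):
        if version == 0:
            return  # version 0 means "invalidated"; never cache it
        slot = version % len(self._slots)
        # Hold the function weakly but the code object strongly, so the
        # code object outlives an ephemeral function.
        self._slots[slot] = (version, weakref.ref(func), func.__code__)

    def lookup(self, version):
        """Return (function or None, code object or None)."""
        entry = self._slots[version % len(self._slots)]
        if entry is None or entry[0] != version:
            return None, None  # empty slot, or overwritten by another version
        _, func_ref, code = entry
        return func_ref(), code
```

A caller that gets `(None, code)` back is in the situation the commit describes: the function is no longer alive, but the code object is enough to keep going with optimizations that do not need the function's globals.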
* gh-116916: Remove separate next_func_version counter (#116918) | Guido van Rossum | 2024-03-18 | 1 | -4/+4
    Somehow we ended up with two separate counter variables tracking "the next function version". Most likely this was a historical accident where an old branch was updated incorrectly. This PR merges the two counters into a single one: `interp->func_state.next_version`.
* gh-114312: Collect stats for unlikely events (GH-114493) | Michael Droettboom | 2024-01-25 | 1 | -0/+9
* gh-112640: Add `kwdefaults` parameter to `types.FunctionType.__new__` (#112641) | Nikita Sobolev | 2024-01-11 | 1 | -2/+13
* gh-111789: Use PyDict_GetItemRef() in Objects/ (GH-111827) | Serhiy Storchaka | 2023-11-14 | 1 | -5/+4
* gh-111999: Add signatures and improve docstrings for builtins (GH-112000) | Serhiy Storchaka | 2023-11-13 | 1 | -2/+4
* gh-81137: deprecate assignment of code object to a function of a mismatched type (#111823) | Irit Katriel | 2023-11-07 | 1 | -0/+14
* gh-108082: Use PyErr_FormatUnraisable() (GH-111580) | Serhiy Storchaka | 2023-11-02 | 1 | -17/+3
    Replace most calls of _PyErr_WriteUnraisableMsg() and some calls of PyErr_WriteUnraisable(NULL) with PyErr_FormatUnraisable().

    Co-authored-by: Victor Stinner <vstinner@python.org>
* gh-89519: Remove classmethod descriptor chaining, deprecated since 3.11 (gh-110163) | Raymond Hettinger | 2023-10-27 | 1 | -4/+0
* gh-110964: Remove private _PyArg functions (#110966) | Victor Stinner | 2023-10-17 | 1 | -0/+1
    Move the following private functions and structures to pycore_modsupport.h internal C API:

    * _PyArg_BadArgument()
    * _PyArg_CheckPositional()
    * _PyArg_NoKeywords()
    * _PyArg_NoPositional()
    * _PyArg_ParseStack()
    * _PyArg_ParseStackAndKeywords()
    * _PyArg_Parser structure
    * _PyArg_UnpackKeywords()
    * _PyArg_UnpackKeywordsWithVararg()
    * _PyArg_UnpackStack()
    * _Py_ANY_VARARGS()

    Changes:

    * Python/getargs.h now includes pycore_modsupport.h to export functions.
    * clinic.py now adds pycore_modsupport.h when one of these functions is used.
    * Add pycore_modsupport.h includes when a C extension uses one of these functions.
    * Define Py_BUILD_CORE_MODULE in C extensions which now include directly or indirectly (via code generated by Argument Clinic) pycore_modsupport.h:

      * _csv
      * _curses_panel
      * _dbm
      * _gdbm
      * _multiprocessing.posixshmem
      * _sqlite.row
      * _statistics
      * grp
      * resource
      * syslog

    * _testcapi: bad_get() no longer uses METH_FASTCALL calling convention but METH_VARARGS. Replace _PyArg_UnpackStack() with PyArg_ParseTuple().
    * _testcapi: add PYTESTCAPI_NEED_INTERNAL_API macro which is defined by _testcapi sub-modules which need the internal C API (pycore_modsupport.h): exceptions.c, float.c, vectorcall.c, watchers.c.
    * Remove Include/cpython/modsupport.h header file. Include/modsupport.h no longer includes the removed header file.
    * Fix mypy clinic.py
* GH-104584: Fix refleak when tracing through calls (GH-110593) | Brandt Bucher | 2023-10-10 | 1 | -1/+1
* GH-108716: Turn off deep-freezing of code objects. (GH-108722) | Mark Shannon | 2023-09-08 | 1 | -6/+3
* gh-108253: Fix reads of uninitialized memory in funcobject.c (#108383) | Guido van Rossum | 2023-08-23 | 1 | -2/+2
* gh-108253: Fix bug in func version cache (#108296) | Guido van Rossum | 2023-08-22 | 1 | -22/+22
    When a function object changed its version, a stale pointer might remain in the cache. Zap these whenever `func_version` changes (even when set to 0).
* gh-106581: Project through calls (#108067) | Guido van Rossum | 2023-08-17 | 1 | -2/+77
    This finishes the work begun in gh-107760. When, while projecting a superblock, we encounter a call to a short, simple function, the superblock will now enter the function using `_PUSH_FRAME`, continue through it, and leave it using `_POP_FRAME`, and then continue through the original code. Multiple frame pushes and pops are even possible. It is also possible to stop appending to the superblock in the middle of a called function, when running out of space or encountering an unsupported bytecode.
* GH-84436: Skip refcounting for known immortals (GH-107605) | Brandt Bucher | 2023-08-04 | 1 | -2/+2
* gh-106869: Use new PyMemberDef constant names (#106871) | Victor Stinner | 2023-07-25 | 1 | -10/+10
    * Remove '#include "structmember.h"'.
    * If needed, add <stddef.h> to get offsetof() function.
    * Update Parser/asdl_c.py to regenerate Python/Python-ast.c.
    * Replace:

      * T_SHORT => Py_T_SHORT
      * T_INT => Py_T_INT
      * T_LONG => Py_T_LONG
      * T_FLOAT => Py_T_FLOAT
      * T_DOUBLE => Py_T_DOUBLE
      * T_STRING => Py_T_STRING
      * T_OBJECT => _Py_T_OBJECT
      * T_CHAR => Py_T_CHAR
      * T_BYTE => Py_T_BYTE
      * T_UBYTE => Py_T_UBYTE
      * T_USHORT => Py_T_USHORT
      * T_UINT => Py_T_UINT
      * T_ULONG => Py_T_ULONG
      * T_STRING_INPLACE => Py_T_STRING_INPLACE
      * T_BOOL => Py_T_BOOL
      * T_OBJECT_EX => Py_T_OBJECT_EX
      * T_LONGLONG => Py_T_LONGLONG
      * T_ULONGLONG => Py_T_ULONGLONG
      * T_PYSSIZET => Py_T_PYSSIZET
      * T_NONE => _Py_T_NONE
      * READONLY => Py_READONLY
      * PY_AUDIT_READ => Py_AUDIT_READ
      * READ_RESTRICTED => Py_AUDIT_READ
      * PY_WRITE_RESTRICTED => _Py_WRITE_RESTRICTED
      * RESTRICTED => (READ_RESTRICTED | _Py_WRITE_RESTRICTED)
* gh-106521: Remove _PyObject_LookupAttr() function (GH-106642) | Serhiy Storchaka | 2023-07-12 | 1 | -1/+1
* gh-106303: Use _PyObject_LookupAttr() instead of PyObject_GetAttr() (GH-106304) | Serhiy Storchaka | 2023-07-09 | 1 | -10/+5
    It simplifies and speeds up the code.
* gh-106033: Get rid of PyDict_GetItem in _PyFunction_FromConstructor (GH-106044) | Serhiy Storchaka | 2023-06-29 | 1 | -4/+6
* gh-104600: Make function.__type_params__ writable (#104601) | Jelle Zijlstra | 2023-05-18 | 1 | -1/+16