path: root/Python/pystate.c
Commit message | Author | Age | Files | Lines
* gh-115999: Enable specialization of `CALL` instructions in free-threaded builds (#127123) | mpage | 2024-12-03 | 1 | -0/+2
| The CALL family of instructions was mostly thread-safe already and only required a small number of changes, which are documented below.
|
| A few changes were needed to make CALL_ALLOC_AND_ENTER_INIT thread-safe:
| - Added _PyType_LookupRefAndVersion, which returns the type version corresponding to the returned ref.
| - Added _PyType_CacheInitForSpecialization, which takes an init method and the corresponding type version and only populates the specialization cache if the current type version matches the supplied version. This prevents potentially caching a stale value in free-threaded builds if we race with an update to __init__.
| - Only cache __init__ functions that are deferred in free-threaded builds. This ensures that the reference to __init__ that is stored in the specialization cache is valid if the type version guard in _CHECK_AND_ALLOCATE_OBJECT passes.
| - Fix a bug in _CREATE_INIT_FRAME where the frame is pushed to the stack on failure.
|
| A few other miscellaneous changes were also needed:
| - Use {LOCK,UNLOCK}_OBJECT in LIST_APPEND. This ensures that the list's per-object lock is held while we are appending to it.
| - Add missing co_tlbc for _Py_InitCleanup.
| - Stop/start the world around setting the eval frame hook. This allows us to read interp->eval_frame non-atomically and preserves the behavior of _CHECK_PEP_523 documented below.
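The version-guarded caching described for _PyType_CacheInitForSpecialization can be illustrated with a minimal Python sketch. This is not the CPython implementation: the `TypeState` class, its lock, and every method name below are hypothetical stand-ins, assumed only for illustration. The point it shows is that a value looked up together with a version is cached only if the version is still current.

```python
import threading


class TypeState:
    """Hypothetical stand-in for a type object carrying a version tag."""

    def __init__(self, init_func):
        self._lock = threading.Lock()
        self.version = 1
        self.init_func = init_func
        self.cached_init = None

    def lookup_init_and_version(self):
        # Return the current __init__ together with the version it belongs to.
        with self._lock:
            return self.init_func, self.version

    def set_init(self, new_init):
        # Changing __init__ bumps the version and drops any cached value.
        with self._lock:
            self.init_func = new_init
            self.version += 1
            self.cached_init = None

    def cache_init_for_specialization(self, init_func, version):
        # Populate the cache only if no update raced with the lookup.
        with self._lock:
            if self.version != version:
                return False
            self.cached_init = init_func
            return True


def old_init(self):
    pass


def new_init(self):
    pass


t = TypeState(old_init)
init, version = t.lookup_init_and_version()
t.set_init(new_init)  # simulates another thread updating __init__
stale_cached = t.cache_init_for_specialization(init, version)

init2, version2 = t.lookup_init_and_version()
fresh_cached = t.cache_init_for_specialization(init2, version2)
```

In this sketch the first caching attempt is rejected because the version moved between lookup and caching, while the second one succeeds.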
* gh-109746: Make _thread.start_new_thread delete state of new thread on its startup failure (GH-109761) | Radislav Chugunov | 2024-11-22 | 1 | -1/+3
| If Python fails to start a newly created thread because the underlying PyThread_start_new_thread() call fails, the thread's state should be removed from the interpreter's thread states list to avoid cleaning it up twice.
|
| Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-114940: Add _Py_FOR_EACH_TSTATE_UNLOCKED(), and Friends (gh-127077) | Eric Snow | 2024-11-21 | 1 | -52/+46
| This is a precursor to the actual fix for gh-114940, where we will change these macros to use the new lock. This change is almost entirely mechanical; the exceptions are the loops in codeobject.c and ceval.c, which now hold the "head" lock. Note that almost all of the uses of _Py_FOR_EACH_TSTATE_UNLOCKED() here will change to _Py_FOR_EACH_TSTATE_BEGIN() once we add the new per-interpreter lock.
* gh-121058: Warn if `PyThreadState_Clear` is called with an exception set (gh-121343) | Peter Bierma | 2024-11-20 | 1 | -0/+5
* gh-126914: Store the Preallocated Thread State's Pointer in a PyInterpreterState Field (gh-126989) | Eric Snow | 2024-11-19 | 1 | -47/+46
| This approach eliminates the originally reported race. It also gets rid of the deadlock reported in gh-96071, so we can remove the workaround added then.
* gh-126986: Drop _PyInterpreterState_FailIfNotRunning() (gh-126988) | Eric Snow | 2024-11-19 | 1 | -12/+8
| We replace it with _PyErr_SetInterpreterAlreadyRunning().
* gh-76785: Minor Cleanup of "Cross-interpreter" Code (gh-126457) | Eric Snow | 2024-11-07 | 1 | -1/+1
| The primary objective here is to allow some later changes to be cleaner. Mostly this involves renaming things and moving a few things around.
|
| * CrossInterpreterData -> XIData
| * crossinterpdatafunc -> xidatafunc
| * split out pycore_crossinterp_data_registry.h
| * add _PyXIData_lookup_t
* gh-115999: Implement thread-local bytecode and enable specialization for `BINARY_OP` (#123926) | mpage | 2024-11-04 | 1 | -0/+10
| Each thread specializes a thread-local copy of the bytecode, created on the first RESUME, in free-threaded builds. All copies of the bytecode for a code object are stored in the co_tlbc array on the code object. Each thread reserves a globally unique index identifying its copy of the bytecode in all co_tlbc arrays at thread creation and releases the index at thread destruction. The first entry in every co_tlbc array always points to the "main" copy of the bytecode that is stored at the end of the code object. This ensures that no bytecode is copied for programs that do not use threads.
|
| Thread-local bytecode can be disabled at runtime by providing either -X tlbc=0 or PYTHON_TLBC=0. Disabling thread-local bytecode also disables specialization.
|
| Concurrent modifications to the bytecode made by the specializing interpreter and instrumentation use atomics, with specialization taking care not to overwrite an instruction that was instrumented concurrently.
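The index reservation scheme described above can be sketched in Python. This is a simplified model, not the CPython code: `TLBCIndexPool`, `CodeObjectSketch`, the reuse of the smallest free index, and the lazy copy in `bytecode_for` are all assumptions made for illustration. It shows a thread reserving a global index, every code object keeping its copies in a `co_tlbc` list, and entry 0 aliasing the main bytecode so a single-threaded program copies nothing.

```python
import heapq
import threading


class TLBCIndexPool:
    """Hypothetical pool of globally unique per-thread indices."""

    def __init__(self):
        self._lock = threading.Lock()
        self._free = []   # min-heap of released indices (assumed reuse policy)
        self._next = 0    # next never-used index

    def reserve(self):
        # Called at thread creation.
        with self._lock:
            if self._free:
                return heapq.heappop(self._free)
            index = self._next
            self._next += 1
            return index

    def release(self, index):
        # Called at thread destruction.
        with self._lock:
            heapq.heappush(self._free, index)


class CodeObjectSketch:
    """Hypothetical code object holding all copies of its bytecode."""

    def __init__(self, bytecode):
        # Entry 0 is the main copy itself, so it is never duplicated.
        self.co_tlbc = [bytecode]

    def bytecode_for(self, index):
        # Grow the array on demand and copy the main bytecode on first use.
        while len(self.co_tlbc) <= index:
            self.co_tlbc.append(None)
        if self.co_tlbc[index] is None:
            self.co_tlbc[index] = bytearray(self.co_tlbc[0])
        return self.co_tlbc[index]


pool = TLBCIndexPool()
main_index = pool.reserve()      # first thread
worker_index = pool.reserve()    # second thread

code = CodeObjectSketch(bytearray(b"\x01\x02\x03"))
main_copy = code.bytecode_for(main_index)
worker_copy = code.bytecode_for(worker_index)

pool.release(worker_index)
reused_index = pool.reserve()
```

Here the first thread works on the main bytecode directly, while the second thread gets its own mutable copy that it could specialize independently.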
* gh-125286: Share the Main Refchain With Legacy Interpreters (gh-125709) | Eric Snow | 2024-10-23 | 1 | -4/+2
| They used to be shared, before 3.12. Returning to sharing them resolves a failure on Py_TRACE_REFS builds.
|
| Co-authored-by: Petr Viktorin <encukou@gmail.com>
* gh-125604: Move _Py_AuditHookEntry, etc. Out of pycore_runtime.h (gh-125605) | Eric Snow | 2024-10-18 | 1 | -1/+1
| This is essentially a cleanup, moving a handful of API declarations to the header files where they fit best, creating new ones when needed.
|
| We do the following:
|
| * add pycore_debug_offsets.h and move _Py_DebugOffsets, etc. there
| * inline struct _getargs_runtime_state and struct _gilstate_runtime_state in _PyRuntimeState
| * move struct _reftracer_runtime_state to the existing pycore_object_state.h
| * add pycore_audit.h and move to it _Py_AuditHookEntry, _PySys_Audit(), and _PySys_ClearAuditHooks
| * add audit.h and cpython/audit.h and move the existing audit-related API there
| * move the perfmap/trampoline API from cpython/sysmodule.h to cpython/ceval.h, and remove the now-empty cpython/sysmodule.h
* gh-124218: Use per-thread refcounts for code objects (#125216) | Sam Gross | 2024-10-15 | 1 | -1/+1
| Use per-thread refcounting for the reference from function objects to their corresponding code object. This can be a source of contention when frequently creating nested functions. Deferred refcounting alone isn't a great fit here because these references are on the heap and may be modified by other libraries.
* gh-111924: use atomics for interp id refcounting (#125321) | Kumar Aditya | 2024-10-12 | 1 | -48/+6
* gh-116750: Add clear_tool_id function to unregister events and callbacks (#124568) | Tian Gao | 2024-10-01 | 1 | -0/+1
* gh-124218: Refactor per-thread reference counting (#124844) | Sam Gross | 2024-10-01 | 1 | -3/+3
| Currently, we only use per-thread reference counting for heap type objects and the naming reflects that. We will extend it to a few additional types in an upcoming change to avoid scaling bottlenecks when creating nested functions.
|
| Rename some of the files and functions in preparation for this change.
* GH-123516: Improve JIT memory consumption by invalidating cold executors (GH-124443) | Savannah Ostrowski | 2024-09-27 | 1 | -0/+1
| Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
* gh-119333: Add C api to have contextvar enter/exit callbacks (#119335) | Jason Fried | 2024-09-24 | 1 | -0/+5
| Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
* Remove comment from pystate created in 2003 (#123259) | Anthony Shaw | 2024-08-24 | 1 | -5/+0
* Add debug offsets for free threaded builds (#123041) | Pablo Galindo Salgado | 2024-08-15 | 1 | -1/+3
* gh-122697: Fix free-threading memory leaks at shutdown (#122703) | Sam Gross | 2024-08-08 | 1 | -1/+1
| We were not properly accounting for interpreter memory leaks at shutdown and had two sources of leaks:
|
| * Objects that use deferred reference counting and were reachable via static types outlive the final GC. We now disable deferred reference counting on all objects if we are calling the GC due to interpreter shutdown.
| * `_PyMem_FreeDelayed` did not properly check for interpreter shutdown so we had some memory blocks that were enqueued to be freed, but never actually freed.
| * `_PyType_FinalizeIdPool` wasn't called at interpreter shutdown.
* gh-122417: Implement per-thread heap type refcounts (#122418) | Sam Gross | 2024-08-06 | 1 | -7/+6
| The free-threaded build partially stores heap type reference counts in a distributed manner in per-thread arrays. This avoids reference count contention when creating or destroying instances.
|
| Co-authored-by: Ken Jin <kenjin@python.org>
* gh-100240: Use a consistent implementation for freelists (#121934) | Sam Gross | 2024-07-22 | 1 | -2/+2
| This combines and updates our freelist handling to use a consistent implementation. Objects in the freelist are linked together using the first word of the memory block.
|
| If configured with freelists disabled, these operations are essentially no-ops.
* gh-120973: Fix thread-safety issues with `threading.local` (#121655) | mpage | 2024-07-19 | 1 | -0/+3
| This is a small refactoring to the current design that allows us to avoid manually iterating over threads.
|
| This should also fix gh-118490.
* gh-121621: Move asyncio_running_loop to private struct (#121939) | Sam Gross | 2024-07-17 | 1 | -2/+2
| This avoids changing the ABI and keeps the field in the private struct.
* gh-121621: Move asyncio running loop to thread state (GH-121695) | Ken Jin | 2024-07-16 | 1 | -0/+4
* gh-120838: Add _PyThreadState_WHENCE_FINI (gh-121010) | Eric Snow | 2024-06-25 | 1 | -5/+9
| We also add _PyThreadState_NewBound() and drop _PyThreadState_SetWhence().
|
| This change only affects internal API.
* gh-120726: Fix compiler warnings on is_core_module() (#120727) | Kirill Podoprigora | 2024-06-19 | 1 | -3/+4
| Fix compiler warnings on is_core_module() and check_interpreter_whence(): only define them when assertions are built.
* gh-117657: Fix race involving GC and heap initialization (#119923) | Sam Gross | 2024-06-04 | 1 | -0/+2
| The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()` so it's not synchronized with the stop the world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized.
|
| Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called).
|
| The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`.
* gh-117657: Fix race involving immortalizing objects (#119927) | Sam Gross | 2024-06-03 | 1 | -3/+1
| The free-threaded build currently immortalizes objects that use deferred reference counting (see gh-117783). This typically happens once the first non-main thread is created, but the behavior can be suppressed for tests, in subinterpreters, or during a compile() call.
|
| This fixes a race condition involving the tracking of whether the behavior is suppressed.
* gh-119369: Fix deadlock during thread exit in free-threaded build (#119528) | Sam Gross | 2024-05-31 | 1 | -9/+12
| Release the GIL before calling `_Py_qsbr_unregister`.
|
| The deadlock could occur when the GIL was enabled at runtime. The `_Py_qsbr_unregister` call might block while holding the GIL because the thread state was not active, but the GIL was still held.
* gh-119585: Fix crash involving `PyGILState_Release()` and `PyThreadState_Clear()` (#119753) | Sam Gross | 2024-05-31 | 1 | -0/+6
| Make sure that `gilstate_counter` is not zero when calling `PyThreadState_Clear()`. A destructor called from `PyThreadState_Clear()` may call back into `PyGILState_Ensure()` and `PyGILState_Release()`. If `gilstate_counter` is zero, it will try to create a new thread state before the current active thread state is destroyed, leading to an assertion failure or crash.
* gh-118727: Don't drop the GIL in `drop_gil()` unless the current thread holds it (#118745) | Brett Simmers | 2024-05-23 | 1 | -7/+4
| `drop_gil()` assumes that its caller is attached, which means that the current thread holds the GIL if and only if the GIL is enabled, and the enabled-state of the GIL won't change. This isn't true, though, because `detach_thread()` calls `_PyEval_ReleaseLock()` after detaching and `_PyThreadState_DeleteCurrent()` calls it after removing the current thread from consideration for stop-the-world requests (effectively detaching it).
|
| Fix this by remembering whether or not a thread acquired the GIL when it last attached, in `PyThreadState._status.holds_gil`, and check this in `drop_gil()` instead of `gil->enabled`.
|
| This fixes a crash in `test_multiprocessing_pool_circular_import()`, so I've reenabled it.
* gh-117657: Fix QSBR race condition (#118843) | Alex Turner | 2024-05-10 | 1 | -1/+1
| `_Py_qsbr_unregister` is called when the PyThreadState is already detached, so the access to `tstate->qsbr` isn't safe without locking the shared mutex. Grab the `struct _qsbr_shared` from the interpreter instead.
* gh-117657: Fix data races reported by TSAN on `interp->threads.main` (#118865) | mpage | 2024-05-10 | 1 | -11/+20
| Use relaxed loads/stores when reading/writing to this field.
* gh-116322: Enable the GIL while loading C extension modules (#118560) | Brett Simmers | 2024-05-07 | 1 | -8/+25
| Add the ability to enable/disable the GIL at runtime, and use that in the C module loading code.
|
| We can't know before running a module init function if it supports free-threading, so the GIL is temporarily enabled before doing so. If the module declares support for running without the GIL, the GIL is later disabled. Otherwise, the GIL is permanently enabled, and will never be disabled again for the life of the current interpreter.
* gh-112075: use per-thread dict version pool (#118676) | Dino Viehland | 2024-05-07 | 1 | -0/+1
| use thread state set of dict versions
* gh-118527: Intern code consts in free-threaded build (#118667) | Sam Gross | 2024-05-07 | 1 | -0/+1
| We already intern and immortalize most string constants. In the free-threaded build, other constants can be a source of reference count contention because they are shared by all threads running the same code objects.
* gh-116738: Make `_codecs` module thread-safe (#117530) | Brett Simmers | 2024-05-02 | 1 | -3/+1
| The module itself is a thin wrapper around calls to functions in `Python/codecs.c`, so that's where the meaningful changes happened:
|
| - Move codecs-related state that lives on `PyInterpreterState` to a struct declared in `pycore_codecs.h`.
| - In free-threaded builds, add a mutex to `codecs_state` to synchronize operations on `search_path`. Because `search_path_mutex` is used as a normal mutex and not a critical section, we must be extremely careful with operations called while holding it.
| - The codec registry is explicitly initialized as part of `_PyUnicode_InitEncodings` to simplify thread-safety.
* gh-118335: Configure Tier 2 interpreter at build time (#118339) | Guido van Rossum | 2024-05-01 | 1 | -0/+6
| The code for Tier 2 is now only compiled when configured with `--enable-experimental-jit[=yes|interpreter]`.
|
| We drop support for `PYTHON_UOPS` and `-Xuops`, but you can disable the interpreter or JIT at runtime by setting `PYTHON_JIT=0`. You can also build it without enabling it by default using `--enable-experimental-jit=yes-off`; enable with `PYTHON_JIT=1`.
|
| On Windows, the `build.bat` script supports `--experimental-jit`, `--experimental-jit-off`, `--experimental-interpreter`.
|
| In the C code, `_Py_JIT` is defined as before when the JIT is enabled; the new variable `_Py_TIER2` is defined when the JIT *or* the interpreter is enabled. It is actually a bitmask: 1: JIT; 2: default-off; 4: interpreter.
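The `_Py_TIER2` bitmask layout stated above (1: JIT; 2: default-off; 4: interpreter) can be decoded as in the following Python sketch. The `Tier2Flags` enum and the `describe_tier2` helper are hypothetical names used only for illustration; the commit message does not say which configure option produces which value, so the example value is arbitrary.

```python
from enum import IntFlag


class Tier2Flags(IntFlag):
    """Bit meanings as listed in the commit message."""
    JIT = 1
    DEFAULT_OFF = 2
    INTERPRETER = 4


def describe_tier2(value):
    # Split a _Py_TIER2-style integer into its individual bits.
    flags = Tier2Flags(value)
    return {
        "jit": Tier2Flags.JIT in flags,
        "default_off": Tier2Flags.DEFAULT_OFF in flags,
        "interpreter": Tier2Flags.INTERPRETER in flags,
    }


# Arbitrary example: a value with the JIT and default-off bits set.
example = describe_tier2(Tier2Flags.JIT | Tier2Flags.DEFAULT_OFF)
```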
* gh-118332: Fix deadlock involving stop the world (#118412) | Sam Gross | 2024-04-30 | 1 | -1/+2
| Avoid detaching thread state when stopping the world. When re-attaching the thread state, the thread would attempt to resume the top-most critical section, which might now be held by a thread paused for our stop-the-world request.
* gh-117783: Immortalize objects that use deferred reference counting (#118112) | Sam Gross | 2024-04-29 | 1 | -0/+11
| Deferred reference counting is not fully implemented yet. As a temporary measure, we immortalize objects that would use deferred reference counting to avoid multi-threaded scaling bottlenecks.
|
| This is only performed in the free-threaded build once the first non-main thread is started. Additionally, some tests, including refleak tests, suppress this behavior.
* gh-117657: Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state` (#118165) | mpage | 2024-04-23 | 1 | -1/+1
| Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state`
* gh-116818: Make `sys.settrace`, `sys.setprofile`, and monitoring thread-safe (#116775) | Dino Viehland | 2024-04-19 | 1 | -0/+1
| Makes sys.settrace, sys.setprofile, and monitoring generally thread-safe. Mostly uses a stop-the-world approach and synchronization around the code object's _co_instrumentation_version. There may be a little bit of extra synchronization around the monitoring data that's required to be TSAN clean.
* GH-117760: Streamline the trashcan mechanism (GH-117763) | Mark Shannon | 2024-04-17 | 1 | -0/+2
* gh-117657: Quiet more TSAN warnings due to incorrect modeling of compare/exchange (#117830) | mpage | 2024-04-15 | 1 | -2/+2
* gh-117657: Quiet TSAN warning about a data race between `start_the_world()` and `tstate_try_attach()` (#117828) | mpage | 2024-04-15 | 1 | -1/+2
| TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int` on `tstate->state` in `tstate_try_attach()` and the non-atomic load of `tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails, but TSAN erroneously treats it as a store.
* gh-76785: Handle Legacy Interpreters Properly (gh-117490) | Eric Snow | 2024-04-11 | 1 | -0/+7
| This is similar to the situation with threading._DummyThread. The methods (incl. __del__()) of interpreters.Interpreter objects must be careful with interpreters not created by interpreters.create(). The simplest thing to start with is to disable any method that modifies or runs in the interpreter. As part of this, the runtime keeps track of where an interpreter was created. We also handle interpreter "refcounts" properly.
* gh-76785: Add More Tests to test_interpreters.test_api (gh-117662) | Eric Snow | 2024-04-11 | 1 | -2/+53
| In addition to the increased test coverage, this is a precursor to sorting out how we handle interpreters created directly via the C-API.
* gh-117439: Make refleak checking thread-safe without the GIL (#117469) | Sam Gross | 2024-04-08 | 1 | -0/+8
| This keeps track of the per-thread total reference count operations in PyThreadState in the free-threaded builds. The count is merged into the interpreter's total when the thread exits.
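The merge-on-exit accounting described above can be modelled with a short Python sketch. The class names `InterpreterTotals` and `ThreadRefTotal`, and the explicit lock, are assumptions for illustration and do not mirror the C structures. Each thread updates only its own counter without synchronization, and the shared total is touched once, when the thread exits.

```python
import threading


class InterpreterTotals:
    """Hypothetical interpreter-wide reference count total."""

    def __init__(self):
        self.lock = threading.Lock()
        self.reftotal = 0


class ThreadRefTotal:
    """Hypothetical per-thread counter, written only by its own thread."""

    def __init__(self, interp):
        self.interp = interp
        self.reftotal = 0

    def incref(self):
        self.reftotal += 1

    def decref(self):
        self.reftotal -= 1

    def merge_on_exit(self):
        # The only point where the shared total is modified.
        with self.interp.lock:
            self.interp.reftotal += self.reftotal
        self.reftotal = 0


def worker(interp):
    tstate = ThreadRefTotal(interp)
    for _ in range(1000):
        tstate.incref()
    for _ in range(400):
        tstate.decref()
    tstate.merge_on_exit()


interp = InterpreterTotals()
threads = [threading.Thread(target=worker, args=(interp,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all four workers exit, the interpreter total reflects the net of every thread's operations.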
* gh-111926: Make weakrefs thread-safe in free-threaded builds (#117168) | mpage | 2024-04-08 | 1 | -0/+9
| Most mutable data is protected by a striped lock that is keyed on the referenced object's address. The weakref's hash is protected using the weakref's per-object lock.
|
| Note that this only affects free-threaded builds. Apart from some minor refactoring, the added code is all either gated by `ifdef`s or is a no-op (e.g. `Py_BEGIN_CRITICAL_SECTION`).
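A striped lock keyed on an object's address can be sketched in Python as follows. The stripe count, the shift applied to `id(obj)`, and the `Referent` class are assumptions chosen for illustration, not values taken from CPython. The idea shown is that a fixed array of locks is shared by all objects, and each object always maps to the same lock.

```python
import threading

NUM_STRIPES = 16  # assumed stripe count, for illustration only
_stripes = [threading.Lock() for _ in range(NUM_STRIPES)]


def stripe_for(obj):
    # In CPython, id() is the object's address; drop the low alignment bits
    # (an assumed shift) so neighbouring objects spread across stripes.
    return _stripes[(id(obj) >> 4) % NUM_STRIPES]


class Referent:
    """Hypothetical referenced object with mutable per-object data."""

    def __init__(self):
        self.refs = []


def add_ref(obj, item):
    # All mutation of obj's data happens under the stripe chosen by address.
    with stripe_for(obj):
        obj.refs.append(item)


target = Referent()


def worker(n):
    for i in range(100):
        add_ref(target, (n, i))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Compared with one lock per object, striping keeps the memory cost fixed, at the price of occasional contention between unrelated objects that hash to the same stripe.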
* gh-76785: Raise InterpreterError, Not RuntimeError (gh-117489) | Eric Snow | 2024-04-03 | 1 | -1/+1
| I had meant to switch everything to InterpreterError when I added it a while back. At the time I missed a few key spots.
|
| As part of this, I've added print-the-exception to _PyXI_InitTypes() and fixed an error case in `_PyStaticType_InitBuiltin()`.