summaryrefslogtreecommitdiffstats
path: root/Python/pystate.c
Commit message (Collapse)AuthorAgeFilesLines
* gh-116522: Refactor `_PyThreadState_DeleteExcept` (#117131)Sam Gross2024-03-211-15/+24
| | | | | | | | | | | Split `_PyThreadState_DeleteExcept` into two functions: - `_PyThreadState_RemoveExcept` removes all thread states other than one passed as an argument. It returns the removed thread states as a linked list. - `_PyThreadState_DeleteList` deletes those dead thread states. It may call destructors, so we want to "start the world" before calling `_PyThreadState_DeleteList` to avoid potential deadlocks.
* gh-76785: Drop PyInterpreterID_Type (gh-117101)Eric Snow2024-03-211-5/+0
| | | I added it quite a while ago as a strategy for managing interpreter lifetimes relative to the PEP 554 (now 734) implementation. Relatively recently I refactored that implementation to no longer rely on InterpreterID objects. Thus now I'm removing it.
* gh-105716: Update interp->threads.main After Fork (gh-117049)Eric Snow2024-03-211-0/+35
| | | | | I missed this in gh-109921. We also update Py_Exit() to call _PyInterpreterState_SetNotRunningMain(), if necessary.
* gh-76785: Clean Up Interpreter ID Conversions (gh-117048)Eric Snow2024-03-211-24/+79
| | | Mostly we unify the two different implementations of the conversion code (from PyObject * to int64_t. We also drop the PyArg_ParseTuple()-style converter function, as well as rename and move PyInterpreterID_LookUp().
* gh-116522: Stop the world before fork() and during shutdown (#116607)Sam Gross2024-03-211-0/+6
| | | | | | | | | | | This changes the free-threaded build to perform a stop-the-world pause before deleting other thread states when forking and during shutdown. This fixes some crashes when using multiprocessing and during shutdown when running with `PYTHON_GIL=0`. This also changes `PyOS_BeforeFork` to acquire the runtime lock (i.e., `HEAD_LOCK(&_PyRuntime)`) before forking to ensure that data protected by the runtime lock (and not just the GIL or stop-the-world) is in a consistent state before forking.
* gh-116916: Remove separate next_func_version counter (#116918)Guido van Rossum2024-03-181-1/+0
| | | | | Somehow we ended up with two separate counter variables tracking "the next function version". Most likely this was a historical accident where an old branch was updated incorrectly. This PR merges the two counters into a single one: `interp->func_state.next_version`.
* gh-114271: Fix race in `Thread.join()` (#114839)mpage2024-03-161-24/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()` and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following execution involving threads A, B, and C: 1. A starts. 2. B joins A, blocking on its `_tstate_lock`. 3. C joins A, blocking on its `_tstate_lock`. 4. A finishes and releases its `_tstate_lock`. 5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped out before calling `_stop()`. 6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped out before releasing it. 7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held. However, C holds it, so the assertion fails. The race can be reproduced[^3] by inserting sleeps at the appropriate points in the threading code. To do so, run the `repro_join_race.py` from the linked repo. There are two main parts to this PR: 1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`. The event is set by the runtime prior to the thread being cleared (in the same place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the event to be set. 2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all non-daemon threads to exit. To do so, an `is_daemon` predicate was added to `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()` now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on `_tstate_lock`s. [^1]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201 [^2]: https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115 [^3]: https://github.com/mpage/cpython/commit/81946532792f938cd6f6ab4c4ff92a4edf61314f --------- Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Antoine Pitrou <antoine@python.org>
* gh-116515: Clear thread-local state before tstate_delete_common() (#116517)Sam Gross2024-03-111-1/+2
| | | | | | | This moves `current_fast_clear()` up so that the current thread state is `NULL` while running `tstate_delete_common()`. This doesn't fix any bugs, but it means that we are more consistent that `_PyThreadState_GET() != NULL` means that the thread is "attached".
* gh-116396: Pass "detached_state" argument to tstate_set_detached (#116398)Sam Gross2024-03-071-6/+6
| | | | The stop-the-world code was incorrectly setting suspended threads' states to _Py_THREAD_DETACHED instead of _Py_THREAD_SUSPENDED.
* gh-115103: Delay reuse of mimalloc pages that store PyObjects (#115435)Sam Gross2024-03-061-0/+7
| | | | | | | | | | | | | | | | | | This implements the delayed reuse of mimalloc pages that contain Python objects in the free-threaded build. Allocations of the same size class are grouped in data structures called pages. These are different from operating system pages. For thread-safety, we want to ensure that memory used to store PyObjects remains valid as long as there may be concurrent lock-free readers; we want to delay using it for other size classes, in other heaps, or returning it to the operating system. When a mimalloc page becomes empty, instead of immediately freeing it, we tag it with a QSBR goal and insert it into a per-thread state linked list of pages to be freed. When mimalloc needs a fresh page, we process the queue and free any still empty pages that are now deemed safe to be freed. Pages waiting to be freed are still available for allocations of the same size class and allocating from a page prevent it from being freed. There is additional logic to handle abandoned pages when threads exit.
* gh-115832: Fix instrumentation version mismatch during interpreter shutdown ↵Brett Simmers2024-03-041-0/+3
| | | | | | | | | | | | | (#115856) A previous commit introduced a bug to `interpreter_clear()`: it set `interp->ceval.instrumentation_version` to 0, without making the corresponding change to `tstate->eval_breaker` (which holds a thread-local copy of the version). After this happens, Python code can still run due to object finalizers during a GC, and the version check in bytecodes.c will see a different result than the one in instrumentation.c causing an infinite loop. The fix itself is straightforward: clear `tstate->eval_breaker` when clearing `interp->ceval.instrumentation_version`.
* gh-116012: Preserve GetLastError() across calls to TlsGetValue on Windows ↵Steve Dower2024-02-281-9/+0
| | | | (GH-116014)
* gh-115168: Add pystats counter for invalidated executors (GH-115169)Michael Droettboom2024-02-261-1/+1
|
* gh-115103: Implement delayed free mechanism for free-threaded builds (#115367)Sam Gross2024-02-201-0/+6
| | | | | | This adds `_PyMem_FreeDelayed()` and supporting functions. The `_PyMem_FreeDelayed()` function frees memory with the same allocator as `PyMem_Free()`, but after some delay to ensure that concurrent lock-free readers have finished.
* gh-115491: Keep some fields valid across allocations (free-threading) (#115573)Sam Gross2024-02-201-0/+15
| | | | | This avoids filling the memory occupied by ob_tid, ob_ref_local, and ob_ref_shared with debug bytes (e.g., 0xDD) in mimalloc in the free-threaded build.
* gh-110850: Replace _PyTime_t with PyTime_t (#115719)Victor Stinner2024-02-201-1/+1
| | | | | Run command: sed -i -e 's!\<_PyTime_t\>!PyTime_t!g' $(find -name "*.c" -o -name "*.h")
* gh-112175: Add `eval_breaker` to `PyThreadState` (#115194)Brett Simmers2024-02-201-9/+7
| | | | | | | | | | | This change adds an `eval_breaker` field to `PyThreadState`. The primary motivation is for performance in free-threaded builds: with thread-local eval breakers, we can stop a specific thread (e.g., for an async exception) without interrupting other threads. The source of truth for the global instrumentation version is stored in the `instrumentation_version` field in PyInterpreterState. Threads usually read the version from their local `eval_breaker`, where it continues to be colocated with the eval breaker bits.
* GH-112354: Initial implementation of warm up on exits and trace-stitching ↵Mark Shannon2024-02-201-0/+2
| | | | (GH-114142)
* gh-115103: Implement delayed memory reclamation (QSBR) (#115180)Sam Gross2024-02-161-0/+30
| | | | | | This adds a safe memory reclamation scheme based on FreeBSD's "GUS" and quiescent state based reclamation (QSBR). The API provides a mechanism for callers to detect when it is safe to free memory that may be concurrently accessed by readers.
* gh-113743: Use per-interpreter locks for types (#115541)Dino Viehland2024-02-161-1/+1
| | | Move type-lock to per-interpreter lock to avoid heavy contention in interpreters test
* gh-113743: Make the MRO cache thread-safe in free-threaded builds (#113930)Dino Viehland2024-02-151-0/+3
| | | | | | | Makes _PyType_Lookup thread safe, including: Thread safety of the underlying cache. Make mutation of mro and type members thread safe Also _PyType_GetMRO and _PyType_GetBases are currently returning borrowed references which aren't safe.
* gh-115482: Assume the Main Interpreter is Always Running "main" (gh-115484)Eric Snow2024-02-141-1/+8
| | | | | This is a temporary fix to unblock embedders that do not call Py_Main(). _PyInterpreterState_IsRunningMain() will always return true for the main interpreter, even in corner cases where it technically should not. The (future) full solution will do the right thing in those corner cases.
* gh-111968: Rename freelist related struct names to Eric's suggestion (gh-115329)Donghee Na2024-02-141-2/+2
|
* GH-113710: Backedge counter improvements. (GH-115166)Mark Shannon2024-02-131-7/+3
|
* gh-111968: Refactor _PyXXX_Fini to integrate with _PyObject_ClearFreeLists ↵Donghee Na2024-02-101-17/+2
| | | | (gh-114899)
* gh-110481: Implement inter-thread queue for biased reference counting (#114824)Sam Gross2024-02-091-0/+11
| | | | | | | | | Biased reference counting maintains two refcount fields in each object: `ob_ref_local` and `ob_ref_shared`. The true refcount is the sum of these two fields. In some cases, when refcounting operations are split across threads, the ob_ref_shared field can be negative (although the total refcount must be at least zero). In this case, the thread that decremented the refcount requests that the owning thread give up ownership and merge the refcount fields.
* gh-115035: Mark ThreadHandles as non-joinable earlier after forking (#115042)Sam Gross2024-02-061-0/+2
| | | | | | This marks dead ThreadHandles as non-joinable earlier in `PyOS_AfterFork_Child()` before we execute any Python code. The handles are stored in a global linked list in `_PyRuntimeState` because `fork()` affects the entire process.
* gh-104530: Enable native Win32 condition variables by default (GH-104531)Andrew Rogers2024-02-021-1/+11
|
* gh-111968: Use per-thread freelists for dict in free-threading (gh-114323)Donghee Na2024-02-011-0/+3
|
* gh-103323: Remove current_fast_get() unused parameter (#114593)Victor Stinner2024-01-301-26/+24
| | | | The current_fast_get() static inline function doesn't use its 'runtime' parameter, so just remove it.
* gh-113055: Use pointer for interp->obmalloc state (gh-113412)Neil Schemenauer2024-01-271-8/+6
| | | | | | | | | For interpreters that share state with the main interpreter, this points to the same static memory structure. For interpreters with their own obmalloc state, it is heap allocated. Add free_obmalloc_arenas() which will free the obmalloc arenas and radix tree structures for interpreters with their own obmalloc state. Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
* gh-112529: Implement GC for free-threaded builds (#114262)Sam Gross2024-01-251-1/+3
| | | | | | | * gh-112529: Implement GC for free-threaded builds This implements a mark and sweep GC for the free-threaded builds of CPython. The implementation relies on mimalloc to find GC tracked objects (i.e., "containers").
* gh-114312: Collect stats for unlikely events (GH-114493)Michael Droettboom2024-01-251-0/+1
|
* GH-113710: Add a tier 2 peephole optimization pass. (GH-114487)Mark Shannon2024-01-241-3/+7
| | | | | * Convert _LOAD_CONST to inline versions * Remove PEP 523 checks
* gh-111964: Implement stop-the-world pauses (gh-112471)Sam Gross2024-01-231-14/+255
| | | | | | | | | | | | | | | | | The `--disable-gil` builds occasionally need to pause all but one thread. Some examples include: * Cyclic garbage collection, where this is often called a "stop the world event" * Before calling `fork()`, to ensure a consistent state for internal data structures * During interpreter shutdown, to ensure that daemon threads aren't accessing Python objects This adds the following functions to implement global and per-interpreter pauses: * `_PyEval_StopTheWorldAll()` and `_PyEval_StartTheWorldAll()` (for the global runtime) * `_PyEval_StopTheWorld()` and `_PyEval_StartTheWorld()` (per-interpreter) (The function names may change.) These functions are no-ops outside of the `--disable-gil` build.
* gh-111968: Use per-thread freelists for generator in free-threading (gh-114189)Donghee Na2024-01-181-1/+2
|
* gh-111968: Use per-thread freelists for PyContext in free-threading (gh-114122)Donghee Na2024-01-161-0/+1
|
* gh-111968: Use per-thread slice_cache in free-threading (gh-113972)Donghee Na2024-01-151-0/+1
|
* gh-111968: Use per-thread freelists for tuple in free-threading (gh-113921)Donghee Na2024-01-111-0/+1
|
* gh-111968: Use per-thread freelists for float in free-threading (gh-113886)Donghee Na2024-01-101-0/+1
|
* gh-111968: Introduce _PyFreeListState and _PyFreeListState_GET API (gh-113584)Donghee Na2024-01-091-0/+11
|
* gh-112532: Tag mimalloc heaps and pages (#113742)Sam Gross2024-01-051-2/+2
| | | | | | | | | | | | | | | | | | | | | * gh-112532: Tag mimalloc heaps and pages Mimalloc pages are data structures that contain contiguous allocations of the same block size. Note that they are distinct from operating system pages. Mimalloc pages are contained in segments. When a thread exits, it abandons any segments and contained pages that have live allocations. These segments and pages may be later reclaimed by another thread. To support GC and certain thread-safety guarantees in free-threaded builds, we want pages to only be reclaimed by the corresponding heap in the claimant thread. For example, we want pages containing GC objects to only be claimed by GC heaps. This allows heaps and pages to be tagged with an integer tag that is used to ensure that abandoned pages are only claimed by heaps with the same tag. Heaps can be initialized with a tag (0-15); any page allocated by that heap copies the corresponding tag. * Fix conversion warning
* gh-112532: Isolate abandoned segments by interpreter (#113717)Sam Gross2024-01-041-0/+5
| | | | | | | | | | | | | | | * gh-112532: Isolate abandoned segments by interpreter Mimalloc segments are data structures that contain memory allocations along with metadata. Each segment is "owned" by a thread. When a thread exits, it abandons its segments to a global pool to be later reclaimed by other threads. This changes the pool to be per-interpreter instead of process-wide. This will be important for when we use mimalloc to find GC objects in the `--disable-gil` builds. We want heaps to only store Python objects from a single interpreter. Absent this change, the abandoning and reclaiming process could break this isolation. * Add missing '&_mi_abandoned_default' to 'tld_empty'
* gh-112532: Use separate mimalloc heaps for GC objects (gh-113263)Sam Gross2023-12-261-0/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | * gh-112532: Use separate mimalloc heaps for GC objects In `--disable-gil` builds, we now use four separate heaps in anticipation of using mimalloc to find GC objects when the GIL is disabled. To support this, we also make a few changes to mimalloc: * `mi_heap_t` and `mi_tld_t` initialization is split from allocation. This allows us to have a `mi_tld_t` per-`PyThreadState`, which is important to keep interpreter isolation, since the same OS thread may run in multiple interpreters (using different PyThreadStates.) * Heap abandoning (mi_heap_collect_ex) can now be called from a different thread than the one that created the heap. This is necessary because we may clear and delete the containing PyThreadStates from a different thread during finalization and after fork(). * Use enum instead of defines and guard mimalloc includes. * The enum typedef will be convenient for future PRs that use the type. * Guarding the mimalloc includes allows us to unconditionally include pycore_mimalloc.h from other header files that rely on things like `struct _mimalloc_thread_state`. * Only define _mimalloc_thread_state in Py_GIL_DISABLED builds
* gh-112535: Implement fallback implementation of _Py_ThreadId() (gh-113185)Donghee Na2023-12-181-0/+14
| | | | | --------- Co-authored-by: Sam Gross <colesbury@gmail.com>
* gh-112723: Call `PyThreadState_Clear()` from the correct interpreter (#112776)Sam Gross2023-12-131-4/+3
| | | | | | | | | | | | | The `PyThreadState_Clear()` function must only be called with the GIL held and must be called from the same interpreter as the passed in thread state. Otherwise, any Python objects on the thread state may be destroyed using the wrong interpreter, leading to memory corruption. This is also important for `Py_GIL_DISABLED` builds because free lists will be associated with PyThreadStates and cleared in `PyThreadState_Clear()`. This fixes two places that called `PyThreadState_Clear()` from the wrong interpreter and adds an assertion to `PyThreadState_Clear()`.
* gh-76785: Fixes for test.support.interpreters (gh-112982)Eric Snow2023-12-121-1/+1
| | | This involves a number of changes for PEP 734.
* gh-111924: Use PyMutex for Runtime-global Locks. (gh-112207)Sam Gross2023-12-071-107/+14
| | | | | This replaces some usages of PyThread_type_lock with PyMutex, which does not require memory allocation to initialize. This simplifies some of the runtime initialization and is also one step towards avoiding changing the default raw memory allocator during initialize/finalization, which can be non-thread-safe in some circumstances.
* gh-112538: Add internal-only _PyThreadStateImpl "wrapper" for PyThreadState ↵Sam Gross2023-12-071-14/+14
| | | | | | | | (gh-112560) Every PyThreadState instance is now actually a _PyThreadStateImpl. It is safe to cast from `PyThreadState*` to `_PyThreadStateImpl*` and back. The _PyThreadStateImpl will contain fields that we do not want to expose in the public C API.
* gh-111863: Rename `Py_NOGIL` to `Py_GIL_DISABLED` (#111864)Hugo van Kemenade2023-11-201-2/+2
| | | Rename Py_NOGIL to Py_GIL_DISABLED