From bedd2c2d880783dfdab357ff285e4f308db6a65b Mon Sep 17 00:00:00 2001 From: Antoine Pitrou Date: Sat, 15 Jan 2011 12:54:19 +0000 Subject: Reword and restructure the GIL API doc --- Doc/c-api/init.rst | 308 +++++++++++++++++++++++++++-------------------------- 1 file changed, 156 insertions(+), 152 deletions(-) diff --git a/Doc/c-api/init.rst b/Doc/c-api/init.rst index 8d793a4..f920909 100644 --- a/Doc/c-api/init.rst +++ b/Doc/c-api/init.rst @@ -366,48 +366,47 @@ Thread State and the Global Interpreter Lock single: lock, interpreter The Python interpreter is not fully thread-safe. In order to support -multi-threaded Python programs, there's a global lock, called the :dfn:`global -interpreter lock` or :dfn:`GIL`, that must be held by the current thread before +multi-threaded Python programs, there's a global lock, called the :term:`global +interpreter lock` or :term:`GIL`, that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice. -.. index:: single: setcheckinterval() (in module sys) +.. index:: single: setswitchinterval() (in module sys) -Therefore, the rule exists that only the thread that has acquired the global -interpreter lock may operate on Python objects or call Python/C API functions. -In order to support multi-threaded Python programs, the interpreter regularly -releases and reacquires the lock --- by default, every 100 bytecode instructions -(this can be changed with :func:`sys.setcheckinterval`). The lock is also -released and reacquired around potentially blocking I/O operations like reading -or writing a file, so that other threads can run while the thread that requests -the I/O is waiting for the I/O operation to complete. 
+Therefore, the rule exists that only the thread that has acquired the +:term:`GIL` may operate on Python objects or call Python/C API functions. +In order to emulate concurrency of execution, the interpreter regularly +tries to switch threads (see :func:`sys.setswitchinterval`). The lock is also +released around potentially blocking I/O operations like reading or writing +a file, so that other Python threads can run in the meantime. .. index:: single: PyThreadState single: PyThreadState -The Python interpreter needs to keep some bookkeeping information separate per -thread --- for this it uses a data structure called :c:type:`PyThreadState`. -There's one global variable, however: the pointer to the current -:c:type:`PyThreadState` structure. Before the addition of :dfn:`thread-local -storage` (:dfn:`TLS`) the current thread state had to be manipulated -explicitly. +The Python interpreter keeps some thread-specific bookkeeping information +inside a data structure called :c:type:`PyThreadState`. There's also one +global variable pointing to the current :c:type:`PyThreadState`: it can +be retrieved using :c:func:`PyThreadState_Get`. + +Releasing the GIL from extension code +------------------------------------- -This is easy enough in most cases. Most code manipulating the global -interpreter lock has the following simple structure:: +Most extension code manipulating the :term:`GIL` has the following simple +structure:: Save the thread state in a local variable. Release the global interpreter lock. - ...Do some blocking I/O operation... + ... Do some blocking I/O operation ... Reacquire the global interpreter lock. Restore the thread state from the local variable. This is so common that a pair of macros exists to simplify it:: Py_BEGIN_ALLOW_THREADS - ...Do some blocking I/O operation... + ... Do some blocking I/O operation ... Py_END_ALLOW_THREADS .. 
index:: @@ -416,9 +415,8 @@ This is so common that a pair of macros exists to simplify it:: The :c:macro:`Py_BEGIN_ALLOW_THREADS` macro opens a new block and declares a hidden local variable; the :c:macro:`Py_END_ALLOW_THREADS` macro closes the -block. Another advantage of using these two macros is that when Python is -compiled without thread support, they are defined empty, thus saving the thread -state and GIL manipulations. +block. These two macros are still available when Python is compiled without +thread support (they simply have an empty expansion). When thread support is enabled, the block above expands to the following code:: @@ -428,65 +426,60 @@ When thread support is enabled, the block above expands to the following code:: ...Do some blocking I/O operation... PyEval_RestoreThread(_save); -Using even lower level primitives, we can get roughly the same effect as -follows:: - - PyThreadState *_save; - - _save = PyThreadState_Swap(NULL); - PyEval_ReleaseLock(); - ...Do some blocking I/O operation... - PyEval_AcquireLock(); - PyThreadState_Swap(_save); - .. index:: single: PyEval_RestoreThread() - single: errno single: PyEval_SaveThread() - single: PyEval_ReleaseLock() - single: PyEval_AcquireLock() - -There are some subtle differences; in particular, :c:func:`PyEval_RestoreThread` -saves and restores the value of the global variable :c:data:`errno`, since the -lock manipulation does not guarantee that :c:data:`errno` is left alone. Also, -when thread support is disabled, :c:func:`PyEval_SaveThread` and -:c:func:`PyEval_RestoreThread` don't manipulate the GIL; in this case, -:c:func:`PyEval_ReleaseLock` and :c:func:`PyEval_AcquireLock` are not available. -This is done so that dynamically loaded extensions compiled with thread support -enabled can be loaded by an interpreter that was compiled with disabled thread -support. - -The global interpreter lock is used to protect the pointer to the current thread -state. 
When releasing the lock and saving the thread state, the current thread -state pointer must be retrieved before the lock is released (since another -thread could immediately acquire the lock and store its own thread state in the -global variable). Conversely, when acquiring the lock and restoring the thread -state, the lock must be acquired before storing the thread state pointer. - -It is important to note that when threads are created from C, they don't have -the global interpreter lock, nor is there a thread state data structure for -them. Such threads must bootstrap themselves into existence, by first -creating a thread state data structure, then acquiring the lock, and finally -storing their thread state pointer, before they can start using the Python/C -API. When they are done, they should reset the thread state pointer, release -the lock, and finally free their thread state data structure. - -Threads can take advantage of the :c:func:`PyGILState_\*` functions to do all of -the above automatically. The typical idiom for calling into Python from a C -thread is now:: + +Here is how these functions work: the global interpreter lock is used to protect the pointer to the +current thread state. When releasing the lock and saving the thread state, +the current thread state pointer must be retrieved before the lock is released +(since another thread could immediately acquire the lock and store its own thread +state in the global variable). Conversely, when acquiring the lock and restoring +the thread state, the lock must be acquired before storing the thread state +pointer. + +.. note:: + Calling system I/O functions is the most common use case for releasing + the GIL, but it can also be useful before calling long-running computations + which don't need access to Python objects, such as compression or + cryptographic functions operating over memory buffers. 
For example, the + standard :mod:`zlib` and :mod:`hashlib` modules release the GIL when + compressing or hashing data. + +Non-Python created threads +-------------------------- + +When threads are created using the dedicated Python APIs (such as the +:mod:`threading` module), a thread state is automatically associated with them +and the code shown above is therefore correct. However, when threads are +created from C (for example by a third-party library with its own thread +management), they don't hold the GIL, nor is there a thread state structure +for them. + +If you need to call Python code from these threads (often this will be part +of a callback API provided by the aforementioned third-party library), +you must first register these threads with the interpreter by +creating a thread state data structure, then acquiring the GIL, and finally +storing their thread state pointer, before you can start using the Python/C +API. When you are done, you should reset the thread state pointer, release +the GIL, and finally free the thread state data structure. + +The :c:func:`PyGILState_Ensure` and :c:func:`PyGILState_Release` functions do +all of the above automatically. The typical idiom for calling into Python +from a C thread is:: PyGILState_STATE gstate; gstate = PyGILState_Ensure(); - /* Perform Python actions here. */ + /* Perform Python actions here. */ result = CallSomeFunction(); - /* evaluate result */ + /* evaluate result or handle exception */ /* Release the thread. No Python API allowed beyond this point. */ PyGILState_Release(gstate); Note that the :c:func:`PyGILState_\*` functions assume there is only one global -interpreter (created automatically by :c:func:`Py_Initialize`). Python still +interpreter (created automatically by :c:func:`Py_Initialize`). Python supports the creation of additional interpreters (using :c:func:`Py_NewInterpreter`), but mixing multiple interpreters and the :c:func:`PyGILState_\*` API is unsupported. 
@@ -509,6 +502,12 @@ being held by a thread that is defunct after the fork. always able to. +High-level API +-------------- + +These are the most commonly used types and functions when writing C extension +code, or when embedding the Python interpreter: + .. c:type:: PyInterpreterState This data structure represents the state shared by a number of cooperating @@ -550,21 +549,22 @@ always able to. .. index:: module: _thread - When only the main thread exists, no GIL operations are needed. This is a - common situation (most Python programs do not use threads), and the lock - operations slow the interpreter down a bit. Therefore, the lock is not - created initially. This situation is equivalent to having acquired the lock: - when there is only a single thread, all object accesses are safe. Therefore, - when this function initializes the global interpreter lock, it also acquires - it. Before the Python :mod:`_thread` module creates a new thread, knowing - that either it has the lock or the lock hasn't been created yet, it calls - :c:func:`PyEval_InitThreads`. When this call returns, it is guaranteed that - the lock has been created and that the calling thread has acquired it. + .. note:: + When only the main thread exists, no GIL operations are needed. This is a + common situation (most Python programs do not use threads), and the lock + operations slow the interpreter down a bit. Therefore, the lock is not + created initially. This situation is equivalent to having acquired the lock: + when there is only a single thread, all object accesses are safe. Therefore, + when this function initializes the global interpreter lock, it also acquires + it. Before the Python :mod:`_thread` module creates a new thread, knowing + that either it has the lock or the lock hasn't been created yet, it calls + :c:func:`PyEval_InitThreads`. When this call returns, it is guaranteed that + the lock has been created and that the calling thread has acquired it. 
- It is **not** safe to call this function when it is unknown which thread (if - any) currently has the global interpreter lock. + It is **not** safe to call this function when it is unknown which thread (if + any) currently has the global interpreter lock. - This function is not available when thread support is disabled at compile time. + This function is not available when thread support is disabled at compile time. .. c:function:: int PyEval_ThreadsInitialized() @@ -575,37 +575,6 @@ always able to. not available when thread support is disabled at compile time. -.. c:function:: void PyEval_AcquireLock() - - Acquire the global interpreter lock. The lock must have been created earlier. - If this thread already has the lock, a deadlock ensues. This function is not - available when thread support is disabled at compile time. - - -.. c:function:: void PyEval_ReleaseLock() - - Release the global interpreter lock. The lock must have been created earlier. - This function is not available when thread support is disabled at compile time. - - -.. c:function:: void PyEval_AcquireThread(PyThreadState *tstate) - - Acquire the global interpreter lock and set the current thread state to - *tstate*, which should not be *NULL*. The lock must have been created earlier. - If this thread already has the lock, deadlock ensues. This function is not - available when thread support is disabled at compile time. - - -.. c:function:: void PyEval_ReleaseThread(PyThreadState *tstate) - - Reset the current thread state to *NULL* and release the global interpreter - lock. The lock must have been created earlier and must be held by the current - thread. The *tstate* argument, which must not be *NULL*, is only used to check - that it represents the current thread state --- if it isn't, a fatal error is - reported. This function is not available when thread support is disabled at - compile time. - - .. 
c:function:: PyThreadState* PyEval_SaveThread() Release the global interpreter lock (if it has been created and thread @@ -624,6 +593,20 @@ always able to. when thread support is disabled at compile time.) +.. c:function:: PyThreadState* PyThreadState_Get() + + Return the current thread state. The global interpreter lock must be held. + When the current thread state is *NULL*, this issues a fatal error (so that + the caller needn't check for *NULL*). + + +.. c:function:: PyThreadState* PyThreadState_Swap(PyThreadState *tstate) + + Swap the current thread state with the thread state given by the argument + *tstate*, which may be *NULL*. The global interpreter lock must be held + and is not released. + + .. c:function:: void PyEval_ReInitThreads() This function is called from :c:func:`PyOS_AfterFork` to ensure that newly @@ -631,6 +614,43 @@ always able to. are not running in the child process. +The following functions use thread-local storage, and are not compatible +with sub-interpreters: + +.. c:function:: PyGILState_STATE PyGILState_Ensure() + + Ensure that the current thread is ready to call the Python C API regardless + of the current state of Python, or of the global interpreter lock. This may + be called as many times as desired by a thread as long as each call is + matched with a call to :c:func:`PyGILState_Release`. In general, other + thread-related APIs may be used between :c:func:`PyGILState_Ensure` and + :c:func:`PyGILState_Release` calls as long as the thread state is restored to + its previous state before the Release(). For example, normal usage of the + :c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS` macros is + acceptable. + + The return value is an opaque "handle" to the thread state when + :c:func:`PyGILState_Ensure` was called, and must be passed to + :c:func:`PyGILState_Release` to ensure Python is left in the same state. 
Even + though recursive calls are allowed, these handles *cannot* be shared - each + unique call to :c:func:`PyGILState_Ensure` must save the handle for its call + to :c:func:`PyGILState_Release`. + + When the function returns, the current thread will hold the GIL and be able + to call arbitrary Python code. Failure is a fatal error. + + +.. c:function:: void PyGILState_Release(PyGILState_STATE) + + Release any resources previously acquired. After this call, Python's state will + be the same as it was prior to the corresponding :c:func:`PyGILState_Ensure` call + (but generally this state will be unknown to the caller, hence the use of the + GILState API). + + Every call to :c:func:`PyGILState_Ensure` must be matched by a call to + :c:func:`PyGILState_Release` on the same thread. + + The following macros are normally used without a trailing semicolon; look for example usage in the Python source distribution. @@ -664,6 +684,10 @@ example usage in the Python source distribution. :c:macro:`Py_BEGIN_ALLOW_THREADS` without the opening brace and variable declaration. It is a no-op when thread support is disabled at compile time. + +Low-level API +------------- + All of the following functions are only available when thread support is enabled at compile time, and must be called only when the global interpreter lock has been created. @@ -709,19 +733,6 @@ been created. :c:func:`PyThreadState_Clear`. -.. c:function:: PyThreadState* PyThreadState_Get() - - Return the current thread state. The global interpreter lock must be held. - When the current thread state is *NULL*, this issues a fatal error (so that - the caller needn't check for *NULL*). - - -.. c:function:: PyThreadState* PyThreadState_Swap(PyThreadState *tstate) - - Swap the current thread state with the thread state given by the argument - *tstate*, which may be *NULL*. The global interpreter lock must be held. - - .. 
c:function:: PyObject* PyThreadState_GetDict() Return a dictionary in which extensions can store thread-specific state @@ -742,38 +753,31 @@ been created. exception (if any) for the thread is cleared. This raises no exceptions. -.. c:function:: PyGILState_STATE PyGILState_Ensure() +.. c:function:: void PyEval_AcquireThread(PyThreadState *tstate) - Ensure that the current thread is ready to call the Python C API regardless - of the current state of Python, or of the global interpreter lock. This may - be called as many times as desired by a thread as long as each call is - matched with a call to :c:func:`PyGILState_Release`. In general, other - thread-related APIs may be used between :c:func:`PyGILState_Ensure` and - :c:func:`PyGILState_Release` calls as long as the thread state is restored to - its previous state before the Release(). For example, normal usage of the - :c:macro:`Py_BEGIN_ALLOW_THREADS` and :c:macro:`Py_END_ALLOW_THREADS` macros is - acceptable. + Acquire the global interpreter lock and set the current thread state to + *tstate*, which should not be *NULL*. The lock must have been created earlier. + If this thread already has the lock, deadlock ensues. - The return value is an opaque "handle" to the thread state when - :c:func:`PyGILState_Ensure` was called, and must be passed to - :c:func:`PyGILState_Release` to ensure Python is left in the same state. Even - though recursive calls are allowed, these handles *cannot* be shared - each - unique call to :c:func:`PyGILState_Ensure` must save the handle for its call - to :c:func:`PyGILState_Release`. - When the function returns, the current thread will hold the GIL. Failure is a - fatal error. +.. c:function:: void PyEval_ReleaseThread(PyThreadState *tstate) + Reset the current thread state to *NULL* and release the global interpreter + lock. The lock must have been created earlier and must be held by the current + thread. 
The *tstate* argument, which must not be *NULL*, is only used to check + that it represents the current thread state --- if it isn't, a fatal error is + reported. -.. c:function:: void PyGILState_Release(PyGILState_STATE) - Release any resources previously acquired. After this call, Python's state will - be the same as it was prior to the corresponding :c:func:`PyGILState_Ensure` call - (but generally this state will be unknown to the caller, hence the use of the - GILState API.) +.. c:function:: void PyEval_AcquireLock() - Every call to :c:func:`PyGILState_Ensure` must be matched by a call to - :c:func:`PyGILState_Release` on the same thread. + Acquire the global interpreter lock. The lock must have been created earlier. + If this thread already has the lock, a deadlock ensues. + + +.. c:function:: void PyEval_ReleaseLock() + + Release the global interpreter lock. The lock must have been created earlier. Sub-interpreter support -- cgit v0.12