authorEric Snow <ericsnowcurrently@gmail.com>2024-10-16 22:50:46 (GMT)
committerGitHub <noreply@github.com>2024-10-16 22:50:46 (GMT)
commita5a7f5e16d8c3938d266703ea8fba8ffee3e3ae5 (patch)
tree8f35555767598b877e490814eca9363cdf5eea2d /Doc
parenta38fef4439139743e3334c1d69f24cafdf4d71da (diff)
gh-124694: Add concurrent.futures.InterpreterPoolExecutor (gh-124548)
This is an implementation of InterpreterPoolExecutor that builds on ThreadPoolExecutor. (Note that this is not tied to PEP 734, which is strictly about adding a new stdlib module.) Possible future improvements:
* support passing a script for the initializer or to submit()
* support passing (most) arbitrary functions without pickling
* support passing closures
* optionally exec functions against __main__ instead of their original module
Diffstat (limited to 'Doc')
-rw-r--r--Doc/library/asyncio-dev.rst6
-rw-r--r--Doc/library/asyncio-eventloop.rst9
-rw-r--r--Doc/library/asyncio-llapi-index.rst2
-rw-r--r--Doc/library/concurrent.futures.rst135
-rw-r--r--Doc/whatsnew/3.14.rst8
5 files changed, 152 insertions, 8 deletions
diff --git a/Doc/library/asyncio-dev.rst b/Doc/library/asyncio-dev.rst
index a9c3a01..44b507a 100644
--- a/Doc/library/asyncio-dev.rst
+++ b/Doc/library/asyncio-dev.rst
@@ -103,7 +103,8 @@ To handle signals the event loop must be
run in the main thread.
The :meth:`loop.run_in_executor` method can be used with a
-:class:`concurrent.futures.ThreadPoolExecutor` to execute
+:class:`concurrent.futures.ThreadPoolExecutor` or
+:class:`~concurrent.futures.InterpreterPoolExecutor` to execute
blocking code in a different OS thread without blocking the OS thread
that the event loop runs in.
@@ -128,7 +129,8 @@ if a function performs a CPU-intensive calculation for 1 second,
all concurrent asyncio Tasks and IO operations would be delayed
by 1 second.
-An executor can be used to run a task in a different thread or even in
+An executor can be used to run a task in a different thread,
+including in a different interpreter, or even in
a different process to avoid blocking the OS thread with the
event loop. See the :meth:`loop.run_in_executor` method for more
details.
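The offloading pattern described in this hunk can be sketched with a :class:`ThreadPoolExecutor`; the same call shape applies to an interpreter pool on versions that provide one. The function and values here are illustrative, not from the patch:

```python
import asyncio
import concurrent.futures

def blocking_io(n):
    # Simulated blocking work; run inline, it would stall the event loop.
    return sum(range(n))

async def main():
    loop = asyncio.get_running_loop()
    # Hand the blocking call to the pool instead of calling it directly.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        return await loop.run_in_executor(pool, blocking_io, 10)

print(asyncio.run(main()))  # 45
```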
diff --git a/Doc/library/asyncio-eventloop.rst b/Doc/library/asyncio-eventloop.rst
index 943683f..14fd153 100644
--- a/Doc/library/asyncio-eventloop.rst
+++ b/Doc/library/asyncio-eventloop.rst
@@ -1305,6 +1305,12 @@ Executing code in thread or process pools
pool, cpu_bound)
print('custom process pool', result)
+ # 4. Run in a custom interpreter pool:
+ with concurrent.futures.InterpreterPoolExecutor() as pool:
+ result = await loop.run_in_executor(
+ pool, cpu_bound)
+ print('custom interpreter pool', result)
+
if __name__ == '__main__':
asyncio.run(main())
@@ -1329,7 +1335,8 @@ Executing code in thread or process pools
Set *executor* as the default executor used by :meth:`run_in_executor`.
*executor* must be an instance of
- :class:`~concurrent.futures.ThreadPoolExecutor`.
+ :class:`~concurrent.futures.ThreadPoolExecutor`, which includes
+ :class:`~concurrent.futures.InterpreterPoolExecutor`.
.. versionchanged:: 3.11
*executor* must be an instance of
diff --git a/Doc/library/asyncio-llapi-index.rst b/Doc/library/asyncio-llapi-index.rst
index 3e21054..f5af888 100644
--- a/Doc/library/asyncio-llapi-index.rst
+++ b/Doc/library/asyncio-llapi-index.rst
@@ -96,7 +96,7 @@ See also the main documentation section about the
- Invoke a callback *at* the given time.
-.. rubric:: Thread/Process Pool
+.. rubric:: Thread/Interpreter/Process Pool
.. list-table::
:widths: 50 50
:class: full-width-table
diff --git a/Doc/library/concurrent.futures.rst b/Doc/library/concurrent.futures.rst
index ce72127..45a7370 100644
--- a/Doc/library/concurrent.futures.rst
+++ b/Doc/library/concurrent.futures.rst
@@ -15,9 +15,10 @@ The :mod:`concurrent.futures` module provides a high-level interface for
asynchronously executing callables.
The asynchronous execution can be performed with threads, using
-:class:`ThreadPoolExecutor`, or separate processes, using
-:class:`ProcessPoolExecutor`. Both implement the same interface, which is
-defined by the abstract :class:`Executor` class.
+:class:`ThreadPoolExecutor` or :class:`InterpreterPoolExecutor`,
+or separate processes, using :class:`ProcessPoolExecutor`.
+Each implements the same interface, which is defined
+by the abstract :class:`Executor` class.
.. include:: ../includes/wasm-notavail.rst
@@ -63,7 +64,8 @@ Executor Objects
setting *chunksize* to a positive integer. For very long iterables,
using a large value for *chunksize* can significantly improve
performance compared to the default size of 1. With
- :class:`ThreadPoolExecutor`, *chunksize* has no effect.
+ :class:`ThreadPoolExecutor` and :class:`InterpreterPoolExecutor`,
+ *chunksize* has no effect.
.. versionchanged:: 3.5
Added the *chunksize* argument.
@@ -227,6 +229,111 @@ ThreadPoolExecutor Example
print('%r page is %d bytes' % (url, len(data)))
+InterpreterPoolExecutor
+-----------------------
+
+The :class:`InterpreterPoolExecutor` class uses a pool of interpreters
+to execute calls asynchronously. It is a :class:`ThreadPoolExecutor`
+subclass, which means each worker is running in its own thread.
+The difference here is that each worker has its own interpreter,
+and runs each task using that interpreter.
+
+The biggest benefit to using interpreters instead of only threads
+is true multi-core parallelism. Each interpreter has its own
+:term:`Global Interpreter Lock <global interpreter lock>`, so code
+running in one interpreter can run on one CPU core, while code in
+another interpreter runs unblocked on a different core.
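A minimal sketch of the executor described above. Since ``InterpreterPoolExecutor`` only exists on interpreters that ship it, this falls back to ``ThreadPoolExecutor`` elsewhere, and it maps a builtin (``pow``) because builtins pickle by reference and so work across worker interpreters:

```python
import concurrent.futures

# Fall back to ThreadPoolExecutor on versions without
# InterpreterPoolExecutor, so the sketch stays runnable anywhere.
Executor = getattr(concurrent.futures, "InterpreterPoolExecutor",
                   concurrent.futures.ThreadPoolExecutor)

# pow is importable from builtins, so it pickles by reference and can be
# rebuilt even when each worker lives in its own interpreter.
with Executor(max_workers=2) as pool:
    results = list(pool.map(pow, [2, 3, 4], [2, 2, 2]))

print(results)  # [4, 9, 16]
```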
+
+The tradeoff is that writing concurrent code for use with multiple
+interpreters can take extra effort. However, this is because it
+forces you to be deliberate about how and when interpreters interact,
+and to be explicit about what data is shared between interpreters.
+This results in several benefits that help balance the extra effort,
+including true multi-core parallelism. For example, code written
+this way can be easier to reason about. Another
+major benefit is that you don't have to deal with several of the
+big pain points of using threads, like race conditions.
+
+Each worker's interpreter is isolated from all the other interpreters.
+"Isolated" means each interpreter has its own runtime state and
+operates completely independently. For example, if you redirect
+:data:`sys.stdout` in one interpreter, it will not be automatically
+redirected in any other interpreter. If you import a module in one
+interpreter, it is not automatically imported in any other. You
+would need to import the module separately in each interpreter
+where you need it. In fact, each module imported in an interpreter is
+a completely separate object from the same module in a different
+interpreter, including :mod:`sys`, :mod:`builtins`,
+and even ``__main__``.
+
+Isolation means a mutable object, or other data, cannot be used
+by more than one interpreter at the same time. That effectively means
+interpreters cannot actually share such objects or data. Instead,
+each interpreter must have its own copy, and you will have to
+synchronize any changes between the copies manually. Immutable
+objects and data, like the builtin singletons, strings, and tuples
+of immutable objects, don't have these limitations.
+
+Communicating and synchronizing between interpreters is most effectively
+done using dedicated tools, like those proposed in :pep:`734`. One less
+efficient alternative is to serialize with :mod:`pickle` and then send
+the bytes over a shared :mod:`socket <socket>` or
+:func:`pipe <os.pipe>`.
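The pickle-over-pipe alternative mentioned above can be sketched directly with the stdlib; the payload here is made up for illustration:

```python
import os
import pickle

# Serialize a payload and send the bytes through a pipe -- the "less
# efficient alternative" to dedicated cross-interpreter tools.
payload = {"task": "resize", "size": (640, 480)}

read_fd, write_fd = os.pipe()
os.write(write_fd, pickle.dumps(payload))
os.close(write_fd)

# Read until EOF, then rebuild the object on the receiving side.
chunks = []
while True:
    chunk = os.read(read_fd, 4096)
    if not chunk:
        break
    chunks.append(chunk)
os.close(read_fd)

received = pickle.loads(b"".join(chunks))
print(received)  # {'task': 'resize', 'size': (640, 480)}
```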
+
+.. class:: InterpreterPoolExecutor(max_workers=None, thread_name_prefix='', initializer=None, initargs=(), shared=None)
+
+ A :class:`ThreadPoolExecutor` subclass that executes calls asynchronously
+ using a pool of at most *max_workers* threads. Each thread runs
+ tasks in its own interpreter. The worker interpreters are isolated
+ from each other, which means each has its own runtime state and that
+ they can't share any mutable objects or other data. Each interpreter
+ has its own :term:`Global Interpreter Lock <global interpreter lock>`,
+ which means code run with this executor has true multi-core parallelism.
+
+ The optional *initializer* and *initargs* arguments have the same
+ meaning as for :class:`!ThreadPoolExecutor`: the initializer is run
+ when each worker is created, though in this case it is run in
+ the worker's interpreter. The executor serializes the *initializer*
+ and *initargs* using :mod:`pickle` when sending them to the worker's
+ interpreter.
+
+ .. note::
+ Functions defined in the ``__main__`` module cannot be pickled
+ and thus cannot be used.
+
+ .. note::
+ The executor may replace uncaught exceptions from *initializer*
+ with :class:`~concurrent.futures.interpreter.ExecutionFailed`.
+
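The *initializer*/*initargs* contract is the one inherited from :class:`!ThreadPoolExecutor`; a runnable sketch of that shared contract, using ``ThreadPoolExecutor`` so it works on any recent Python (the names and tag value are illustrative):

```python
import concurrent.futures
import threading

_local = threading.local()

def init_worker(tag):
    # Runs once in each worker as it starts; with InterpreterPoolExecutor
    # the same hook runs inside the worker's own interpreter.
    _local.tag = tag

def which_worker():
    return _local.tag

with concurrent.futures.ThreadPoolExecutor(
        max_workers=1, initializer=init_worker,
        initargs=("worker-a",)) as pool:
    result = pool.submit(which_worker).result()

print(result)  # worker-a
```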
+ The optional *shared* argument is a :class:`dict` of objects that all
+ interpreters in the pool share. The *shared* items are added to each
+ interpreter's ``__main__`` module. Not all objects are shareable.
+ Shareable objects include the builtin singletons, :class:`str`
+ and :class:`bytes`, and :class:`memoryview`. See :pep:`734`
+ for more info.
+
+ Other caveats from the parent :class:`ThreadPoolExecutor` apply here.
+
+:meth:`~Executor.submit` and :meth:`~Executor.map` work like normal,
+except the worker serializes the callable and arguments using
+:mod:`pickle` when sending them to its interpreter. The worker
+likewise serializes the return value when sending it back.
+
+.. note::
+ Functions defined in the ``__main__`` module cannot be pickled
+ and thus cannot be used.
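The restriction in the note follows from how :mod:`pickle` handles functions: a top-level function is stored as a (module, qualified name) reference rather than as code, so the receiving interpreter must be able to import it. That behavior can be observed directly:

```python
import json
import pickle

# A top-level library function pickles as a reference by module and name.
data = pickle.dumps(json.dumps)
assert b"json" in data and b"dumps" in data

# Unpickling simply re-imports and returns the very same object.
print(pickle.loads(data) is json.dumps)  # True
```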
+
+When a worker's current task raises an uncaught exception, the worker
+always tries to preserve the exception as-is. If that is successful
+then it also sets the ``__cause__`` to a corresponding
+:class:`~concurrent.futures.interpreter.ExecutionFailed`
+instance, which contains a summary of the original exception.
+In the uncommon case that the worker is not able to preserve the
+original as-is, it directly preserves the corresponding
+:class:`~concurrent.futures.interpreter.ExecutionFailed`
+instance instead.
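The chaining behavior described above can be sketched with plain exception machinery. ``ExecutionFailedStub`` is a local placeholder, not the real class (which lives in ``concurrent.futures.interpreter`` on versions that provide the executor):

```python
class ExecutionFailedStub(Exception):
    """Local stand-in for concurrent.futures.interpreter.ExecutionFailed."""

def call_and_chain(func):
    # Mimic the worker: re-raise the original exception with a summary
    # of it attached as __cause__.
    try:
        return func()
    except Exception as exc:
        raise exc from ExecutionFailedStub(f"{type(exc).__name__}: {exc}")

try:
    call_and_chain(lambda: 1 / 0)
except ZeroDivisionError as err:
    # The original exception is preserved; the summary rides on __cause__.
    print(type(err.__cause__).__name__)  # ExecutionFailedStub
```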
+
+
ProcessPoolExecutor
-------------------
@@ -574,6 +681,26 @@ Exception classes
.. versionadded:: 3.7
+.. currentmodule:: concurrent.futures.interpreter
+
+.. exception:: BrokenInterpreterPool
+
+ Derived from :exc:`~concurrent.futures.thread.BrokenThreadPool`,
+ this exception class is raised when one of the workers
+ of a :class:`~concurrent.futures.InterpreterPoolExecutor`
+ has failed to initialize.
+
+ .. versionadded:: next
+
+.. exception:: ExecutionFailed
+
+ Raised from :class:`~concurrent.futures.InterpreterPoolExecutor` when
+ the given initializer fails or from
+ :meth:`~concurrent.futures.Executor.submit` when there's an uncaught
+ exception from the submitted task.
+
+ .. versionadded:: next
+
.. currentmodule:: concurrent.futures.process
.. exception:: BrokenProcessPool
diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
index b106578..9543af3 100644
--- a/Doc/whatsnew/3.14.rst
+++ b/Doc/whatsnew/3.14.rst
@@ -225,6 +225,14 @@ ast
* The ``repr()`` output for AST nodes now includes more information.
(Contributed by Tomas R in :gh:`116022`.)
+concurrent.futures
+------------------
+
+* Add :class:`~concurrent.futures.InterpreterPoolExecutor`,
+  which exposes "subinterpreters" (multiple Python interpreters in the
+ same process) to Python code. This is separate from the proposed API
+ in :pep:`734`.
+ (Contributed by Eric Snow in :gh:`124548`.)
ctypes
------