summaryrefslogtreecommitdiffstats
path: root/Doc/library/multiprocessing.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/multiprocessing.rst')
-rw-r--r--Doc/library/multiprocessing.rst301
1 files changed, 215 insertions, 86 deletions
diff --git a/Doc/library/multiprocessing.rst b/Doc/library/multiprocessing.rst
index b2ed544..f3197eb 100644
--- a/Doc/library/multiprocessing.rst
+++ b/Doc/library/multiprocessing.rst
@@ -93,11 +93,108 @@ To show the individual process IDs involved, here is an expanded example::
p.start()
p.join()
-For an explanation of why (on Windows) the ``if __name__ == '__main__'`` part is
+For an explanation of why the ``if __name__ == '__main__'`` part is
necessary, see :ref:`multiprocessing-programming`.
+Contexts and start methods
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Depending on the platform, :mod:`multiprocessing` supports three ways
+to start a process. These *start methods* are
+
+ *spawn*
+ The parent process starts a fresh python interpreter process. The
+ child process will only inherit those resources necessary to run
+ the process objects :meth:`~Process.run` method. In particular,
+ unnecessary file descriptors and handles from the parent process
+ will not be inherited. Starting a process using this method is
+ rather slow compared to using *fork* or *forkserver*.
+
+ Available on Unix and Windows. The default on Windows.
+
+ *fork*
+ The parent process uses :func:`os.fork` to fork the Python
+ interpreter. The child process, when it begins, is effectively
+ identical to the parent process. All resources of the parent are
+ inherited by the child process. Note that safely forking a
+ multithreaded process is problematic.
+
+ Available on Unix only. The default on Unix.
+
+ *forkserver*
+ When the program starts and selects the *forkserver* start method,
+ a server process is started. From then on, whenever a new process
+ is needed, the parent process connects to the server and requests
+ that it fork a new process. The fork server process is single
+ threaded so it is safe for it to use :func:`os.fork`. No
+ unnecessary resources are inherited.
+
+ Available on Unix platforms which support passing file descriptors
+ over Unix pipes.
+
+Before Python 3.4 *fork* was the only option available on Unix. Also,
+prior to Python 3.4, child processes would inherit all the parents
+inheritable handles on Windows.
+
+On Unix using the *spawn* or *forkserver* start methods will also
+start a *semaphore tracker* process which tracks the unlinked named
+semaphores created by processes of the program. When all processes
+have exited the semaphore tracker unlinks any remaining semaphores.
+Usually there should be none, but if a process was killed by a signal
+there may some "leaked" semaphores. (Unlinking the named semaphores
+is a serious matter since the system allows only a limited number, and
+they will not be automatically unlinked until the next reboot.)
+
+To select the a start method you use the :func:`set_start_method` in
+the ``if __name__ == '__main__'`` clause of the main module. For
+example::
+
+ import multiprocessing as mp
+
+ def foo(q):
+ q.put('hello')
+
+ if __name__ == '__main__':
+ mp.set_start_method('spawn')
+ q = mp.Queue()
+ p = mp.Process(target=foo, args=(q,))
+ p.start()
+ print(q.get())
+ p.join()
+
+:func:`set_start_method` should not be used more than once in the
+program.
+
+Alternatively, you can use :func:`get_context` to obtain a context
+object. Context objects have the same API as the multiprocessing
+module, and allow one to use multiple start methods in the same
+program. ::
+
+ import multiprocessing as mp
+
+ def foo(q):
+ q.put('hello')
+
+ if __name__ == '__main__':
+ ctx = mp.get_context('spawn')
+ q = ctx.Queue()
+ p = ctx.Process(target=foo, args=(q,))
+ p.start()
+ print(q.get())
+ p.join()
+
+Note that objects related to one context may not be compatible with
+processes for a different context. In particular, locks created using
+the *fork* context cannot be passed to a processes started using the
+*spawn* or *forkserver* start methods.
+
+A library which wants to use a particular start method should probably
+use :func:`get_context` to avoid interfering with the choice of the
+library user.
+
+
Exchanging objects between processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -274,15 +371,31 @@ processes in a few different ways.
For example::
from multiprocessing import Pool
+ from time import sleep
def f(x):
return x*x
if __name__ == '__main__':
- with Pool(processes=4) as pool: # start 4 worker processes
- result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously
- print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow
- print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]"
+ # start 4 worker processes
+ with Pool(processes=4) as pool:
+
+ # print "[0, 1, 4,..., 81]"
+ print(pool.map(f, range(10)))
+
+ # print same numbers in arbitrary order
+ for i in pool.imap_unordered(f, range(10)):
+ print(i)
+
+ # evaluate "f(10)" asynchronously
+ res = pool.apply_async(f, [10])
+ print(res.get(timeout=1)) # prints "100"
+
+ # make worker sleep for 10 secs
+ res = pool.apply_async(sleep, 10)
+ print(res.get(timeout=1)) # raises multiprocessing.TimeoutError
+
+ # exiting the 'with'-block has stopped the pool
Note that the methods of a pool should only ever be used by the
process which created it.
@@ -731,6 +844,9 @@ Miscellaneous
Return the number of CPUs in the system. May raise
:exc:`NotImplementedError`.
+ .. seealso::
+ :func:`os.cpu_count`
+
.. function:: current_process()
Return the :class:`Process` object corresponding to the current process.
@@ -761,6 +877,43 @@ Miscellaneous
If the module is being run normally by the Python interpreter then
:func:`freeze_support` has no effect.
+.. function:: get_all_start_methods()
+
+ Returns a list of the supported start methods, the first of which
+ is the default. The possible start methods are ``'fork'``,
+ ``'spawn'`` and ``'forkserver'``. On Windows only ``'spawn'`` is
+ available. On Unix ``'fork'`` and ``'spawn'`` are always
+ supported, with ``'fork'`` being the default.
+
+ .. versionadded:: 3.4
+
+.. function:: get_context(method=None)
+
+ Return a context object which has the same attributes as the
+ :mod:`multiprocessing` module.
+
+ If *method* is *None* then the default context is returned.
+ Otherwise *method* should be ``'fork'``, ``'spawn'``,
+ ``'forkserver'``. :exc:`ValueError` is raised if the specified
+ start method is not available.
+
+ .. versionadded:: 3.4
+
+.. function:: get_start_method(allow_none=False)
+
+ Return the name of start method used for starting processes.
+
+ If the start method has not been fixed and *allow_none* is false,
+ then the start method is fixed to the default and the name is
+ returned. If the start method has not been fixed and *allow_none*
+ is true then *None* is returned.
+
+ The return value can be ``'fork'``, ``'spawn'``, ``'forkserver'``
+ or *None*. ``'fork'`` is the default on Unix, while ``'spawn'`` is
+ the default on Windows.
+
+ .. versionadded:: 3.4
+
.. function:: set_executable()
Sets the path of the Python interpreter to use when starting a child process.
@@ -769,8 +922,21 @@ Miscellaneous
set_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))
- before they can create child processes. (Windows only)
+ before they can create child processes.
+
+ .. versionchanged:: 3.4
+ Now supported on Unix when the ``'spawn'`` start method is used.
+
+.. function:: set_start_method(method)
+
+ Set the method which should be used to start child processes.
+ *method* can be ``'fork'``, ``'spawn'`` or ``'forkserver'``.
+
+ Note that this should be called at most once, and it should be
+ protected inside the ``if __name__ == '__main__'`` clause of the
+ main module.
+ .. versionadded:: 3.4
.. note::
@@ -1666,14 +1832,14 @@ Process Pools
One can create a pool of processes which will carry out tasks submitted to it
with the :class:`Pool` class.
-.. class:: Pool([processes[, initializer[, initargs[, maxtasksperchild]]]])
+.. class:: Pool([processes[, initializer[, initargs[, maxtasksperchild [, context]]]]])
A process pool object which controls a pool of worker processes to which jobs
can be submitted. It supports asynchronous results with timeouts and
callbacks and has a parallel map implementation.
*processes* is the number of worker processes to use. If *processes* is
- ``None`` then the number returned by :func:`cpu_count` is used. If
+ ``None`` then the number returned by :func:`os.cpu_count` is used. If
*initializer* is not ``None`` then each worker process will call
``initializer(*initargs)`` when it starts.
@@ -1686,6 +1852,13 @@ with the :class:`Pool` class.
unused resources to be freed. The default *maxtasksperchild* is None, which
means worker processes will live as long as the pool.
+ .. versionadded:: 3.4
+ *context* can be used to specify the context used for starting
+ the worker processes. Usually a pool is created using the
+ function :func:`multiprocessing.Pool` or the :meth:`Pool` method
+ of a context object. In both cases *context* is set
+ appropriately.
+
.. note::
Worker processes within a :class:`Pool` typically live for the complete
@@ -2177,43 +2350,8 @@ Below is an example session with logging turned on::
[INFO/MainProcess] sending shutdown message to manager
[INFO/SyncManager-...] manager exiting with exitcode 0
-In addition to having these two logging functions, the multiprocessing also
-exposes two additional logging level attributes. These are :const:`SUBWARNING`
-and :const:`SUBDEBUG`. The table below illustrates where theses fit in the
-normal level hierarchy.
-
-+----------------+----------------+
-| Level | Numeric value |
-+================+================+
-| ``SUBWARNING`` | 25 |
-+----------------+----------------+
-| ``SUBDEBUG`` | 5 |
-+----------------+----------------+
-
For a full table of logging levels, see the :mod:`logging` module.
-These additional logging levels are used primarily for certain debug messages
-within the multiprocessing module. Below is the same example as above, except
-with :const:`SUBDEBUG` enabled::
-
- >>> import multiprocessing, logging
- >>> logger = multiprocessing.log_to_stderr()
- >>> logger.setLevel(multiprocessing.SUBDEBUG)
- >>> logger.warning('doomed')
- [WARNING/MainProcess] doomed
- >>> m = multiprocessing.Manager()
- [INFO/SyncManager-...] child process calling self.run()
- [INFO/SyncManager-...] created temp directory /.../pymp-...
- [INFO/SyncManager-...] manager serving at '/.../pymp-djGBXN/listener-...'
- >>> del m
- [SUBDEBUG/MainProcess] finalizer calling ...
- [INFO/MainProcess] sending shutdown message to manager
- [DEBUG/SyncManager-...] manager received shutdown message
- [SUBDEBUG/SyncManager-...] calling <Finalize object, callback=unlink, ...
- [SUBDEBUG/SyncManager-...] finalizer calling <built-in function unlink> ...
- [SUBDEBUG/SyncManager-...] calling <Finalize object, dead>
- [SUBDEBUG/SyncManager-...] finalizer calling <function rmtree at 0x5aa730> ...
- [INFO/SyncManager-...] manager exiting with exitcode 0
The :mod:`multiprocessing.dummy` module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2234,8 +2372,10 @@ There are certain guidelines and idioms which should be adhered to when using
:mod:`multiprocessing`.
-All platforms
-~~~~~~~~~~~~~
+All start methods
+~~~~~~~~~~~~~~~~~
+
+The following applies to all start methods.
Avoid shared state
@@ -2269,11 +2409,13 @@ Joining zombie processes
Better to inherit than pickle/unpickle
- On Windows many types from :mod:`multiprocessing` need to be picklable so
- that child processes can use them. However, one should generally avoid
- sending shared objects to other processes using pipes or queues. Instead
- you should arrange the program so that a process which needs access to a
- shared resource created elsewhere can inherit it from an ancestor process.
+ When using the *spawn* or *forkserver* start methods many types
+ from :mod:`multiprocessing` need to be picklable so that child
+ processes can use them. However, one should generally avoid
+ sending shared objects to other processes using pipes or queues.
+ Instead you should arrange the program so that a process which
+ needs access to a shared resource created elsewhere can inherit it
+ from an ancestor process.
Avoid terminating processes
@@ -2320,15 +2462,17 @@ Joining processes that use queues
Explicitly pass resources to child processes
- On Unix a child process can make use of a shared resource created in a
- parent process using a global resource. However, it is better to pass the
- object as an argument to the constructor for the child process.
+ On Unix using the *fork* start method, a child process can make
+ use of a shared resource created in a parent process using a
+ global resource. However, it is better to pass the object as an
+ argument to the constructor for the child process.
- Apart from making the code (potentially) compatible with Windows this also
- ensures that as long as the child process is still alive the object will not
- be garbage collected in the parent process. This might be important if some
- resource is freed when the object is garbage collected in the parent
- process.
+ Apart from making the code (potentially) compatible with Windows
+ and the other start methods this also ensures that as long as the
+ child process is still alive the object will not be garbage
+ collected in the parent process. This might be important if some
+ resource is freed when the object is garbage collected in the
+ parent process.
So for instance ::
@@ -2387,17 +2531,19 @@ Beware of replacing :data:`sys.stdin` with a "file like object"
For more information, see :issue:`5155`, :issue:`5313` and :issue:`5331`
-Windows
-~~~~~~~
+The *spawn* and *forkserver* start methods
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Since Windows lacks :func:`os.fork` it has a few extra restrictions:
+There are a few extra restriction which don't apply to the *fork*
+start method.
More picklability
- Ensure that all arguments to :meth:`Process.__init__` are picklable. This
- means, in particular, that bound or unbound methods cannot be used directly
- as the ``target`` argument on Windows --- just define a function and use
- that instead.
+ Ensure that all arguments to :meth:`Process.__init__` are
+ picklable. This means, in particular, that bound or unbound
+ methods cannot be used directly as the ``target`` (unless you use
+ the *fork* start method) --- just define a function and use that
+ instead.
Also, if you subclass :class:`~multiprocessing.Process` then make sure that
instances will be picklable when the :meth:`Process.start
@@ -2419,7 +2565,8 @@ Safe importing of main module
interpreter without causing unintended side effects (such a starting a new
process).
- For example, under Windows running the following module would fail with a
+ For example, using the *spawn* or *forkserver* start method
+ running the following module would fail with a
:exc:`RuntimeError`::
from multiprocessing import Process
@@ -2433,13 +2580,14 @@ Safe importing of main module
Instead one should protect the "entry point" of the program by using ``if
__name__ == '__main__':`` as follows::
- from multiprocessing import Process, freeze_support
+ from multiprocessing import Process, freeze_support, set_start_method
def foo():
print('hello')
if __name__ == '__main__':
freeze_support()
+ set_start_method('spawn')
p = Process(target=foo)
p.start()
@@ -2470,26 +2618,7 @@ Using :class:`~multiprocessing.pool.Pool`:
:language: python3
-Synchronization types like locks, conditions and queues:
-
-.. literalinclude:: ../includes/mp_synchronize.py
- :language: python3
-
-
An example showing how to use queues to feed tasks to a collection of worker
processes and collect the results:
.. literalinclude:: ../includes/mp_workers.py
-
-
-An example of how a pool of worker processes can each run a
-:class:`~http.server.SimpleHTTPRequestHandler` instance while sharing a single
-listening socket.
-
-.. literalinclude:: ../includes/mp_webserver.py
-
-
-Some simple benchmarks comparing :mod:`multiprocessing` with :mod:`threading`:
-
-.. literalinclude:: ../includes/mp_benchmarks.py
-