summaryrefslogtreecommitdiffstats
path: root/Doc/library/multiprocessing.rst
diff options
context:
space:
mode:
authorRichard Oudkerk <shibturn@gmail.com>2013-08-14 14:35:41 (GMT)
committerRichard Oudkerk <shibturn@gmail.com>2013-08-14 14:35:41 (GMT)
commit84ed9a68bd9a13252b376b21a9167dabae254325 (patch)
treeec8daa39fcf64b658bddf52f56ae47c0bdc2b091 /Doc/library/multiprocessing.rst
parentd06eeb4a2492b59d34ab69a2046dcae1f10ec593 (diff)
downloadcpython-84ed9a68bd9a13252b376b21a9167dabae254325.zip
cpython-84ed9a68bd9a13252b376b21a9167dabae254325.tar.gz
cpython-84ed9a68bd9a13252b376b21a9167dabae254325.tar.bz2
Issue #8713: Support alternative start methods in multiprocessing on Unix.
See http://hg.python.org/sandbox/sbt#spawn
Diffstat (limited to 'Doc/library/multiprocessing.rst')
-rw-r--r--Doc/library/multiprocessing.rst240
1 files changed, 156 insertions, 84 deletions
diff --git a/Doc/library/multiprocessing.rst b/Doc/library/multiprocessing.rst
index f659eac..f58c308 100644
--- a/Doc/library/multiprocessing.rst
+++ b/Doc/library/multiprocessing.rst
@@ -93,11 +93,80 @@ To show the individual process IDs involved, here is an expanded example::
p.start()
p.join()
-For an explanation of why (on Windows) the ``if __name__ == '__main__'`` part is
+For an explanation of why the ``if __name__ == '__main__'`` part is
necessary, see :ref:`multiprocessing-programming`.
+Start methods
+~~~~~~~~~~~~~
+
+Depending on the platform, :mod:`multiprocessing` supports three ways
+to start a process. These *start methods* are
+
+ *spawn*
+ The parent process starts a fresh python interpreter process. The
+ child process will only inherit those resources necessary to run
+ the process objects :meth:`~Process.run` method. In particular,
+ unnecessary file descriptors and handles from the parent process
+ will not be inherited. Starting a process using this method is
+ rather slow compared to using *fork* or *forkserver*.
+
+ Available on Unix and Windows. The default on Windows.
+
+ *fork*
+ The parent process uses :func:`os.fork` to fork the Python
+ interpreter. The child process, when it begins, is effectively
+ identical to the parent process. All resources of the parent are
+ inherited by the child process. Note that safely forking a
+ multithreaded process is problematic.
+
+ Available on Unix only. The default on Unix.
+
+ *forkserver*
+ When the program starts and selects the *forkserver* start method,
+ a server process is started. From then on, whenever a new process
+ is need the parent process connects to the server and requests
+ that it fork a new process. The fork server process is single
+ threaded so it is safe for it to use :func:`os.fork`. No
+ unnecessary resources are inherited.
+
+ Available on Unix platforms which support passing file descriptors
+ over unix pipes.
+
+Before Python 3.4 *fork* was the only option available on Unix. Also,
+prior to Python 3.4, child processes would inherit all the parents
+inheritable handles on Windows.
+
+On Unix using the *spawn* or *forkserver* start methods will also
+start a *semaphore tracker* process which tracks the unlinked named
+semaphores created by processes of the program. When all processes
+have exited the semaphore tracker unlinks any remaining semaphores.
+Usually there should be none, but if a process was killed by a signal
+there may some "leaked" semaphores. (Unlinking the named semaphores
+is a serious matter since the system allows only a limited number, and
+they will not be automatically unlinked until the next reboot.)
+
+To select the a start method you use the :func:`set_start_method` in
+the ``if __name__ == '__main__'`` clause of the main module. For
+example::
+
+ import multiprocessing as mp
+
+ def foo():
+ print('hello')
+
+ if __name__ == '__main__':
+ mp.set_start_method('spawn')
+ p = mp.Process(target=foo)
+ p.start()
+ p.join()
+
+:func:`set_start_method` should not be used more than once in the
+program.
+
+
+
Exchanging objects between processes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -274,15 +343,31 @@ processes in a few different ways.
For example::
from multiprocessing import Pool
+ from time import sleep
def f(x):
return x*x
if __name__ == '__main__':
- with Pool(processes=4) as pool: # start 4 worker processes
- result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously
- print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow
- print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]"
+ # start 4 worker processes
+ with Pool(processes=4) as pool:
+
+ # print "[0, 1, 4,..., 81]"
+ print(pool.map(f, range(10)))
+
+ # print same numbers in arbitrary order
+ for i in pool.imap_unordered(f, range(10)):
+ print(i)
+
+ # evaluate "f(10)" asynchronously
+ res = pool.apply_async(f, [10])
+ print(res.get(timeout=1)) # prints "100"
+
+ # make worker sleep for 10 secs
+ res = pool.apply_async(sleep, 10)
+ print(res.get(timeout=1)) # raises multiprocessing.TimeoutError
+
+ # exiting the 'with'-block has stopped the pool
Note that the methods of a pool should only ever be used by the
process which created it.
@@ -763,6 +848,24 @@ Miscellaneous
If the module is being run normally by the Python interpreter then
:func:`freeze_support` has no effect.
+.. function:: get_all_start_methods()
+
+ Returns a list of the supported start methods, the first of which
+ is the default. The possible start methods are ``'fork'``,
+ ``'spawn'`` and ``'forkserver'``. On Windows only ``'spawn'`` is
+ available. On Unix ``'fork'`` and ``'spawn'`` are always
+ supported, with ``'fork'`` being the default.
+
+ .. versionadded:: 3.4
+
+.. function:: get_start_method()
+
+ Return the current start method. This can be ``'fork'``,
+ ``'spawn'`` or ``'forkserver'``. ``'fork'`` is the default on
+ Unix, while ``'spawn'`` is the default on Windows.
+
+ .. versionadded:: 3.4
+
.. function:: set_executable()
Sets the path of the Python interpreter to use when starting a child process.
@@ -771,8 +874,21 @@ Miscellaneous
set_executable(os.path.join(sys.exec_prefix, 'pythonw.exe'))
- before they can create child processes. (Windows only)
+ before they can create child processes.
+
+ .. versionchanged:: 3.4
+ Now supported on Unix when the ``'spawn'`` start method is used.
+
+.. function:: set_start_method(method)
+
+ Set the method which should be used to start child processes.
+ *method* can be ``'fork'``, ``'spawn'`` or ``'forkserver'``.
+
+ Note that this should be called at most once, and it should be
+ protected inside the ``if __name__ == '__main__'`` clause of the
+ main module.
+ .. versionadded:: 3.4
.. note::
@@ -2175,43 +2291,8 @@ Below is an example session with logging turned on::
[INFO/MainProcess] sending shutdown message to manager
[INFO/SyncManager-...] manager exiting with exitcode 0
-In addition to having these two logging functions, the multiprocessing also
-exposes two additional logging level attributes. These are :const:`SUBWARNING`
-and :const:`SUBDEBUG`. The table below illustrates where theses fit in the
-normal level hierarchy.
-
-+----------------+----------------+
-| Level | Numeric value |
-+================+================+
-| ``SUBWARNING`` | 25 |
-+----------------+----------------+
-| ``SUBDEBUG`` | 5 |
-+----------------+----------------+
-
For a full table of logging levels, see the :mod:`logging` module.
-These additional logging levels are used primarily for certain debug messages
-within the multiprocessing module. Below is the same example as above, except
-with :const:`SUBDEBUG` enabled::
-
- >>> import multiprocessing, logging
- >>> logger = multiprocessing.log_to_stderr()
- >>> logger.setLevel(multiprocessing.SUBDEBUG)
- >>> logger.warning('doomed')
- [WARNING/MainProcess] doomed
- >>> m = multiprocessing.Manager()
- [INFO/SyncManager-...] child process calling self.run()
- [INFO/SyncManager-...] created temp directory /.../pymp-...
- [INFO/SyncManager-...] manager serving at '/.../pymp-djGBXN/listener-...'
- >>> del m
- [SUBDEBUG/MainProcess] finalizer calling ...
- [INFO/MainProcess] sending shutdown message to manager
- [DEBUG/SyncManager-...] manager received shutdown message
- [SUBDEBUG/SyncManager-...] calling <Finalize object, callback=unlink, ...
- [SUBDEBUG/SyncManager-...] finalizer calling <built-in function unlink> ...
- [SUBDEBUG/SyncManager-...] calling <Finalize object, dead>
- [SUBDEBUG/SyncManager-...] finalizer calling <function rmtree at 0x5aa730> ...
- [INFO/SyncManager-...] manager exiting with exitcode 0
The :mod:`multiprocessing.dummy` module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -2232,8 +2313,10 @@ There are certain guidelines and idioms which should be adhered to when using
:mod:`multiprocessing`.
-All platforms
-~~~~~~~~~~~~~
+All start methods
+~~~~~~~~~~~~~~~~~
+
+The following applies to all start methods.
Avoid shared state
@@ -2266,11 +2349,13 @@ Joining zombie processes
Better to inherit than pickle/unpickle
- On Windows many types from :mod:`multiprocessing` need to be picklable so
- that child processes can use them. However, one should generally avoid
- sending shared objects to other processes using pipes or queues. Instead
- you should arrange the program so that a process which needs access to a
- shared resource created elsewhere can inherit it from an ancestor process.
+ When using the *spawn* or *forkserver* start methods many types
+ from :mod:`multiprocessing` need to be picklable so that child
+ processes can use them. However, one should generally avoid
+ sending shared objects to other processes using pipes or queues.
+ Instead you should arrange the program so that a process which
+ needs access to a shared resource created elsewhere can inherit it
+ from an ancestor process.
Avoid terminating processes
@@ -2314,15 +2399,17 @@ Joining processes that use queues
Explicitly pass resources to child processes
- On Unix a child process can make use of a shared resource created in a
- parent process using a global resource. However, it is better to pass the
- object as an argument to the constructor for the child process.
+ On Unix using the *fork* start method, a child process can make
+ use of a shared resource created in a parent process using a
+ global resource. However, it is better to pass the object as an
+ argument to the constructor for the child process.
- Apart from making the code (potentially) compatible with Windows this also
- ensures that as long as the child process is still alive the object will not
- be garbage collected in the parent process. This might be important if some
- resource is freed when the object is garbage collected in the parent
- process.
+ Apart from making the code (potentially) compatible with Windows
+ and the other start methods this also ensures that as long as the
+ child process is still alive the object will not be garbage
+ collected in the parent process. This might be important if some
+ resource is freed when the object is garbage collected in the
+ parent process.
So for instance ::
@@ -2381,17 +2468,19 @@ Beware of replacing :data:`sys.stdin` with a "file like object"
For more information, see :issue:`5155`, :issue:`5313` and :issue:`5331`
-Windows
-~~~~~~~
+The *spawn* and *forkserver* start methods
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Since Windows lacks :func:`os.fork` it has a few extra restrictions:
+There are a few extra restriction which don't apply to the *fork*
+start method.
More picklability
- Ensure that all arguments to :meth:`Process.__init__` are picklable. This
- means, in particular, that bound or unbound methods cannot be used directly
- as the ``target`` argument on Windows --- just define a function and use
- that instead.
+ Ensure that all arguments to :meth:`Process.__init__` are
+ picklable. This means, in particular, that bound or unbound
+ methods cannot be used directly as the ``target`` (unless you use
+ the *fork* start method) --- just define a function and use that
+ instead.
Also, if you subclass :class:`Process` then make sure that instances will be
picklable when the :meth:`Process.start` method is called.
@@ -2411,7 +2500,8 @@ Safe importing of main module
interpreter without causing unintended side effects (such a starting a new
process).
- For example, under Windows running the following module would fail with a
+ For example, using the *spawn* or *forkserver* start method
+ running the following module would fail with a
:exc:`RuntimeError`::
from multiprocessing import Process
@@ -2425,13 +2515,14 @@ Safe importing of main module
Instead one should protect the "entry point" of the program by using ``if
__name__ == '__main__':`` as follows::
- from multiprocessing import Process, freeze_support
+ from multiprocessing import Process, freeze_support, set_start_method
def foo():
print('hello')
if __name__ == '__main__':
freeze_support()
+ set_start_method('spawn')
p = Process(target=foo)
p.start()
@@ -2462,26 +2553,7 @@ Using :class:`Pool`:
:language: python3
-Synchronization types like locks, conditions and queues:
-
-.. literalinclude:: ../includes/mp_synchronize.py
- :language: python3
-
-
An example showing how to use queues to feed tasks to a collection of worker
processes and collect the results:
.. literalinclude:: ../includes/mp_workers.py
-
-
-An example of how a pool of worker processes can each run a
-:class:`~http.server.SimpleHTTPRequestHandler` instance while sharing a single
-listening socket.
-
-.. literalinclude:: ../includes/mp_webserver.py
-
-
-Some simple benchmarks comparing :mod:`multiprocessing` with :mod:`threading`:
-
-.. literalinclude:: ../includes/mp_benchmarks.py
-