summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorGregory P. Smith <greg@krypto.org>2023-09-01 08:53:06 (GMT)
committerGitHub <noreply@github.com>2023-09-01 08:53:06 (GMT)
commit6ba1234c1c643743be15090ecf8da39137c8cbac (patch)
tree80eac4b6c8c4f3bba2a0c146322d5c7c3f20d481
parentb4784b0c5f1cab48b45df5481edc203e25688d0c (diff)
downloadcpython-6ba1234c1c643743be15090ecf8da39137c8cbac.zip
cpython-6ba1234c1c643743be15090ecf8da39137c8cbac.tar.gz
cpython-6ba1234c1c643743be15090ecf8da39137c8cbac.tar.bz2
[3.11] gh-104372: Drop the GIL around the vfork() call. (#104782) (#104958)
gh-104372: Drop the GIL around the vfork() call. (#104782) On Linux where the `subprocess` module can use the `vfork` syscall for faster spawning, prevent the parent process from blocking other threads by dropping the GIL while it waits for the vfork'ed child process `exec` outcome. This prevents spawning a binary from a slow filesystem from blocking the rest of the application. Fixes #104372. (cherry picked from commit d08679212d9af52dd074cd4a6abb440edb944c9c)
-rw-r--r--Doc/library/subprocess.rst13
-rw-r--r--Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst5
-rw-r--r--Modules/_posixsubprocess.c21
3 files changed, 32 insertions, 7 deletions
diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst
index 6ef3c34..377f693 100644
--- a/Doc/library/subprocess.rst
+++ b/Doc/library/subprocess.rst
@@ -57,10 +57,13 @@ underlying :class:`Popen` interface can be used directly.
and combine both streams into one, use ``stdout=PIPE`` and ``stderr=STDOUT``
instead of *capture_output*.
- The *timeout* argument is passed to :meth:`Popen.communicate`. If the timeout
- expires, the child process will be killed and waited for. The
- :exc:`TimeoutExpired` exception will be re-raised after the child process
- has terminated.
+ A *timeout* may be specified in seconds, it is internally passed on to
+ :meth:`Popen.communicate`. If the timeout expires, the child process will be
+ killed and waited for. The :exc:`TimeoutExpired` exception will be
+ re-raised after the child process has terminated. The initial process
+ creation itself cannot be interrupted on many platform APIs so you are not
+ guaranteed to see a timeout exception until at least after however long
+ process creation takes.
The *input* argument is passed to :meth:`Popen.communicate` and thus to the
subprocess's stdin. If used it must be a byte sequence, or a string if
@@ -736,7 +739,7 @@ arguments.
code.
All of the functions and methods that accept a *timeout* parameter, such as
-:func:`call` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if
+:func:`run` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if
the timeout expires before the process exits.
Exceptions defined in this module all inherit from :exc:`SubprocessError`.
diff --git a/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst b/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst
new file mode 100644
index 0000000..ea13ec8
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2023-05-22-18-39-53.gh-issue-104372.7tDRaK.rst
@@ -0,0 +1,5 @@
+On Linux where :mod:`subprocess` can use the ``vfork()`` syscall for faster
+spawning, prevent the parent process from blocking other threads by dropping
+the GIL while it waits for the vfork'ed child process ``exec()`` outcome.
+This prevents spawning a binary from a slow filesystem from blocking the
+rest of the application.
diff --git a/Modules/_posixsubprocess.c b/Modules/_posixsubprocess.c
index ad9daae..072519c 100644
--- a/Modules/_posixsubprocess.c
+++ b/Modules/_posixsubprocess.c
@@ -537,7 +537,7 @@ reset_signal_handlers(const sigset_t *child_sigmask)
* required by POSIX but not supported natively on Linux. Another reason to
* avoid this family of functions is that sharing an address space between
* processes running with different privileges is inherently insecure.
- * See bpo-35823 for further discussion and references.
+ * See https://bugs.python.org/issue35823 for discussion and references.
*
* In some C libraries, setrlimit() has the same thread list/signalling
* behavior since resource limits were per-thread attributes before
@@ -774,6 +774,7 @@ do_fork_exec(char *const exec_array[],
pid_t pid;
#ifdef VFORK_USABLE
+ PyThreadState *vfork_tstate_save;
if (child_sigmask) {
/* These are checked by our caller; verify them in debug builds. */
assert(!call_setuid);
@@ -781,8 +782,23 @@ do_fork_exec(char *const exec_array[],
assert(!call_setgroups);
assert(preexec_fn == Py_None);
+ /* Drop the GIL so that other threads can continue execution while this
+ * thread in the parent remains blocked per vfork-semantics on the
+ * child's exec syscall outcome. Exec does filesystem access which
+ * can take an arbitrarily long time. This addresses GH-104372.
+ *
+ * The vfork'ed child still runs in our address space. Per POSIX it
+ * must be limited to nothing but exec, but the Linux implementation
+ * is a little more usable. See the child_exec() comment - The child
+ * MUST NOT re-acquire the GIL.
+ */
+ vfork_tstate_save = PyEval_SaveThread();
pid = vfork();
- if (pid == -1) {
+ if (pid != 0) {
+ // Not in the child process, reacquire the GIL.
+ PyEval_RestoreThread(vfork_tstate_save);
+ }
+ if (pid == (pid_t)-1) {
/* If vfork() fails, fall back to using fork(). When it isn't
* allowed in a process by the kernel, vfork can return -1
* with errno EINVAL. https://bugs.python.org/issue47151. */
@@ -795,6 +811,7 @@ do_fork_exec(char *const exec_array[],
}
if (pid != 0) {
+ // Parent process.
return pid;
}