From c29248f2324188a192d11a0e8adf9e146b326bd4 Mon Sep 17 00:00:00 2001 From: Nick Coghlan Date: Tue, 8 Nov 2011 20:49:23 +1000 Subject: Issue #13237: Forward port subprocess module updates and explicitly document UTF-8 encoding assumption when universal_newlines=True --- Doc/library/subprocess.rst | 440 +++++++++++++++++++++++++++++++-------------- Misc/NEWS | 3 + 2 files changed, 306 insertions(+), 137 deletions(-) diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst index 16c2605..afa7adc 100644 --- a/Doc/library/subprocess.rst +++ b/Doc/library/subprocess.rst @@ -25,7 +25,227 @@ modules and functions can be found in the following sections. Using the subprocess Module --------------------------- -This module defines one class called :class:`Popen`: +The recommended approach to invoking subprocesses is to use the following +convenience functions for all use cases they can handle. For more advanced +use cases, the underlying :class:`Popen` interface can be used directly. + + +.. function:: call(args, *, stdin=None, stdout=None, stderr=None, shell=False) + + Run the command described by *args*. Wait for command to complete, then + return the :attr:`returncode` attribute. + + The arguments shown above are merely the most common ones, described below + in :ref:`frequently-used-arguments` (hence the slightly odd notation in + the abbreviated signature). The full function signature is the same as + that of the :class:`Popen` constructor - this function passes all + supplied arguments directly through to that interface. + + Examples:: + + >>> subprocess.call(["ls", "-l"]) + 0 + + >>> subprocess.call("exit 1", shell=True) + 1 + + .. warning:: + + Invoking the system shell with ``shell=True`` can be a security hazard + if combined with untrusted input. See the warning under + :ref:`frequently-used-arguments` for details. + + .. note:: + + Do not use ``stdout=PIPE`` or ``stderr=PIPE`` with this function. 
As + the pipes are not being read in the current process, the child + process may block if it generates enough output to a pipe to fill up + the OS pipe buffer. + + +.. function:: check_call(args, *, stdin=None, stdout=None, stderr=None, shell=False) + + Run command with arguments. Wait for command to complete. If the return + code was zero then return, otherwise raise :exc:`CalledProcessError`. The + :exc:`CalledProcessError` object will have the return code in the + :attr:`returncode` attribute. + + The arguments shown above are merely the most common ones, described below + in :ref:`frequently-used-arguments` (hence the slightly odd notation in + the abbreviated signature). The full function signature is the same as + that of the :class:`Popen` constructor - this function passes all + supplied arguments directly through to that interface. + + Examples:: + + >>> subprocess.check_call(["ls", "-l"]) + 0 + + >>> subprocess.check_call("exit 1", shell=True) + Traceback (most recent call last): + ... + subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1 + + .. versionadded:: 2.5 + + .. warning:: + + Invoking the system shell with ``shell=True`` can be a security hazard + if combined with untrusted input. See the warning under + :ref:`frequently-used-arguments` for details. + + .. note:: + + Do not use ``stdout=PIPE`` or ``stderr=PIPE`` with this function. As + the pipes are not being read in the current process, the child + process may block if it generates enough output to a pipe to fill up + the OS pipe buffer. + + +.. function:: check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False) + + Run command with arguments and return its output as a byte string. + + If the return code was non-zero it raises a :exc:`CalledProcessError`. The + :exc:`CalledProcessError` object will have the return code in the + :attr:`returncode` attribute and any output in the :attr:`output` + attribute. 
+ + The arguments shown above are merely the most common ones, described below + in :ref:`frequently-used-arguments` (hence the slightly odd notation in + the abbreviated signature). The full function signature is largely the + same as that of the :class:`Popen` constructor, except that *stdout* is + not permitted as it is used internally. All other supplied arguments are + passed directly through to the :class:`Popen` constructor. + + Examples:: + + >>> subprocess.check_output(["echo", "Hello World!"]) + b'Hello World!\n' + + >>> subprocess.check_output(["echo", "Hello World!"], universal_newlines=True) + 'Hello World!\n' + + >>> subprocess.check_output("exit 1", shell=True) + Traceback (most recent call last): + ... + subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1 + + By default, this function will return the data as encoded bytes. The actual + encoding of the output data may depend on the command being invoked, so the + decoding to text will often need to be handled at the application level. + + This behaviour may be overridden by setting *universal_newlines* to + :const:`True` as described below in :ref:`frequently-used-arguments`. + + To also capture standard error in the result, use + ``stderr=subprocess.STDOUT``:: + + >>> subprocess.check_output( + ... "ls non_existent_file; exit 0", + ... stderr=subprocess.STDOUT, + ... shell=True) + b'ls: non_existent_file: No such file or directory\n' + + .. versionadded:: 2.7 + + .. warning:: + + Invoking the system shell with ``shell=True`` can be a security hazard + if combined with untrusted input. See the warning under + :ref:`frequently-used-arguments` for details. + + .. note:: + + Do not use ``stderr=PIPE`` with this function. As the pipe is not being + read in the current process, the child process may block if it + generates enough output to the pipe to fill up the OS pipe buffer. + + +.. 
data:: PIPE + + Special value that can be used as the *stdin*, *stdout* or *stderr* argument + to :class:`Popen` and indicates that a pipe to the standard stream should be + opened. + + +.. data:: STDOUT + + Special value that can be used as the *stderr* argument to :class:`Popen` and + indicates that standard error should go into the same handle as standard + output. + + +.. _frequently-used-arguments: + +Frequently Used Arguments +^^^^^^^^^^^^^^^^^^^^^^^^^ + +To support a wide variety of use cases, the :class:`Popen` constructor (and +the convenience functions) accept a large number of optional arguments. For +most typical use cases, many of these arguments can be safely left at their +default values. The arguments that are most commonly needed are: + + *args* is required for all calls and should be a string, or a sequence of + program arguments. Providing a sequence of arguments is generally + preferred, as it allows the module to take care of any required escaping + and quoting of arguments (e.g. to permit spaces in file names). If passing + a single string, either *shell* must be :const:`True` (see below) or else + the string must simply name the program to be executed without specifying + any arguments. + + *stdin*, *stdout* and *stderr* specify the executed program's standard input, + standard output and standard error file handles, respectively. Valid values + are :data:`PIPE`, an existing file descriptor (a positive integer), an + existing file object, and ``None``. :data:`PIPE` indicates that a new pipe + to the child should be created. With the default settings of ``None``, no + redirection will occur; the child's file handles will be inherited from the + parent. Additionally, *stderr* can be :data:`STDOUT`, which indicates that + the stderr data from the child process should be captured into the same file + handle as for stdout. 
+ + When *stdout* or *stderr* are pipes and *universal_newlines* is + :const:`True` then the output data is assumed to be encoded as UTF-8 and + will automatically be decoded to text. All line endings will be converted + to ``'\n'`` as described for the universal newlines ``'U'`` mode argument + to :func:`open`. + + If *shell* is :const:`True`, the specified command will be executed through + the shell. This can be useful if you are using Python primarily for the + enhanced control flow it offers over most system shells and still want + access to other shell features such as filename wildcards, shell pipes and + environment variable expansion. + + .. warning:: + + Executing shell commands that incorporate unsanitized input from an + untrusted source makes a program vulnerable to `shell injection + `_, + a serious security flaw which can result in arbitrary command execution. + For this reason, the use of *shell=True* is **strongly discouraged** in cases + where the command string is constructed from external input:: + + >>> from subprocess import call + >>> filename = input("What file would you like to display?\n") + What file would you like to display? + non_existent; rm -rf / # + >>> call("cat " + filename, shell=True) # Uh-oh. This will end badly... + + ``shell=False`` disables all shell based features, but does not suffer + from this vulnerability; see the Note in the :class:`Popen` constructor + documentation for helpful hints in getting ``shell=False`` to work. + +These options, along with all of the other options, are described in more +detail in the :class:`Popen` constructor documentation. + + +Popen Constructor +^^^^^^^^^^^^^^^^^ + +The underlying process creation and management in this module is handled by +the :class:`Popen` class. It offers a lot of flexibility so that developers +are able to handle the less common cases not covered by the convenience +functions. .. 
class:: Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=True, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0, restore_signals=True, start_new_session=False, pass_fds=()) @@ -78,21 +298,9 @@ This module defines one class called :class:`Popen`: .. warning:: - Executing shell commands that incorporate unsanitized input from an - untrusted source makes a program vulnerable to `shell injection - `_, - a serious security flaw which can result in arbitrary command execution. - For this reason, the use of *shell=True* is **strongly discouraged** in cases - where the command string is constructed from external input:: - - >>> from subprocess import call - >>> filename = input("What file would you like to display?\n") - What file would you like to display? - non_existent; rm -rf / # - >>> call("cat " + filename, shell=True) # Uh-oh. This will end badly... - - *shell=False* does not suffer from this vulnerability; the above Note may be - helpful in getting code using *shell=False* to work. + Enabling this option can be a security hazard if combined with untrusted + input. See the warning under :ref:`frequently-used-arguments` + for details. On Windows: the :class:`Popen` class uses CreateProcess() to execute the child program, which operates on strings. If *args* is a sequence, it will @@ -121,14 +329,15 @@ This module defines one class called :class:`Popen`: You don't need ``shell=True`` to run a batch file, nor to run a console-based executable. - *stdin*, *stdout* and *stderr* specify the executed programs' standard input, + *stdin*, *stdout* and *stderr* specify the executed program's standard input, standard output and standard error file handles, respectively. Valid values are :data:`PIPE`, an existing file descriptor (a positive integer), an existing :term:`file object`, and ``None``. :data:`PIPE` indicates that a - new pipe to the child should be created. 
With ``None``, no redirection will - occur; the child's file handles will be inherited from the parent. Additionally, - *stderr* can be :data:`STDOUT`, which indicates that the stderr data from the - applications should be captured into the same file handle as for stdout. + new pipe to the child should be created. With the default settings of + ``None``, no redirection will occur; the child's file handles will be + inherited from the parent. Additionally, *stderr* can be :data:`STDOUT`, + which indicates that the stderr data from the applications should be + captured into the same file handle as for stdout. If *preexec_fn* is set to a callable object, this object will be called in the child process just before the child is executed. @@ -242,104 +451,6 @@ This module defines one class called :class:`Popen`: output. -Convenience Functions -^^^^^^^^^^^^^^^^^^^^^ - -This module also defines the following shortcut functions: - - -.. function:: call(*popenargs, **kwargs) - - Run command with arguments. Wait for command to complete, then return the - :attr:`returncode` attribute. - - The arguments are the same as for the :class:`Popen` constructor. Example:: - - >>> retcode = subprocess.call(["ls", "-l"]) - - .. warning:: - - Like :meth:`Popen.wait`, this will deadlock when using - ``stdout=PIPE`` and/or ``stderr=PIPE`` and the child process - generates enough output to a pipe such that it blocks waiting - for the OS pipe buffer to accept more data. - - -.. function:: check_call(*popenargs, **kwargs) - - Run command with arguments. Wait for command to complete. If the exit code was - zero then return, otherwise raise :exc:`CalledProcessError`. The - :exc:`CalledProcessError` object will have the return code in the - :attr:`returncode` attribute. - - The arguments are the same as for the :class:`Popen` constructor. Example:: - - >>> subprocess.check_call(["ls", "-l"]) - 0 - - .. warning:: - - See the warning for :func:`call`. - - -.. 
function:: check_output(*popenargs, **kwargs) - - Run command with arguments and return its output as a byte string. - - If the exit code was non-zero it raises a :exc:`CalledProcessError`. The - :exc:`CalledProcessError` object will have the return code in the - :attr:`returncode` - attribute and output in the :attr:`output` attribute. - - The arguments are the same as for the :class:`Popen` constructor. Example:: - - >>> subprocess.check_output(["ls", "-l", "/dev/null"]) - b'crw-rw-rw- 1 root root 1, 3 Oct 18 2007 /dev/null\n' - - The stdout argument is not allowed as it is used internally. - To capture standard error in the result, use ``stderr=subprocess.STDOUT``:: - - >>> subprocess.check_output( - ... ["/bin/sh", "-c", "ls non_existent_file; exit 0"], - ... stderr=subprocess.STDOUT) - b'ls: non_existent_file: No such file or directory\n' - - .. versionadded:: 3.1 - - -.. function:: getstatusoutput(cmd) - - Return ``(status, output)`` of executing *cmd* in a shell. - - Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple - ``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the - returned output will contain output or error messages. A trailing newline is - stripped from the output. The exit status for the command can be interpreted - according to the rules for the C function :c:func:`wait`. Example:: - - >>> subprocess.getstatusoutput('ls /bin/ls') - (0, '/bin/ls') - >>> subprocess.getstatusoutput('cat /bin/junk') - (256, 'cat: /bin/junk: No such file or directory') - >>> subprocess.getstatusoutput('/bin/junk') - (256, 'sh: /bin/junk: not found') - - Availability: UNIX. - - -.. function:: getoutput(cmd) - - Return output (stdout and stderr) of executing *cmd* in a shell. - - Like :func:`getstatusoutput`, except the exit status is ignored and the return - value is a string containing the command's output. Example:: - - >>> subprocess.getoutput('ls /bin/ls') - '/bin/ls' - - Availability: UNIX. 
- - Exceptions ^^^^^^^^^^ @@ -355,16 +466,19 @@ when trying to execute a non-existent file. Applications should prepare for A :exc:`ValueError` will be raised if :class:`Popen` is called with invalid arguments. -check_call() will raise :exc:`CalledProcessError`, if the called process returns -a non-zero return code. +:func:`check_call` and :func:`check_output` will raise +:exc:`CalledProcessError` if the called process returns a non-zero return +code. Security ^^^^^^^^ -Unlike some other popen functions, this implementation will never call /bin/sh -implicitly. This means that all characters, including shell metacharacters, can -safely be passed to child processes. +Unlike some other popen functions, this implementation will never call a +system shell implicitly. This means that all characters, including shell +metacharacters, can safely be passed to child processes. Obviously, if the +shell is invoked explicitly, then it is the application's responsibility to +ensure that all whitespace and metacharacters are quoted appropriately. Popen Objects @@ -592,15 +706,21 @@ The :mod:`subprocess` module exposes the following constants. Replacing Older Functions with the subprocess Module ---------------------------------------------------- -In this section, "a ==> b" means that b can be used as a replacement for a. +In this section, "a becomes b" means that b can be used as a replacement for a. .. note:: - All functions in this section fail (more or less) silently if the executed - program cannot be found; this module raises an :exc:`OSError` exception. + All "a" functions in this section fail (more or less) silently if the + executed program cannot be found; the "b" replacements raise :exc:`OSError` + instead. -In the following examples, we assume that the subprocess module is imported with -"from subprocess import \*". 
+ In addition, the replacements using :func:`check_output` will fail with a + :exc:`CalledProcessError` if the requested operation produces a non-zero + return code. The output is still available as the ``output`` attribute of + the raised exception. + +In the following examples, we assume that the relevant functions have already +been imported from the subprocess module. Replacing /bin/sh shell backquote @@ -609,8 +729,8 @@ Replacing /bin/sh shell backquote :: output=`mycmd myarg` - ==> - output = Popen(["mycmd", "myarg"], stdout=PIPE).communicate()[0] + # becomes + output = check_output(["mycmd", "myarg"]) Replacing shell pipeline @@ -619,7 +739,7 @@ Replacing shell pipeline :: output=`dmesg | grep hda` - ==> + # becomes p1 = Popen(["dmesg"], stdout=PIPE) p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE) p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits. @@ -628,22 +748,27 @@ Replacing shell pipeline The p1.stdout.close() call after starting the p2 is important in order for p1 to receive a SIGPIPE if p2 exits before p1. +Alternatively, for trusted input, the shell's own pipeline support may still +be used directly: + + output=`dmesg | grep hda` + # becomes + output=check_output("dmesg | grep hda", shell=True) + + Replacing :func:`os.system` ^^^^^^^^^^^^^^^^^^^^^^^^^^^ :: sts = os.system("mycmd" + " myarg") - ==> - p = Popen("mycmd" + " myarg", shell=True) - sts = os.waitpid(p.pid, 0)[1] + # becomes + sts = call("mycmd" + " myarg", shell=True) Notes: * Calling the program through the shell is usually not required. -* It's easier to look at the :attr:`returncode` attribute than the exit status. - A more realistic example would look like this:: try: @@ -768,6 +893,7 @@ Replacing functions from the :mod:`popen2` module ``close_fds=True`` with :class:`Popen` to guarantee this behavior on all platforms or past Python versions. + Notes ----- @@ -801,3 +927,43 @@ runtime): described in rule 3. 
+Legacy Shell Invocation Functions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This module also provides the following legacy functions from the 2.x +``commands`` module. These operations implicitly invoke the system shell and +none of the guarantees described above regarding security and exception +handling consistency are valid for these functions. + +.. function:: getstatusoutput(cmd) + + Return ``(status, output)`` of executing *cmd* in a shell. + + Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple + ``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the + returned output will contain output or error messages. A trailing newline is + stripped from the output. The exit status for the command can be interpreted + according to the rules for the C function :c:func:`wait`. Example:: + + >>> subprocess.getstatusoutput('ls /bin/ls') + (0, '/bin/ls') + >>> subprocess.getstatusoutput('cat /bin/junk') + (256, 'cat: /bin/junk: No such file or directory') + >>> subprocess.getstatusoutput('/bin/junk') + (256, 'sh: /bin/junk: not found') + + Availability: UNIX. + + +.. function:: getoutput(cmd) + + Return output (stdout and stderr) of executing *cmd* in a shell. + + Like :func:`getstatusoutput`, except the exit status is ignored and the return + value is a string containing the command's output. Example:: + + >>> subprocess.getoutput('ls /bin/ls') + '/bin/ls' + + Availability: UNIX. + diff --git a/Misc/NEWS b/Misc/NEWS index 2923982..0d5f981 100644 --- a/Misc/NEWS +++ b/Misc/NEWS @@ -259,6 +259,9 @@ Extension Modules Documentation ------------- +- Issue #13237: Reorganise subprocess documentation to emphasise convenience + functions and the most commonly needed arguments to Popen. + - Issue #13141: Demonstrate recommended style for socketserver examples. 
-- cgit v0.12 From 69ce0d86029041ffd04438a321248cfa8158b9a8 Mon Sep 17 00:00:00 2001 From: Nick Coghlan Date: Tue, 8 Nov 2011 21:39:07 +1000 Subject: Issue #13237: Remove duplicate data value descriptions from the subprocess docs --- Doc/library/subprocess.rst | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst index afa7adc..311dfd2 100644 --- a/Doc/library/subprocess.rst +++ b/Doc/library/subprocess.rst @@ -437,20 +437,6 @@ functions. Added context manager support. -.. data:: PIPE - - Special value that can be used as the *stdin*, *stdout* or *stderr* argument - to :class:`Popen` and indicates that a pipe to the standard stream should be - opened. - - -.. data:: STDOUT - - Special value that can be used as the *stderr* argument to :class:`Popen` and - indicates that standard error should go into the same handle as standard - output. - - Exceptions ^^^^^^^^^^ -- cgit v0.12 From 32e4a58c0656d1f3a6415cd099bf1e531bc37c01 Mon Sep 17 00:00:00 2001 From: Nick Coghlan Date: Tue, 8 Nov 2011 21:50:58 +1000 Subject: Issue #13237: Fix formatting error - the legacy shell commands weren't meant to be under the Notes heading --- Doc/library/subprocess.rst | 67 +++++++++++++++++++++++----------------------- 1 file changed, 33 insertions(+), 34 deletions(-) diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst index 311dfd2..19a29cb 100644 --- a/Doc/library/subprocess.rst +++ b/Doc/library/subprocess.rst @@ -880,41 +880,8 @@ Replacing functions from the :mod:`popen2` module all platforms or past Python versions. -Notes ------ - -.. _converting-argument-sequence: - -Converting an argument sequence to a string on Windows -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -On Windows, an *args* sequence is converted to a string that can be parsed -using the following rules (which correspond to the rules used by the MS C -runtime): - -1. 
Arguments are delimited by white space, which is either a - space or a tab. - -2. A string surrounded by double quotation marks is - interpreted as a single argument, regardless of white space - contained within. A quoted string can be embedded in an - argument. - -3. A double quotation mark preceded by a backslash is - interpreted as a literal double quotation mark. - -4. Backslashes are interpreted literally, unless they - immediately precede a double quotation mark. - -5. If backslashes immediately precede a double quotation mark, - every pair of backslashes is interpreted as a literal - backslash. If the number of backslashes is odd, the last - backslash escapes the next double quotation mark as - described in rule 3. - - Legacy Shell Invocation Functions -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +--------------------------------- This module also provides the following legacy functions from the 2.x ``commands`` module. These operations implicitly invoke the system shell and @@ -953,3 +920,35 @@ handling consistency are valid for these functions. Availability: UNIX. + +Notes +----- + +.. _converting-argument-sequence: + +Converting an argument sequence to a string on Windows +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +On Windows, an *args* sequence is converted to a string that can be parsed +using the following rules (which correspond to the rules used by the MS C +runtime): + +1. Arguments are delimited by white space, which is either a + space or a tab. + +2. A string surrounded by double quotation marks is + interpreted as a single argument, regardless of white space + contained within. A quoted string can be embedded in an + argument. + +3. A double quotation mark preceded by a backslash is + interpreted as a literal double quotation mark. + +4. Backslashes are interpreted literally, unless they + immediately precede a double quotation mark. + +5. 
If backslashes immediately precede a double quotation mark, + every pair of backslashes is interpreted as a literal + backslash. If the number of backslashes is odd, the last + backslash escapes the next double quotation mark as + described in rule 3. -- cgit v0.12
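
Reviewer note: the convenience functions documented in the first patch of this series can be exercised with a short script. The following is a minimal sketch, not part of the patch; it assumes a Python 3 interpreter and uses ``sys.executable`` as the child program (an illustrative choice, so the example does not depend on ``ls`` or ``echo`` being available). The names ``PY``, ``status``, ``failed_code``, ``raw`` and ``text`` are hypothetical.

```python
import subprocess
import sys

# Assumption: the running interpreter is used as the child program so the
# sketch works on any platform where Python itself runs.
PY = sys.executable

# call() waits for the child to finish and returns its exit status.
status = subprocess.call([PY, "-c", "import sys; sys.exit(3)"])

# check_call() raises CalledProcessError when the exit status is non-zero;
# the status is available as the exception's returncode attribute.
try:
    subprocess.check_call([PY, "-c", "import sys; sys.exit(1)"])
    failed_code = None
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode

# check_output() returns the child's standard output as bytes by default.
raw = subprocess.check_output([PY, "-c", "print('hello')"])

# With universal_newlines=True the output is decoded to text and line
# endings are normalised to '\n'.
text = subprocess.check_output(
    [PY, "-c", "print('hello')"], universal_newlines=True
)
```

Passing *args* as a sequence, as done here, avoids ``shell=True`` and the quoting concerns described in the warning text of the patch.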