author Evan <evanunderscore@gmail.com> 2019-06-01 19:09:22 (GMT)
committer Vinay Sajip <vinay_sajip@yahoo.co.uk> 2019-06-01 19:09:22 (GMT)
commit 56624a99a916fd27152d5b23364303acc0d707de (patch)
tree 469ecf27c685101302f1c9c365f394df174e68e9 /Doc/library/shlex.rst
parent 2b843ac0ae745026ce39514573c5d075137bef65 (diff)
bpo-28595: Allow shlex whitespace_split with punctuation_chars (GH-2071)
Diffstat (limited to 'Doc/library/shlex.rst')
-rw-r--r-- Doc/library/shlex.rst 35
1 files changed, 23 insertions, 12 deletions
diff --git a/Doc/library/shlex.rst b/Doc/library/shlex.rst
index 8c5b023..a8421fd 100644
--- a/Doc/library/shlex.rst
+++ b/Doc/library/shlex.rst
@@ -225,7 +225,8 @@ variables which either control lexical analysis or can be used for debugging:
appear in filename specifications and command line parameters, will also be
included in this attribute, and any characters which appear in
``punctuation_chars`` will be removed from ``wordchars`` if they are present
- there.
+ there. If :attr:`whitespace_split` is set to ``True``, this will have no
+ effect.
.. attribute:: shlex.whitespace
@@ -258,11 +259,13 @@ variables which either control lexical analysis or can be used for debugging:
If ``True``, tokens will only be split in whitespaces. This is useful, for
example, for parsing command lines with :class:`~shlex.shlex`, getting
- tokens in a similar way to shell arguments. If this attribute is ``True``,
- :attr:`punctuation_chars` will have no effect, and splitting will happen
- only on whitespaces. When using :attr:`punctuation_chars`, which is
- intended to provide parsing closer to that implemented by shells, it is
- advisable to leave ``whitespace_split`` as ``False`` (the default value).
+ tokens in a similar way to shell arguments. When used in combination with
+ :attr:`punctuation_chars`, tokens will be split on whitespace in addition to
+ those characters.
+
+ .. versionchanged:: 3.8
+ The :attr:`punctuation_chars` attribute was made compatible with the
+ :attr:`whitespace_split` attribute.
.. attribute:: shlex.infile
@@ -398,12 +401,15 @@ otherwise. To illustrate, you can see the difference in the following snippet:
>>> import shlex
>>> text = "a && b; c && d || e; f >'abc'; (def \"ghi\")"
- >>> list(shlex.shlex(text))
- ['a', '&', '&', 'b', ';', 'c', '&', '&', 'd', '|', '|', 'e', ';', 'f', '>',
- "'abc'", ';', '(', 'def', '"ghi"', ')']
- >>> list(shlex.shlex(text, punctuation_chars=True))
- ['a', '&&', 'b', ';', 'c', '&&', 'd', '||', 'e', ';', 'f', '>', "'abc'",
- ';', '(', 'def', '"ghi"', ')']
+ >>> s = shlex.shlex(text, posix=True)
+ >>> s.whitespace_split = True
+ >>> list(s)
+ ['a', '&&', 'b;', 'c', '&&', 'd', '||', 'e;', 'f', '>abc;', '(def', 'ghi)']
+ >>> s = shlex.shlex(text, posix=True, punctuation_chars=True)
+ >>> s.whitespace_split = True
+ >>> list(s)
+ ['a', '&&', 'b', ';', 'c', '&&', 'd', '||', 'e', ';', 'f', '>', 'abc', ';',
+ '(', 'def', 'ghi', ')']
Of course, tokens will be returned which are not valid for shells, and you'll
need to implement your own error checks on the returned tokens.
@@ -428,6 +434,11 @@ which characters constitute punctuation. For example::
>>> list(s)
['~/a', '&&', 'b-c', '--color=auto', '||', 'd', '*.py?']
+ However, to match the shell as closely as possible, it is recommended to
+ always use ``posix`` and :attr:`~shlex.whitespace_split` when using
+ :attr:`~shlex.punctuation_chars`, which will negate
+ :attr:`~shlex.wordchars` entirely.
+
For best effect, ``punctuation_chars`` should be set in conjunction with
``posix=True``. (Note that ``posix=False`` is the default for
:class:`~shlex.shlex`.)
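
For readers who want to try the behavior this change documents, here is a minimal sketch. It assumes Python 3.8 or later, where `whitespace_split` and `punctuation_chars` can be combined. The helper name `split_like_shell` is hypothetical and not part of `shlex`; the input string and expected tokens are the ones used in the documentation hunk above.

```python
import shlex

# Same sample input as the documentation example in the diff.
text = "a && b; c && d || e; f >'abc'; (def \"ghi\")"


def split_like_shell(source, punctuation_chars=False):
    """Hypothetical helper: tokenize in POSIX mode with whitespace_split on.

    With punctuation_chars=True, runs of shell punctuation such as '&&',
    '||' and ';' come back as separate tokens (Python 3.8+ behavior).
    """
    lexer = shlex.shlex(source, posix=True, punctuation_chars=punctuation_chars)
    lexer.whitespace_split = True
    return list(lexer)


# Whitespace only: punctuation stays glued to neighboring words ('b;').
plain = split_like_shell(text)

# Whitespace plus punctuation: ';', '>', '(' and ')' become their own tokens.
shell_like = split_like_shell(text, punctuation_chars=True)

print(plain)
print(shell_like)
```

The two lists differ only where punctuation touches a word without intervening whitespace, which is the case the 3.8 change addresses.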