author | Evan <evanunderscore@gmail.com> | 2019-06-01 19:09:22 (GMT) |
---|---|---|
committer | Vinay Sajip <vinay_sajip@yahoo.co.uk> | 2019-06-01 19:09:22 (GMT) |
commit | 56624a99a916fd27152d5b23364303acc0d707de (patch) | |
tree | 469ecf27c685101302f1c9c365f394df174e68e9 /Doc/library/shlex.rst | |
parent | 2b843ac0ae745026ce39514573c5d075137bef65 (diff) | |
bpo-28595: Allow shlex whitespace_split with punctuation_chars (GH-2071)
Diffstat (limited to 'Doc/library/shlex.rst')
-rw-r--r-- | Doc/library/shlex.rst | 35 |
1 files changed, 23 insertions, 12 deletions
diff --git a/Doc/library/shlex.rst b/Doc/library/shlex.rst
index 8c5b023..a8421fd 100644
--- a/Doc/library/shlex.rst
+++ b/Doc/library/shlex.rst
@@ -225,7 +225,8 @@ variables which either control lexical analysis or can be used for debugging:
    appear in filename specifications and command line parameters, will also be
    included in this attribute, and any characters which appear in
    ``punctuation_chars`` will be removed from ``wordchars`` if they are present
-   there.
+   there. If :attr:`whitespace_split` is set to ``True``, this will have no
+   effect.
 
 
 .. attribute:: shlex.whitespace
@@ -258,11 +259,13 @@ variables which either control lexical analysis or can be used for debugging:
 
    If ``True``, tokens will only be split in whitespaces. This is useful, for
    example, for parsing command lines with :class:`~shlex.shlex`, getting
-   tokens in a similar way to shell arguments. If this attribute is ``True``,
-   :attr:`punctuation_chars` will have no effect, and splitting will happen
-   only on whitespaces. When using :attr:`punctuation_chars`, which is
-   intended to provide parsing closer to that implemented by shells, it is
-   advisable to leave ``whitespace_split`` as ``False`` (the default value).
+   tokens in a similar way to shell arguments. When used in combination with
+   :attr:`punctuation_chars`, tokens will be split on whitespace in addition to
+   those characters.
+
+   .. versionchanged:: 3.8
+      The :attr:`punctuation_chars` attribute was made compatible with the
+      :attr:`whitespace_split` attribute.
 
 
 .. attribute:: shlex.infile
@@ -398,12 +401,15 @@ otherwise. To illustrate, you can see the difference in the following snippet:
 
     >>> import shlex
     >>> text = "a && b; c && d || e; f >'abc'; (def \"ghi\")"
-    >>> list(shlex.shlex(text))
-    ['a', '&', '&', 'b', ';', 'c', '&', '&', 'd', '|', '|', 'e', ';', 'f', '>',
-    "'abc'", ';', '(', 'def', '"ghi"', ')']
-    >>> list(shlex.shlex(text, punctuation_chars=True))
-    ['a', '&&', 'b', ';', 'c', '&&', 'd', '||', 'e', ';', 'f', '>', "'abc'",
-    ';', '(', 'def', '"ghi"', ')']
+    >>> s = shlex.shlex(text, posix=True)
+    >>> s.whitespace_split = True
+    >>> list(s)
+    ['a', '&&', 'b;', 'c', '&&', 'd', '||', 'e;', 'f', '>abc;', '(def', 'ghi)']
+    >>> s = shlex.shlex(text, posix=True, punctuation_chars=True)
+    >>> s.whitespace_split = True
+    >>> list(s)
+    ['a', '&&', 'b', ';', 'c', '&&', 'd', '||', 'e', ';', 'f', '>', 'abc', ';',
+    '(', 'def', 'ghi', ')']
 
 Of course, tokens will be returned which are not valid for shells, and you'll
 need to implement your own error checks on the returned tokens.
@@ -428,6 +434,11 @@ which characters constitute punctuation. For example::
     >>> list(s)
     ['~/a', '&&', 'b-c', '--color=auto', '||', 'd', '*.py?']
 
+However, to match the shell as closely as possible, it is recommended to
+always use ``posix`` and :attr:`~shlex.whitespace_split` when using
+:attr:`~shlex.punctuation_chars`, which will negate
+:attr:`~shlex.wordchars` entirely.
+
 For best effect, ``punctuation_chars`` should be set in conjunction with
 ``posix=True``. (Note that ``posix=False`` is the default for
 :class:`~shlex.shlex`.)
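
The behaviour documented by this change can be tried out with a short script. The following is a minimal sketch, assuming Python 3.8 or later (the version named in the `versionchanged` note above). The helper `tokenize` is a hypothetical name introduced here for illustration and is not part of the commit; the input string and the expected token lists are taken from the updated doctest.

```python
import shlex

# Input string taken from the doctest in the patched documentation.
text = "a && b; c && d || e; f >'abc'; (def \"ghi\")"


def tokenize(source, punctuation_chars=False):
    """Hypothetical helper: POSIX-mode lexer with whitespace_split enabled."""
    lexer = shlex.shlex(source, posix=True, punctuation_chars=punctuation_chars)
    lexer.whitespace_split = True
    return list(lexer)


# Without punctuation_chars, tokens are split on whitespace only,
# so ';' and '(' stay glued to the neighbouring words.
plain = tokenize(text)

# With punctuation_chars=True (Python 3.8+), runs of ();<>|& are also
# split out as separate tokens, in addition to the whitespace split.
shell_like = tokenize(text, punctuation_chars=True)

print(plain)
print(shell_like)
```

On interpreters older than 3.8 the second call is expected to behave like the first, since `whitespace_split` previously made `punctuation_chars` have no effect, as the removed documentation text states.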