author | Greg Price <gnprice@gmail.com> | 2019-08-14 11:05:19 (GMT) |
---|---|---|
committer | Victor Stinner <vstinner@redhat.com> | 2019-08-14 11:05:19 (GMT) |
commit | 6bccbe7dfb998af862a183f2c36f0d4603af2c29 (patch) | |
tree | 888ea0d9773dd03c3e2a12f918548df151724186 /Doc/library | |
parent | 077af8c2c93dd71086e2c5e5ff1e634b6da8f214 (diff) | |
bpo-36502: Correct documentation of str.isspace() (GH-15019)

The documented definition was much broader than the real one:
there are tons of characters with general category "Other",
and we don't (and shouldn't) treat most of them as whitespace.

Rewrite the definition to agree with the comment on
_PyUnicode_IsWhitespace, and with the logic in makeunicodedata.py,
which is what generates that function and so ultimately governs.

Add suitable breadcrumbs so that a reader who wants to pin down
exactly what this definition means (what's a "bidirectional class"
of "B"?) can do so. The `unicodedata` module documentation is an
appropriate central place for our references to Unicode's own copious
documentation, so point there.

Also add to the isspace() test a thorough check that the
implementation agrees with the intended definition.
Diffstat (limited to 'Doc/library')
-rw-r--r-- | Doc/library/stdtypes.rst | 10 |
1 files changed, 7 insertions, 3 deletions
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index 9dd557f..08c5ae8 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1763,9 +1763,13 @@ expression support in the :mod:`re` module).
 .. method:: str.isspace()
 
    Return true if there are only whitespace characters in the string and there is
-   at least one character, false otherwise. Whitespace characters are those
-   characters defined in the Unicode character database as "Other" or "Separator"
-   and those with bidirectional property being one of "WS", "B", or "S".
+   at least one character, false otherwise.
+
+   A character is *whitespace* if in the Unicode character database
+   (see :mod:`unicodedata`), either its general category is ``Zs``
+   ("Separator, space"), or its bidirectional class is one of ``WS``,
+   ``B``, or ``S``.
+
 
 .. method:: str.istitle()
 
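The corrected definition can be checked directly with the `unicodedata` module. The following is a minimal sketch, not the test added by this commit: the helper name `is_whitespace_by_definition` and the sample characters are illustrative assumptions. It restates the documented rule for a single character and prints it next to the result of `str.isspace()`.

```python
import unicodedata


def is_whitespace_by_definition(ch: str) -> bool:
    """Hypothetical helper: apply the documented rule to one character.

    A character counts as whitespace if its general category is "Zs"
    or its bidirectional class is one of "WS", "B", or "S".
    """
    return (unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in ("WS", "B", "S"))


# Illustrative sample: ASCII whitespace, separator controls, NO-BREAK SPACE,
# LINE SEPARATOR, ZERO WIDTH SPACE (general category Cf), and a letter.
samples = [" ", "\t", "\n", "\x1c", "\x1f", "\u00a0", "\u2028", "\u200b", "a"]

for ch in samples:
    # Show the code point, what str.isspace() reports, and what the rule says.
    print(f"U+{ord(ch):04X}", ch.isspace(), is_whitespace_by_definition(ch))
```

ZERO WIDTH SPACE (U+200B) is a useful case: its general category is `Cf`, which falls under "Other", so the old wording would have classed it as whitespace, while the corrected rule and `str.isspace()` both reject it.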