bpo-29240: Fix locale encodings in UTF-8 Mode (#5170)

Modify locale.localeconv(), time.tzname, os.strerror() and other functions to ignore the UTF-8 Mode: always use the current locale encoding.

Changes:

* Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On a decoding or encoding error, they return the position of the error and an error message, which are used to raise Unicode errors in PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale().
* Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx().
* PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all cases, especially for the strict error handler.
* Add _Py_DecodeUTF8Ex(): it returns more information on a decoding error and supports the strict error handler.
* Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex().
* Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx().
* Ignore the UTF-8 mode to encode/decode localeconv(), strerror() and time zone name.
* PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use the "current" locale.
* Remove _PyUnicode_DecodeCurrentLocale(), _PyUnicode_DecodeCurrentLocaleAndSize() and _PyUnicode_EncodeCurrentLocale().
author: Victor Stinner <victor.stinner@gmail.com> 2018-01-15 09:45:49 (GMT)
committer: GitHub <noreply@github.com> 2018-01-15 09:45:49 (GMT)
commit: 7ed7aead9503102d2ed316175f198104e0cd674c (patch)
tree: 0b70b3b7d2eed5ea92552c1b93953d0333f5a869 /Doc/c-api/sys.rst
parent: ee3b83547c6b0cac1da2cb44aaaea533a1d1bbc8 (diff)
1 files changed, 22 insertions, 0 deletions
diff --git a/Doc/c-api/sys.rst b/Doc/c-api/sys.rst
index 20bc7bd..e4da96c 100644
--- a/Doc/c-api/sys.rst
+++ b/Doc/c-api/sys.rst
@@ -106,6 +106,16 @@ Operating System Utilities
    surrogate character, escape the bytes using the surrogateescape error
    handler instead of decoding them.
 
+   Encoding, highest priority to lowest priority:
+
+   * ``UTF-8`` on macOS and Android;
+   * ``UTF-8`` if the Python UTF-8 mode is enabled;
+   * ``ASCII`` if the ``LC_CTYPE`` locale is ``"C"``,
+     ``nl_langinfo(CODESET)`` returns the ``ASCII`` encoding (or an alias),
+     and :c:func:`mbstowcs` and :c:func:`wcstombs` functions use the
+     ``ISO-8859-1`` encoding.
+   * the current locale encoding.
+
    Return a pointer to a newly allocated wide character string, use
    :c:func:`PyMem_RawFree` to free the memory. If size is not ``NULL``, write
    the number of wide characters excluding the null character into ``*size``
@@ -137,6 +147,18 @@ Operating System Utilities
    :ref:`surrogateescape error handler <surrogateescape>`: surrogate characters
    in the range U+DC80..U+DCFF are converted to bytes 0x80..0xFF.
 
+   Encoding, highest priority to lowest priority:
+
+   * ``UTF-8`` on macOS and Android;
+   * ``UTF-8`` if the Python UTF-8 mode is enabled;
+   * ``ASCII`` if the ``LC_CTYPE`` locale is ``"C"``,
+     ``nl_langinfo(CODESET)`` returns the ``ASCII`` encoding (or an alias),
+     and :c:func:`mbstowcs` and :c:func:`wcstombs` functions use the
+     ``ISO-8859-1`` encoding.
+   * the current locale encoding.
+
+   The function uses the UTF-8 encoding in the Python UTF-8 mode.
+
    Return a pointer to a newly allocated byte string, use :c:func:`PyMem_Free`
    to free the memory. Return ``NULL`` on encoding error or memory allocation
    error
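
The priority order documented in both hunks above can be sketched as a small C function. This is only an illustration of the documented rules, not the CPython implementation: the encoding_env struct, its field names and select_encoding() are hypothetical, and the three conditions of the ASCII case are passed in as precomputed flags rather than detected at run time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the environment; none of these names
   exist in CPython. Each flag mirrors one condition from the list. */
typedef struct {
    bool macos_or_android;        /* platform is macOS or Android */
    bool utf8_mode;               /* Python UTF-8 mode is enabled */
    bool lc_ctype_is_c;           /* LC_CTYPE locale is "C" */
    bool codeset_is_ascii;        /* nl_langinfo(CODESET) reports ASCII or an alias */
    bool mbstowcs_uses_latin1;    /* mbstowcs()/wcstombs() behave as ISO-8859-1 */
    const char *locale_encoding;  /* name of the current locale encoding */
} encoding_env;

/* Return the encoding name chosen by the documented priority order,
   checking the rules from highest to lowest priority. */
const char *
select_encoding(const encoding_env *env)
{
    if (env->macos_or_android) {
        return "UTF-8";
    }
    if (env->utf8_mode) {
        return "UTF-8";
    }
    /* The ASCII case applies only when all three conditions hold. */
    if (env->lc_ctype_is_c && env->codeset_is_ascii
        && env->mbstowcs_uses_latin1) {
        return "ASCII";
    }
    return env->locale_encoding;
}
```

With UTF-8 mode enabled, the sketch returns "UTF-8" even when the three ASCII conditions also hold, because the UTF-8 mode rule is listed first.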