author | Victor Stinner <vstinner@python.org> | 2020-11-01 22:07:23 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-11-01 22:07:23 (GMT) |
commit | e662c398d87f136497f8ec672e83657ae3a599e0 (patch) | |
tree | cc9383c30557769a096be580b7f8f1b936565ea9 /Doc/library/sys.rst | |
parent | 82458b6cdbae3b849dc11d0d7dc2ab06ef0451c4 (diff) | |
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.
In May 2010 (commit b744ba1d14c5487576c95d0311e357b707600b47), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.
In August 2020 (commit 94908bbc1503df830d1d615e7b57744ae1b41079), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.
In 10 years (2010 to 2020), I saw zero user reports about the error
message related to nl_langinfo(CODESET) returning an empty string.
Today, UTF-8 has become the de facto standard and it's safe to assume
that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.
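The fallback described above can be sketched in Python. This is only an illustration: the real logic lives in CPython's C startup code, and the helper name `filesystem_encoding_from_codeset` is hypothetical.

```python
import locale


def filesystem_encoding_from_codeset(codeset: str) -> str:
    """Hypothetical helper: pick an encoding from a CODESET result."""
    # An empty string means the locale encoding could not be determined,
    # so fall back to UTF-8 instead of failing.
    return codeset if codeset else "utf-8"


# nl_langinfo() is only available on Unix-like platforms.
if hasattr(locale, "nl_langinfo"):
    print(filesystem_encoding_from_codeset(locale.nl_langinfo(locale.CODESET)))
```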
While this change is unlikely to affect anyone in practice, it
should make UTF-8 lovers happy ;-)
Also rewrite the documentation explaining how Python selects the
filesystem encoding and error handler.
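As a quick way to see which encoding and error handler were selected, the following minimal sketch prints them and round-trips a filename. It uses only functions named in the documentation touched by this commit; the filename `example.txt` is an arbitrary placeholder.

```python
import os
import sys

# Encoding and error handler chosen by Python at startup.
encoding = sys.getfilesystemencoding()
errors = sys.getfilesystemencodeerrors()
print(encoding, errors)

# os.fsencode() and os.fsdecode() apply that encoding and error handler.
name = "example.txt"
raw = os.fsencode(name)
assert os.fsdecode(raw) == name
```

The printed values depend on the platform and configuration, so no particular output is assumed here.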
Diffstat (limited to 'Doc/library/sys.rst')
-rw-r--r-- | Doc/library/sys.rst | 31 |
1 files changed, 14 insertions, 17 deletions
diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst
index 468a30d..2f0840e 100644
--- a/Doc/library/sys.rst
+++ b/Doc/library/sys.rst
@@ -616,29 +616,20 @@ always available.
 .. function:: getfilesystemencoding()
 
    Return the name of the encoding used to convert between Unicode
-   filenames and bytes filenames. For best compatibility, str should be
-   used for filenames in all cases, although representing filenames as bytes
-   is also supported. Functions accepting or returning filenames should support
-   either str or bytes and internally convert to the system's preferred
-   representation.
+   filenames and bytes filenames.
+
+   For best compatibility, str should be used for filenames in all cases,
+   although representing filenames as bytes is also supported. Functions
+   accepting or returning filenames should support either str or bytes and
+   internally convert to the system's preferred representation.
 
    This encoding is always ASCII-compatible.
 
    :func:`os.fsencode` and :func:`os.fsdecode` should be used to ensure that the
    correct encoding and errors mode are used.
 
-   * In the UTF-8 mode, the encoding is ``utf-8`` on any platform.
-
-   * On macOS, the encoding is ``'utf-8'``.
-
-   * On Unix, the encoding is the locale encoding.
-
-   * On Windows, the encoding may be ``'utf-8'`` or ``'mbcs'``, depending
-     on user configuration.
-
-   * On Android, the encoding is ``'utf-8'``.
-
-   * On VxWorks, the encoding is ``'utf-8'``.
+   The filesystem encoding is initialized from
+   :c:member:`PyConfig.filesystem_encoding`.
 
    .. versionchanged:: 3.2
       :func:`getfilesystemencoding` result cannot be ``None`` anymore.
@@ -660,6 +651,9 @@ always available.
    :func:`os.fsencode` and :func:`os.fsdecode` should be used to ensure that the
    correct encoding and errors mode are used.
 
+   The filesystem error handler is initialized from
+   :c:member:`PyConfig.filesystem_errors`.
+
    .. versionadded:: 3.6
 
 .. function:: getrefcount(object)
@@ -1457,6 +1451,9 @@ always available.
    This is equivalent to defining the :envvar:`PYTHONLEGACYWINDOWSFSENCODING`
    environment variable before launching Python.
 
+   See also :func:`sys.getfilesystemencoding` and
+   :func:`sys.getfilesystemencodeerrors`.
+
    .. availability:: Windows.
 
    .. versionadded:: 3.6