author: Victor Stinner <vstinner@python.org> 2020-11-01 22:07:23 (GMT)
committer: GitHub <noreply@github.com> 2020-11-01 22:07:23 (GMT)
commit: e662c398d87f136497f8ec672e83657ae3a599e0
tree: cc9383c30557769a096be580b7f8f1b936565ea9 /Doc/library/sys.rst
parent: 82458b6cdbae3b849dc11d0d7dc2ab06ef0451c4
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)

If the nl_langinfo(CODESET) function returns an empty string, Python now
uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14c5487576c95d0311e357b707600b47), I modified
Python to log a warning and use UTF-8 as the filesystem encoding (instead of
None) if nl_langinfo(CODESET) returns an empty string.

In August 2020 (commit 94908bbc1503df830d1d615e7b57744ae1b41079), I modified
Python startup to fail with a fatal error and a specific error message if
nl_langinfo(CODESET) returns an empty string. The intent was to prevent
guessing the encoding and also to investigate user configurations where this
case happens.

In 10 years (2010 to 2020), I saw zero user reports about the error message
related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 has become the de facto standard and it is safe to assume that
the user expects UTF-8. For example, nl_langinfo(CODESET) can return an empty
string on macOS if the LC_CTYPE locale is not supported, and UTF-8 is the
default encoding on macOS.

While this change is unlikely to affect anyone in practice, it should make
UTF-8 lovers happy ;-)

Also rewrite the documentation explaining how Python selects the filesystem
encoding and error handler.

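
As a rough illustration of the behavior described above, the following Python sketch shows the fallback rule and how to inspect what the interpreter selected. The helper `fallback_encoding` is hypothetical and only mirrors the rule stated in this commit message; the real logic lives in CPython's C startup code and is not reproduced here.

```python
import os
import sys


def fallback_encoding(codeset: str) -> str:
    """Hypothetical helper: mirror the fallback rule from this commit.

    `codeset` stands in for the string returned by nl_langinfo(CODESET).
    """
    # An empty result now means "use UTF-8" instead of a fatal error.
    return codeset if codeset else "utf-8"


# Values the running interpreter actually selected at startup.
fs_encoding = sys.getfilesystemencoding()
fs_errors = sys.getfilesystemencodeerrors()

# os.fsencode() and os.fsdecode() apply that encoding and error handler.
raw = os.fsencode("example.txt")
name = os.fsdecode(raw)

print(fs_encoding, fs_errors, raw, name)
```

Because the filesystem encoding is always ASCII-compatible, an ASCII-only file name such as `example.txt` should round-trip unchanged whatever encoding was chosen.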
Diffstat (limited to 'Doc/library/sys.rst')
-rw-r--r-- Doc/library/sys.rst 31
1 files changed, 14 insertions, 17 deletions
diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst
index 468a30d..2f0840e 100644
--- a/Doc/library/sys.rst
+++ b/Doc/library/sys.rst
@@ -616,29 +616,20 @@ always available.
.. function:: getfilesystemencoding()
Return the name of the encoding used to convert between Unicode
- filenames and bytes filenames. For best compatibility, str should be
- used for filenames in all cases, although representing filenames as bytes
- is also supported. Functions accepting or returning filenames should support
- either str or bytes and internally convert to the system's preferred
- representation.
+ filenames and bytes filenames.
+
+ For best compatibility, str should be used for filenames in all cases,
+ although representing filenames as bytes is also supported. Functions
+ accepting or returning filenames should support either str or bytes and
+ internally convert to the system's preferred representation.
This encoding is always ASCII-compatible.
:func:`os.fsencode` and :func:`os.fsdecode` should be used to ensure that
the correct encoding and errors mode are used.
- * In the UTF-8 mode, the encoding is ``utf-8`` on any platform.
-
- * On macOS, the encoding is ``'utf-8'``.
-
- * On Unix, the encoding is the locale encoding.
-
- * On Windows, the encoding may be ``'utf-8'`` or ``'mbcs'``, depending
- on user configuration.
-
- * On Android, the encoding is ``'utf-8'``.
-
- * On VxWorks, the encoding is ``'utf-8'``.
+ The filesystem encoding is initialized from
+ :c:member:`PyConfig.filesystem_encoding`.
.. versionchanged:: 3.2
:func:`getfilesystemencoding` result cannot be ``None`` anymore.
@@ -660,6 +651,9 @@ always available.
:func:`os.fsencode` and :func:`os.fsdecode` should be used to ensure that
the correct encoding and errors mode are used.
+ The filesystem error handler is initialized from
+ :c:member:`PyConfig.filesystem_errors`.
+
.. versionadded:: 3.6
.. function:: getrefcount(object)
@@ -1457,6 +1451,9 @@ always available.
This is equivalent to defining the :envvar:`PYTHONLEGACYWINDOWSFSENCODING`
environment variable before launching Python.
+ See also :func:`sys.getfilesystemencoding` and
+ :func:`sys.getfilesystemencodeerrors`.
+
.. availability:: Windows.
.. versionadded:: 3.6