author | Victor Stinner <vstinner@python.org> | 2020-11-01 22:07:23 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-11-01 22:07:23 (GMT) |
commit | e662c398d87f136497f8ec672e83657ae3a599e0 (patch) | |
tree | cc9383c30557769a096be580b7f8f1b936565ea9 /Doc/includes | |
parent | 82458b6cdbae3b849dc11d0d7dc2ab06ef0451c4 (diff) | |
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.
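The fallback rule can be sketched in Python as follows. This is only an illustration of the decision described above, not the actual change, which lives in Python's startup code. The helper names `pick_filesystem_encoding` and `current_codeset` are hypothetical, and the sketch assumes that a platform without `locale.nl_langinfo` can be treated like an empty result.

```python
import locale


def pick_filesystem_encoding(codeset: str) -> str:
    """Hypothetical helper: return the locale codeset, or UTF-8 if it is empty."""
    # An empty string means nl_langinfo(CODESET) gave no usable answer.
    return codeset if codeset else "utf-8"


def current_codeset() -> str:
    """Hypothetical helper: query the locale codeset where the API exists."""
    # locale.nl_langinfo() and locale.CODESET are not available on every
    # platform; treating their absence as an empty result is an assumption
    # made for this sketch.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        return locale.nl_langinfo(locale.CODESET)
    return ""


print(pick_filesystem_encoding(current_codeset()))
```

Passing an empty string exercises the fallback branch directly, without needing a misconfigured locale.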
In May 2010 (commit b744ba1d14c5487576c95d0311e357b707600b47), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.
In August 2020 (commit 94908bbc1503df830d1d615e7b57744ae1b41079), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to prevent guessing the encoding and also investigate user
configuration where this case happens.
In 10 years (2010 to 2020), I saw zero user reports about the error
message related to nl_langinfo(CODESET) returning an empty string.
Today, UTF-8 has become the de facto standard and it's safe to make the
assumption that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.
While this change is unlikely to affect anyone in practice, it
should make UTF-8 lovers happy ;-)
Also rewrite the documentation explaining how Python selects the
filesystem encoding and error handler.
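To see which filesystem encoding and error handler a given interpreter ended up with, something like the following may help. It is a minimal sketch using the standard `sys` functions; the wrapper name `describe_filesystem_encoding` is hypothetical.

```python
import sys


def describe_filesystem_encoding() -> str:
    """Hypothetical helper: report the filesystem encoding and error handler."""
    encoding = sys.getfilesystemencoding()
    errors = sys.getfilesystemencodeerrors()
    return f"{encoding}/{errors}"


print(describe_filesystem_encoding())
```

The exact values depend on the platform, the locale, and options such as the UTF-8 mode, so no output is shown here.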
Diffstat (limited to 'Doc/includes')
0 files changed, 0 insertions, 0 deletions