author Nick Coghlan <ncoghlan@gmail.com> 2018-06-09 06:54:08 (GMT)
committer GitHub <noreply@github.com> 2018-06-09 06:54:08 (GMT)
commit 1bcb8a636857e3383d65aaf196f93edb949f2e79 (patch)
tree eab246936f3762798c97009417d676133951c3cb /Doc/whatsnew
parent 4acc140f8d2c905197362d0ffec545a412ab32a7 (diff)
bpo-33409: Clarify PEP 538/540 relationship (GH-7534)
While locale coercion and UTF-8 mode turned out to be complementary ideas
rather than competing ones, it isn't immediately obvious why it's useful to
have both, or how they interact at runtime.

This updates both the Python 3.7 What's New doc and the PYTHONCOERCECLOCALE
and PYTHONUTF8 documentation in an attempt to clarify that relationship:

- in the respective What's New sections, add a closing paragraph explaining
  which problem each one solves, and pointing to the other PEP's section for
  the specific aspects it relies on the other PEP to solve
- use "locale-aware mode" as a more descriptive term for the default
  non-UTF-8 mode
- improve wording consistency between the PYTHONCOERCECLOCALE and PYTHONUTF8
  docs when they cover the same thing (mostly related to legacy locale
  detection and setting the standard stream error handler)
- improve the description of the locale coercion trigger conditions
  (including pointing out that setting LC_ALL turns off locale coercion)
- port the full description of the UTF-8 mode behaviour changes from PEP 540
  into the PYTHONUTF8 documentation
- be explicit that PYTHONIOENCODING still overrides the settings for the
  standard streams
- mention concrete examples of things that do and don't get their text
  encoding assumptions adjusted by the two text encoding assumption override
  techniques
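
The runtime interaction described above can be observed from outside the interpreter. The following is a minimal sketch, not part of this commit: it starts a child interpreter with a chosen set of environment variables and reports whether UTF-8 mode is active and how the standard output stream is configured. The helper name ``probe`` and the keys of the returned dict are illustrative assumptions; ``sys.flags.utf8_mode``, ``PYTHONUTF8``, ``PYTHONCOERCECLOCALE`` and ``PYTHONIOENCODING`` are the Python 3.7+ names the commit refers to.

```python
import os
import subprocess
import sys

# Variables that influence text-encoding defaults; cleared before each probe
# so that only the settings passed in explicitly take effect.
_ENCODING_VARS = (
    "LC_ALL", "LC_CTYPE", "LANG",
    "PYTHONUTF8", "PYTHONCOERCECLOCALE", "PYTHONIOENCODING",
)

# Code run in the child: print three ASCII-only fields separated by "|".
_CHILD_CODE = (
    "import sys;"
    "print(sys.flags.utf8_mode, sys.stdout.encoding, sys.stdout.errors, sep='|')"
)


def probe(extra_env):
    """Run a child interpreter (hypothetical helper) and report its settings."""
    env = {k: v for k, v in os.environ.items() if k not in _ENCODING_VARS}
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    mode, encoding, errors = result.stdout.strip().split("|")
    return {
        "utf8_mode": int(mode),
        "stdout_encoding": encoding,
        "stdout_errors": errors,
    }


if __name__ == "__main__":
    # Explicitly forced UTF-8 mode.
    print(probe({"PYTHONUTF8": "1"}))
    # Legacy C locale via LC_ALL (which turns off locale coercion) with
    # UTF-8 mode explicitly disabled: the locale-aware mode stays in effect.
    print(probe({"LC_ALL": "C", "PYTHONUTF8": "0"}))
    # PYTHONIOENCODING still overrides the standard stream settings.
    print(probe({"PYTHONUTF8": "1", "PYTHONIOENCODING": "latin-1"}))
```

The exact encodings reported in the locale-aware case depend on the platform and the locales installed, so the sketch prints the values rather than assuming them.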
Diffstat (limited to 'Doc/whatsnew')
-rw-r--r-- Doc/whatsnew/3.7.rst 31
1 files changed, 25 insertions, 6 deletions
diff --git a/Doc/whatsnew/3.7.rst b/Doc/whatsnew/3.7.rst
index 8a3afdf..762d84a 100644
--- a/Doc/whatsnew/3.7.rst
+++ b/Doc/whatsnew/3.7.rst
@@ -97,9 +97,10 @@ Significant improvements in the standard library:
CPython implementation improvements:
+* Avoiding the use of ASCII as a default text encoding:
+ * :ref:`PEP 538 <whatsnew37-pep538>`, legacy C locale coercion
+ * :ref:`PEP 540 <whatsnew37-pep540>`, forced UTF-8 runtime mode
* :ref:`PEP 552 <whatsnew37-pep552>`, deterministic .pycs
-* :ref:`PEP 538 <whatsnew37-pep538>`, legacy C locale coercion
-* :ref:`PEP 540 <whatsnew37-pep540>`, forced UTF-8 runtime mode
* :ref:`the new development runtime mode <whatsnew37-devmode>`
* :ref:`PEP 565 <whatsnew37-pep565>`, improved :exc:`DeprecationWarning`
handling
@@ -184,7 +185,8 @@ PEP 538: Legacy C Locale Coercion
An ongoing challenge within the Python 3 series has been determining a sensible
default strategy for handling the "7-bit ASCII" text encoding assumption
-currently implied by the use of the default C locale on non-Windows platforms.
+currently implied by the use of the default C or POSIX locale on non-Windows
+platforms.
:pep:`538` updates the default interpreter command line interface to
automatically coerce that locale to an available UTF-8 based locale as
@@ -205,10 +207,18 @@ continues to be ``backslashreplace``, regardless of locale.
Locale coercion is silent by default, but to assist in debugging potentially
locale related integration problems, explicit warnings (emitted directly on
-:data:`~sys.stderr` can be requested by setting ``PYTHONCOERCECLOCALE=warn``.
+:data:`~sys.stderr`) can be requested by setting ``PYTHONCOERCECLOCALE=warn``.
This setting will also cause the Python runtime to emit a warning if the
legacy C locale remains active when the core interpreter is initialized.
+While :pep:`538`'s locale coercion has the benefit of also affecting extension
+modules (such as GNU ``readline``), as well as child processes (including those
+running non-Python applications and older versions of Python), it has the
+downside of requiring that a suitable target locale be present on the running
+system. To better handle the case where no suitable target locale is available
+(as occurs on RHEL/CentOS 7, for example), Python 3.7 also implements
+:ref:`whatsnew37-pep540`.
+
.. seealso::
:pep:`538` -- Coercing the legacy C locale to a UTF-8 based locale
@@ -231,8 +241,17 @@ The forced UTF-8 mode can be used to change the text handling behavior in
an embedded Python interpreter without changing the locale settings of
an embedding application.
-The UTF-8 mode is enabled by default when the locale is "C". See
-:ref:`whatsnew37-pep538` for details.
+While :pep:`540`'s UTF-8 mode has the benefit of working regardless of which
+locales are available on the running system, it has the downside of having no
+effect on extension modules (such as GNU ``readline``), child processes running
+non-Python applications, and child processes running older versions of Python.
+To reduce the risk of corrupting text data when communicating with such
+components, Python 3.7 also implements :ref:`whatsnew37-pep538`.
+
+The UTF-8 mode is enabled by default when the locale is ``C`` or ``POSIX``, and
+the :pep:`538` locale coercion feature fails to change it to a UTF-8 based
+alternative (whether that failure is due to ``PYTHONCOERCECLOCALE=0`` being set,
+``LC_ALL`` being set, or the lack of a suitable target locale).
.. seealso::