author Inada Naoki <songofacandy@gmail.com> 2020-01-28 10:12:31 (GMT)
committer GitHub <noreply@github.com> 2020-01-28 10:12:31 (GMT)
commit 148610d88a2785751ed435a4e60f07a9f1bc50a6
tree 4cfb46bd1f8eacda55cdb092cb65cb1f05453bbd /Doc/using/windows.rst
parent 13c1c3556f2c12d0be2af890fabfbf44280b845c
bpo-39287: Doc: Add UTF-8 mode section in using/windows. (GH-17935)
Co-Authored-By: Kyle Stanley <aeros167@gmail.com>
Diffstat (limited to 'Doc/using/windows.rst')
-rw-r--r-- Doc/using/windows.rst 44
1 files changed, 44 insertions, 0 deletions
diff --git a/Doc/using/windows.rst b/Doc/using/windows.rst
index 4912048..97e9cdf 100644
--- a/Doc/using/windows.rst
+++ b/Doc/using/windows.rst
@@ -602,6 +602,50 @@ existed)::
C:\WINDOWS\system32;C:\WINDOWS;C:\Program Files\Python 3.9
+.. _win-utf8-mode:
+
+UTF-8 mode
+==========
+
+.. versionadded:: 3.7
+
+Windows still uses legacy encodings for the system encoding (the ANSI Code
+Page). Python uses it as the default encoding of text files (as reported by
+:func:`locale.getpreferredencoding`).
+
+This may cause issues because UTF-8 is widely used on the internet
+and most Unix systems, including WSL (Windows Subsystem for Linux).
+
+You can use UTF-8 mode to change the default text encoding to UTF-8.
+You can enable UTF-8 mode via the ``-X utf8`` command line option, or
+the ``PYTHONUTF8=1`` environment variable. See :envvar:`PYTHONUTF8` for
+details on enabling UTF-8 mode, and :ref:`setting-envvars` for how to modify
+environment variables.
+
+When UTF-8 mode is enabled:
+
+* :func:`locale.getpreferredencoding` returns ``'UTF-8'`` instead of
+  the system encoding. This function is used for the default text
+  encoding in many places, including :func:`open`, :class:`Popen`,
+  :meth:`Path.read_text`, etc.
+* :data:`sys.stdin`, :data:`sys.stdout`, and :data:`sys.stderr`
+  all use UTF-8 as their text encoding.
+* You can still use the system encoding via the "mbcs" codec.
+
+Note that adding ``PYTHONUTF8=1`` to the default environment variables
+will affect all Python 3.7+ applications on your system.
+If you have any Python 3.7+ applications which rely on the legacy
+system encoding, it is recommended to set the environment variable
+temporarily or use the ``-X utf8`` command line option.
+
+.. note::
+   Even when UTF-8 mode is disabled, Python uses UTF-8 by default
+   on Windows for:
+
+   * Console I/O including standard I/O (see :pep:`528` for details).
+   * The filesystem encoding (see :pep:`529` for details).
+
+
.. _launcher:
Python Launcher for Windows
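
The effect described in the added section can be checked with a small script. The following is only a sketch and is not part of the commit: the names `PROBE` and `probe_encodings` are hypothetical, and it assumes Python 3.7 or later so that `-X utf8` and `sys.flags.utf8_mode` are available. It starts a fresh interpreter with UTF-8 mode explicitly on or off and reports the values the section says are affected.

```python
import json
import subprocess
import sys

# Hypothetical probe: code run in a child interpreter to report the
# settings that the section says UTF-8 mode changes.
PROBE = (
    "import json, locale, sys; "
    "print(json.dumps({"
    "'utf8_mode': sys.flags.utf8_mode, "
    "'preferred': locale.getpreferredencoding(False), "
    "'stdout': sys.stdout.encoding}))"
)


def probe_encodings(utf8_mode):
    """Run PROBE in a new interpreter with UTF-8 mode forced on or off."""
    # "-X utf8=1" enables UTF-8 mode; "-X utf8=0" explicitly disables it,
    # overriding any PYTHONUTF8 value inherited from the environment.
    flag = "utf8=1" if utf8_mode else "utf8=0"
    result = subprocess.run(
        [sys.executable, "-X", flag, "-c", PROBE],
        capture_output=True,
        encoding="utf-8",  # the probe prints ASCII-only JSON
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for enabled in (False, True):
        print("UTF-8 mode requested:", enabled, probe_encodings(enabled))
```

With UTF-8 mode off, the reported preferred encoding depends on the machine (on Windows it is typically the ANSI code page). The exact spelling of the UTF-8 name may also differ between Python versions, so compare it case-insensitively.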