author Inada Naoki <songofacandy@gmail.com> 2022-05-02 08:25:05 (GMT)
committer GitHub <noreply@github.com> 2022-05-02 08:25:05 (GMT)
commit 614420df9796c8a4f01e24052fc0128b4c20c5bf (patch)
tree d794593cb9d1291846c69d86ae3420e0d3824b7e /Doc/tutorial
parent d414f7ece8169097a32cd228bb32da0418833db4 (diff)
gh-85679: Recommend `encoding="utf-8"` in tutorial (GH-91778)
Diffstat (limited to 'Doc/tutorial')
-rw-r--r-- Doc/tutorial/inputoutput.rst | 28
1 files changed, 18 insertions, 10 deletions
diff --git a/Doc/tutorial/inputoutput.rst b/Doc/tutorial/inputoutput.rst
index 7f83c4d..b500636 100644
--- a/Doc/tutorial/inputoutput.rst
+++ b/Doc/tutorial/inputoutput.rst
@@ -279,11 +279,12 @@ Reading and Writing Files
object: file
:func:`open` returns a :term:`file object`, and is most commonly used with
-two arguments: ``open(filename, mode)``.
+two positional arguments and one keyword argument:
+``open(filename, mode, encoding=None)``
::
- >>> f = open('workfile', 'w')
+ >>> f = open('workfile', 'w', encoding="utf-8")
.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>
@@ -300,11 +301,14 @@ writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
Normally, files are opened in :dfn:`text mode`, that means, you read and write
-strings from and to the file, which are encoded in a specific encoding. If
-encoding is not specified, the default is platform dependent (see
-:func:`open`). ``'b'`` appended to the mode opens the file in
-:dfn:`binary mode`: now the data is read and written in the form of bytes
-objects. This mode should be used for all files that don't contain text.
+strings from and to the file, which are encoded in a specific *encoding*.
+If *encoding* is not specified, the default is platform dependent
+(see :func:`open`).
+Because UTF-8 is the modern de-facto standard, ``encoding="utf-8"`` is
+recommended unless you know that you need to use a different encoding.
+Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
+Binary mode data is read and written as :class:`bytes` objects.
+You cannot specify *encoding* when opening a file in binary mode.
In text mode, the default when reading is to convert platform-specific line
endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``. When writing in
@@ -320,7 +324,7 @@ after its suite finishes, even if an exception is raised at some
point. Using :keyword:`!with` is also much shorter than writing
equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::
- >>> with open('workfile') as f:
+ >>> with open('workfile', encoding="utf-8") as f:
... read_data = f.read()
>>> # We can check that the file has been automatically closed.
@@ -490,11 +494,15 @@ simply serializes the object to a :term:`text file`. So if ``f`` is a
json.dump(x, f)
-To decode the object again, if ``f`` is a :term:`text file` object which has
-been opened for reading::
+To decode the object again, if ``f`` is a :term:`binary file` or
+:term:`text file` object which has been opened for reading::
x = json.load(f)
+.. note::
+ JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
+ a JSON file as a :term:`text file` for both reading and writing.
+
This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.
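
The behaviour this patch documents can be checked with a short script. The following is only a minimal sketch, not part of the commit: the helper names (``roundtrip_text``, ``binary_rejects_encoding``, ``json_roundtrip``) are hypothetical and chosen for illustration, and the caller is assumed to supply a writable path. It shows a text-mode round trip with an explicit ``encoding="utf-8"``, the ``ValueError`` that :func:`open` raises when *encoding* is combined with binary mode, and :func:`json.load` reading the same UTF-8 file through both a text file and a binary file object.

```python
import json


def roundtrip_text(path, text):
    """Write text as UTF-8, then read it back with the same explicit encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        return f.read()


def binary_rejects_encoding(path):
    """Return True if open() refuses an encoding argument in binary mode."""
    try:
        f = open(path, "rb", encoding="utf-8")
    except ValueError:
        # open() rejects the argument combination up front.
        return True
    f.close()
    return False


def json_roundtrip(path, obj):
    """Dump obj as UTF-8 JSON, then load it via a text file and a binary file."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII characters as real UTF-8 text
        # instead of \uXXXX escapes, so the file encoding actually matters.
        json.dump(obj, f, ensure_ascii=False)
    with open(path, encoding="utf-8") as f:
        from_text = json.load(f)
    with open(path, "rb") as f:
        from_binary = json.load(f)
    return from_text, from_binary
```

Calling ``json_roundtrip("data.json", {"name": "café"})`` should return the same dictionary twice. Passing the encoding explicitly on both the write and the read side is what keeps the result independent of the platform default.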