summaryrefslogtreecommitdiffstats
path: root/Doc/library/codecs.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/codecs.rst')
-rw-r--r--Doc/library/codecs.rst8
1 files changed, 3 insertions, 5 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 2a7abf9..fe09e05 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -787,11 +787,9 @@ methods and attributes from the underlying stream.
Encodings and Unicode
---------------------
-Strings are stored internally as sequences of codepoints (to be precise
-as :c:type:`Py_UNICODE` arrays). Depending on the way Python is compiled (either
-via ``--without-wide-unicode`` or ``--with-wide-unicode``, with the
-former being the default) :c:type:`Py_UNICODE` is either a 16-bit or 32-bit data
-type. Once a string object is used outside of CPU and memory, CPU endianness
+Strings are stored internally as sequences of codepoints in range ``0 - 10FFFF``
+(see :pep:`393` for more details about the implementation).
+Once a string object is used outside of CPU and memory, CPU endianness
and how these arrays are stored as bytes become an issue. Transforming a
string object into a sequence of bytes is called encoding and recreating the
string object from the sequence of bytes is known as decoding. There are many