From 7a03f64c2e6e417fda46c725867a5edad929a66e Mon Sep 17 00:00:00 2001
From: Ezio Melotti
Date: Tue, 25 Oct 2011 10:30:19 +0300
Subject: Remove mention of narrow/wide builds in the codecs doc.

---
 Doc/library/codecs.rst | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 2a7abf9..fe09e05 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -787,11 +787,9 @@ methods and attributes from the underlying stream.
 Encodings and Unicode
 ---------------------
 
-Strings are stored internally as sequences of codepoints (to be precise
-as :c:type:`Py_UNICODE` arrays). Depending on the way Python is compiled (either
-via ``--without-wide-unicode`` or ``--with-wide-unicode``, with the
-former being the default) :c:type:`Py_UNICODE` is either a 16-bit or 32-bit data
-type. Once a string object is used outside of CPU and memory, CPU endianness
+Strings are stored internally as sequences of codepoints in range ``0 - 10FFFF``
+(see :pep:`393` for more details about the implementation).
+Once a string object is used outside of CPU and memory, CPU endianness
 and how these arrays are stored as bytes become an issue. Transforming a
 string object into a sequence of bytes is called encoding and recreating the
 string object from the sequence of bytes is known as decoding. There are many
-- 
cgit v0.12
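
Note on the patched paragraph: the encode/decode round trip it describes, and the reason byte order becomes an issue, can be sketched in a few lines of Python 3. This is only an illustration and not part of the patch. The sample string and the variable names are assumptions chosen for the example.

```python
import sys

# Illustrative sample: one ASCII character plus one code point above U+FFFF.
text = "a\U0001F600"

# A str is a sequence of code points, each in the range 0..0x10FFFF,
# so the string has length 2 regardless of how it is stored internally.
assert len(text) == 2
assert all(0 <= ord(ch) <= sys.maxunicode for ch in text)

# Encoding turns the string into bytes. For UTF-16 the byte order must be
# chosen explicitly (or signalled with a BOM), which is where endianness
# becomes visible.
little = text.encode("utf-16-le")
big = text.encode("utf-16-be")

# Decoding with the matching codec recreates the original string.
roundtrip_le = little.decode("utf-16-le")
roundtrip_be = big.decode("utf-16-be")

print(little.hex())
print(big.hex())
```

The two byte sequences hold the same code points with each 16-bit unit byte-swapped, so decoding only succeeds as intended when the codec matches the byte order used for encoding.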