author: Georg Brandl <georg@python.org> 2015-01-14 07:26:30 (GMT)
committer: Georg Brandl <georg@python.org> 2015-01-14 07:26:30 (GMT)
commit: 3be472b5f777fe5ebc0c1f4b6c0d96c73352db9c
tree: addfeeb14af6240b6454926ef9a4cce2a51f9207
parent: 1a8ada89f9b3d9b10654adce979046d865906a44
Closes #23181: codepoint -> code point
-rw-r--r-- Doc/c-api/unicode.rst | 2
-rw-r--r-- Doc/library/codecs.rst | 12
-rw-r--r-- Doc/library/email.mime.rst | 2
-rw-r--r-- Doc/library/functions.rst | 2
-rw-r--r-- Doc/library/html.entities.rst | 4
-rw-r--r-- Doc/tutorial/datastructures.rst | 2
-rw-r--r-- Doc/whatsnew/3.3.rst | 12
7 files changed, 18 insertions, 18 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index ed74f45..00063d0 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1141,7 +1141,7 @@ These are the UTF-32 codec APIs:
mark (U+FEFF). In the other two modes, no BOM mark is prepended.
If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
- as a single codepoint.
+ as a single code point.
Return *NULL* if an exception was raised by the codec.
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index b67e653..3510f69 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -841,7 +841,7 @@ methods and attributes from the underlying stream.
Encodings and Unicode
---------------------
-Strings are stored internally as sequences of codepoints in
+Strings are stored internally as sequences of code points in
range ``0x0``-``0x10FFFF``. (See :pep:`393` for
more details about the implementation.)
Once a string object is used outside of CPU and memory, endianness
@@ -852,23 +852,23 @@ There are a variety of different text serialisation codecs, which are
collectivity referred to as :term:`text encodings <text encoding>`.
The simplest text encoding (called ``'latin-1'`` or ``'iso-8859-1'``) maps
-the codepoints 0-255 to the bytes ``0x0``-``0xff``, which means that a string
-object that contains codepoints above ``U+00FF`` can't be encoded with this
+the code points 0-255 to the bytes ``0x0``-``0xff``, which means that a string
+object that contains code points above ``U+00FF`` can't be encoded with this
codec. Doing so will raise a :exc:`UnicodeEncodeError` that looks
like the following (although the details of the error message may differ):
``UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
position 3: ordinal not in range(256)``.
There's another group of encodings (the so called charmap encodings) that choose
-a different subset of all Unicode code points and how these codepoints are
+a different subset of all Unicode code points and how these code points are
mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
Windows). There's a string constant with 256 characters that shows you which
character is mapped to which byte value.
-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
defined in Unicode. A simple and straightforward way that can store each Unicode
-code point, is to store each codepoint as four consecutive bytes. There are two
+code point, is to store each code point as four consecutive bytes. There are two
possibilities: store the bytes in big endian or in little endian order. These
two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you
diff --git a/Doc/library/email.mime.rst b/Doc/library/email.mime.rst
index 950b1c6..67d0a67 100644
--- a/Doc/library/email.mime.rst
+++ b/Doc/library/email.mime.rst
@@ -194,7 +194,7 @@ Here are the classes:
minor type and defaults to :mimetype:`plain`. *_charset* is the character
set of the text and is passed as an argument to the
:class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults
- to ``us-ascii`` if the string contains only ``ascii`` codepoints, and
+ to ``us-ascii`` if the string contains only ``ascii`` code points, and
``utf-8`` otherwise. The *_charset* parameter accepts either a string or a
:class:`~email.charset.Charset` instance.
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index 8a0c336..c6b66b5 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -156,7 +156,7 @@ are always available. They are listed here in alphabetical order.
.. function:: chr(i)
- Return the string representing a character whose Unicode codepoint is the
+ Return the string representing a character whose Unicode code point is the
integer *i*. For example, ``chr(97)`` returns the string ``'a'``, while
``chr(931)`` returns the string ``'Σ'``. This is the inverse of :func:`ord`.
diff --git a/Doc/library/html.entities.rst b/Doc/library/html.entities.rst
index 09b0abc..e10e46e 100644
--- a/Doc/library/html.entities.rst
+++ b/Doc/library/html.entities.rst
@@ -33,12 +33,12 @@ This module defines four dictionaries, :data:`html5`,
.. data:: name2codepoint
- A dictionary that maps HTML entity names to the Unicode codepoints.
+ A dictionary that maps HTML entity names to the Unicode code points.
.. data:: codepoint2name
- A dictionary that maps Unicode codepoints to HTML entity names.
+ A dictionary that maps Unicode code points to HTML entity names.
.. rubric:: Footnotes
diff --git a/Doc/tutorial/datastructures.rst b/Doc/tutorial/datastructures.rst
index 6dc17aa..a2031ed 100644
--- a/Doc/tutorial/datastructures.rst
+++ b/Doc/tutorial/datastructures.rst
@@ -685,7 +685,7 @@ the same type, the lexicographical comparison is carried out recursively. If
all items of two sequences compare equal, the sequences are considered equal.
If one sequence is an initial sub-sequence of the other, the shorter sequence is
the smaller (lesser) one. Lexicographical ordering for strings uses the Unicode
-codepoint number to order individual characters. Some examples of comparisons
+code point number to order individual characters. Some examples of comparisons
between sequences of the same type::
(1, 2, 3) < (1, 2, 4)
diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst
index f8c3ca5..1d4ce72 100644
--- a/Doc/whatsnew/3.3.rst
+++ b/Doc/whatsnew/3.3.rst
@@ -228,7 +228,7 @@ Functionality
Changes introduced by :pep:`393` are the following:
-* Python now always supports the full range of Unicode codepoints, including
+* Python now always supports the full range of Unicode code points, including
non-BMP ones (i.e. from ``U+0000`` to ``U+10FFFF``). The distinction between
narrow and wide builds no longer exists and Python now behaves like a wide
build, even under Windows.
@@ -246,7 +246,7 @@ Changes introduced by :pep:`393` are the following:
so ``'\U0010FFFF'[0]`` now returns ``'\U0010FFFF'`` and not ``'\uDBFF'``;
* all other functions in the standard library now correctly handle
- non-BMP codepoints.
+ non-BMP code points.
* The value of :data:`sys.maxunicode` is now always ``1114111`` (``0x10FFFF``
in hexadecimal). The :c:func:`PyUnicode_GetMax` function still returns
@@ -258,13 +258,13 @@ Changes introduced by :pep:`393` are the following:
Performance and resource usage
------------------------------
-The storage of Unicode strings now depends on the highest codepoint in the string:
+The storage of Unicode strings now depends on the highest code point in the string:
-* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per codepoint;
+* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;
-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per codepoint;
+* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
-* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per codepoint.
+* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
The net effect is that for most applications, memory usage of string
storage should decrease significantly - especially compared to former