#6930: clarify description about byteorder handling in UTF decoder routines.

author: Georg Brandl <georg@python.org> 2009-09-18 21:35:59 (GMT)
committer: Georg Brandl <georg@python.org> 2009-09-18 21:35:59 (GMT)
commit: 579a358e61292774a9ab57fe7e92441777e48be4 (patch)
tree: 86df4848025ab7d5f4e0e70f4b31851ecba918dc
parent: 54967d994ab79795a64bda099c0288976966ff86 (diff)
1 files changed, 17 insertions, 12 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 1249ed7..4ab1c21 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -414,10 +414,13 @@ These are the UTF-32 codec APIs:
       *byteorder == 0:  native order
       *byteorder == 1:  big endian
 
-   and then switches if the first four bytes of the input data are a byte order mark
-   (BOM) and the specified byte order is native order.  This BOM is not copied into
-   the resulting Unicode string.  After completion, *\*byteorder* is set to the
-   current byte order at the end of input data.
+   If ``*byteorder`` is zero, and the first four bytes of the input data are a
+   byte order mark (BOM), the decoder switches to this byte order and the BOM is
+   not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
+   ``1``, any byte order mark is copied to the output.
+
+   After completion, *\*byteorder* is set to the current byte order at the end
+   of input data.
 
    In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
 
@@ -442,8 +445,7 @@ These are the UTF-32 codec APIs:
 .. cfunction:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
 
    Return a Python bytes object holding the UTF-32 encoded value of the Unicode
-   data in *s*.  If *byteorder* is not ``0``, output is written according to the
-   following byte order::
+   data in *s*.  Output is written according to the following byte order::
 
       byteorder == -1: little endian
       byteorder == 0:  native byte order (writes a BOM mark)
@@ -487,10 +489,14 @@ These are the UTF-16 codec APIs:
       *byteorder == 0:  native order
       *byteorder == 1:  big endian
 
-   and then switches if the first two bytes of the input data are a byte order mark
-   (BOM) and the specified byte order is native order.  This BOM is not copied into
-   the resulting Unicode string.  After completion, *\*byteorder* is set to the
-   current byte order at the.
+   If ``*byteorder`` is zero, and the first two bytes of the input data are a
+   byte order mark (BOM), the decoder switches to this byte order and the BOM is
+   not copied into the resulting Unicode string.  If ``*byteorder`` is ``-1`` or
+   ``1``, any byte order mark is copied to the output (where it will result in
+   either a ``\ufeff`` or a ``\ufffe`` character).
+
+   After completion, *\*byteorder* is set to the current byte order at the end
+   of input data.
 
    If *byteorder* is *NULL*, the codec starts in native order mode.
 
@@ -520,8 +526,7 @@ These are the UTF-16 codec APIs:
 .. cfunction:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
 
    Return a Python string object holding the UTF-16 encoded value of the Unicode
-   data in *s*.  If *byteorder* is not ``0``, output is written according to the
-   following byte order::
+   data in *s*.  Output is written according to the following byte order::
 
       byteorder == -1: little endian
       byteorder == 0:  native byte order (writes a BOM mark)
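
The byte order rules described in this patch can be observed from Python code. The sketch below is not part of the commit. It assumes that the built-in ``utf-16`` codec corresponds to ``*byteorder == 0`` and that ``utf-16-le`` / ``utf-16-be`` correspond to ``-1`` / ``1``, and it uses Python 3 string semantics:

```python
import codecs

# Input that starts with a little-endian UTF-16 BOM followed by "A".
data = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")

# No byte order forced (assumed to match *byteorder == 0): the BOM selects
# the byte order and is not copied into the result.
auto = data.decode("utf-16")

# Little endian forced (assumed to match *byteorder == -1): the BOM is
# decoded like any other character and shows up as U+FEFF.
forced_le = data.decode("utf-16-le")

# Big endian forced (assumed to match *byteorder == 1): the same two BOM
# bytes are read in the opposite order and show up as U+FFFE.
forced_be = data[:2].decode("utf-16-be")

# Encoding without a forced byte order writes a BOM in native order;
# encoding with a forced byte order writes no BOM.
encoded_native = "A".encode("utf-16")
encoded_le = "A".encode("utf-16-le")

print(repr(auto), repr(forced_le), repr(forced_be))
print(encoded_native.startswith(codecs.BOM_UTF16), encoded_le)
```

The UTF-32 codecs should behave analogously with a four-byte BOM, which is the case the first hunk documents.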