summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--Doc/c-api/unicode.rst12
-rw-r--r--Include/unicodeobject.h24
2 files changed, 15 insertions, 21 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 0c9ea8f..1ed8140 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -556,14 +556,13 @@ APIs:
.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \
const char *encoding, const char *errors)
- Coerce an encoded object *obj* to a Unicode object and return a reference with
- incremented refcount.
+ Decode an encoded object *obj* to a Unicode object.
:class:`bytes`, :class:`bytearray` and other
:term:`bytes-like objects <bytes-like object>`
are decoded according to the given *encoding* and using the error handling
defined by *errors*. Both can be *NULL* to have the interface use the default
- values (see the next section for details).
+ values (see :ref:`builtincodecs` for details).
All other objects, including Unicode objects, cause a :exc:`TypeError` to be
set.
@@ -745,8 +744,11 @@ Extension modules can continue using them, as they will not be removed in Python
.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj)
- Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used
- throughout the interpreter whenever coercion to Unicode is needed.
+ Copy an instance of a Unicode subtype to a new true Unicode object if
+ necessary. If *obj* is already a true Unicode object (not a subtype),
+ return the reference with incremented refcount.
+
+ Objects other than Unicode or its subtypes will cause a :exc:`TypeError`.
Locale Encoding
diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h
index 4e8e3ec..ea4591d 100644
--- a/Include/unicodeobject.h
+++ b/Include/unicodeobject.h
@@ -844,17 +844,13 @@ PyAPI_FUNC(int) PyUnicode_Resize(
Py_ssize_t length /* New length */
);
-/* Coerce obj to a Unicode object and return a reference with
- *incremented* refcount.
+/* Decode obj to an Unicode object.
- Coercion is done in the following way:
+ bytes, bytearray and other bytes-like objects are decoded according to the
+ given encoding and error handler. The encoding and error handler can be
+ NULL to have the interface use UTF-8 and "strict".
- 1. bytes, bytearray and other bytes-like objects are decoded
- under the assumptions that they contain data using the UTF-8
- encoding. Decoding is done in "strict" mode.
-
- 2. All other objects (including Unicode objects) raise an
- exception.
+ All other objects (including Unicode objects) raise an exception.
The API returns NULL in case of an error. The caller is responsible
for decref'ing the returned objects.
@@ -867,13 +863,9 @@ PyAPI_FUNC(PyObject*) PyUnicode_FromEncodedObject(
const char *errors /* error handling */
);
-/* Coerce obj to a Unicode object and return a reference with
- *incremented* refcount.
-
- Unicode objects are passed back as-is (subclasses are converted to
- true Unicode objects), all other objects are delegated to
- PyUnicode_FromEncodedObject(obj, NULL, "strict") which results in
- using UTF-8 encoding as basis for decoding the object.
+/* Copy an instance of a Unicode subtype to a new true Unicode object if
+ necessary. If obj is already a true Unicode object (not a subtype), return
+ the reference with *incremented* refcount.
The API returns NULL in case of an error. The caller is responsible
for decref'ing the returned objects.