diff options
-rw-r--r-- | Doc/c-api/unicode.rst | 12 | ||||
-rw-r--r-- | Include/unicodeobject.h | 24 |
2 files changed, 15 insertions, 21 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst index 0c9ea8f..1ed8140 100644 --- a/Doc/c-api/unicode.rst +++ b/Doc/c-api/unicode.rst @@ -556,14 +556,13 @@ APIs: .. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \ const char *encoding, const char *errors) - Coerce an encoded object *obj* to a Unicode object and return a reference with - incremented refcount. + Decode an encoded object *obj* to a Unicode object. :class:`bytes`, :class:`bytearray` and other :term:`bytes-like objects <bytes-like object>` are decoded according to the given *encoding* and using the error handling defined by *errors*. Both can be *NULL* to have the interface use the default - values (see the next section for details). + values (see :ref:`builtincodecs` for details). All other objects, including Unicode objects, cause a :exc:`TypeError` to be set. @@ -745,8 +744,11 @@ Extension modules can continue using them, as they will not be removed in Python .. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj) - Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used - throughout the interpreter whenever coercion to Unicode is needed. + Copy an instance of a Unicode subtype to a new true Unicode object if + necessary. If *obj* is already a true Unicode object (not a subtype), + return the reference with incremented refcount. + + Objects other than Unicode or its subtypes will cause a :exc:`TypeError`. Locale Encoding diff --git a/Include/unicodeobject.h b/Include/unicodeobject.h index 4e8e3ec..ea4591d 100644 --- a/Include/unicodeobject.h +++ b/Include/unicodeobject.h @@ -844,17 +844,13 @@ PyAPI_FUNC(int) PyUnicode_Resize( Py_ssize_t length /* New length */ ); -/* Coerce obj to a Unicode object and return a reference with - *incremented* refcount. +/* Decode obj to an Unicode object. - Coercion is done in the following way: + bytes, bytearray and other bytes-like objects are decoded according to the + given encoding and error handler. The encoding and error handler can be + NULL to have the interface use UTF-8 and "strict". - 1. bytes, bytearray and other bytes-like objects are decoded - under the assumptions that they contain data using the UTF-8 - encoding. Decoding is done in "strict" mode. - - 2. All other objects (including Unicode objects) raise an - exception. + All other objects (including Unicode objects) raise an exception. The API returns NULL in case of an error. The caller is responsible for decref'ing the returned objects. @@ -867,13 +863,9 @@ PyAPI_FUNC(PyObject*) PyUnicode_FromEncodedObject( const char *errors /* error handling */ ); -/* Coerce obj to a Unicode object and return a reference with - *incremented* refcount. - - Unicode objects are passed back as-is (subclasses are converted to - true Unicode objects), all other objects are delegated to - PyUnicode_FromEncodedObject(obj, NULL, "strict") which results in - using UTF-8 encoding as basis for decoding the object. +/* Copy an instance of a Unicode subtype to a new true Unicode object if + necessary. If obj is already a true Unicode object (not a subtype), return + the reference with *incremented* refcount. The API returns NULL in case of an error. The caller is responsible for decref'ing the returned objects. |