summaryrefslogtreecommitdiffstats
path: root/Doc/c-api
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/c-api')
-rw-r--r--Doc/c-api/arg.rst8
-rw-r--r--Doc/c-api/dict.rst7
-rw-r--r--Doc/c-api/exceptions.rst191
-rw-r--r--Doc/c-api/import.rst50
-rw-r--r--Doc/c-api/list.rst7
-rw-r--r--Doc/c-api/long.rst14
-rw-r--r--Doc/c-api/module.rst34
-rw-r--r--Doc/c-api/object.rst13
-rw-r--r--Doc/c-api/unicode.rst763
9 files changed, 842 insertions, 245 deletions
diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst
index d4dda7c..a171ac7 100644
--- a/Doc/c-api/arg.rst
+++ b/Doc/c-api/arg.rst
@@ -260,9 +260,11 @@ Numbers
``n`` (:class:`int`) [Py_ssize_t]
Convert a Python integer to a C :c:type:`Py_ssize_t`.
-``c`` (:class:`bytes` of length 1) [char]
- Convert a Python byte, represented as a :class:`bytes` object of length 1,
- to a C :c:type:`char`.
+``c`` (:class:`bytes` or :class:`bytearray` of length 1) [char]
+ Convert a Python byte, represented as a :class:`bytes` or
+ :class:`bytearray` object of length 1, to a C :c:type:`char`.
+
+ .. versionchanged:: 3.3 Allow :class:`bytearray` objects
``C`` (:class:`str` of length 1) [int]
Convert a Python character, represented as a :class:`str` object of
diff --git a/Doc/c-api/dict.rst b/Doc/c-api/dict.rst
index 6df84e0..ac714a6 100644
--- a/Doc/c-api/dict.rst
+++ b/Doc/c-api/dict.rst
@@ -209,3 +209,10 @@ Dictionary Objects
for key, value in seq2:
if override or key not in a:
a[key] = value
+
+
+.. c:function:: int PyDict_ClearFreeList()
+
+ Clear the free list. Return the total number of freed items.
+
+ .. versionadded:: 3.3
diff --git a/Doc/c-api/exceptions.rst b/Doc/c-api/exceptions.rst
index 6f13c80..c7252ed 100644
--- a/Doc/c-api/exceptions.rst
+++ b/Doc/c-api/exceptions.rst
@@ -525,7 +525,7 @@ recursion depth automatically).
Marks a point where a recursive C-level call is about to be performed.
- If :const:`USE_STACKCHECK` is defined, this function checks if the the OS
+ If :const:`USE_STACKCHECK` is defined, this function checks if the OS
stack overflowed using :c:func:`PyOS_CheckStack`. In this is the case, it
sets a :exc:`MemoryError` and returns a nonzero value.
@@ -582,65 +582,116 @@ All standard Python exceptions are available as global variables whose names are
:c:type:`PyObject\*`; they are all class objects. For completeness, here are all
the variables:
-+-------------------------------------+----------------------------+----------+
-| C Name | Python Name | Notes |
-+=====================================+============================+==========+
-| :c:data:`PyExc_BaseException` | :exc:`BaseException` | \(1) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_Exception` | :exc:`Exception` | \(1) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_LookupError` | :exc:`LookupError` | \(1) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_AssertionError` | :exc:`AssertionError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_AttributeError` | :exc:`AttributeError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_EOFError` | :exc:`EOFError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_EnvironmentError` | :exc:`EnvironmentError` | \(1) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_IOError` | :exc:`IOError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_ImportError` | :exc:`ImportError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_IndexError` | :exc:`IndexError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_KeyError` | :exc:`KeyError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_MemoryError` | :exc:`MemoryError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_NameError` | :exc:`NameError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_OSError` | :exc:`OSError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_OverflowError` | :exc:`OverflowError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_RuntimeError` | :exc:`RuntimeError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_SyntaxError` | :exc:`SyntaxError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_SystemError` | :exc:`SystemError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_SystemExit` | :exc:`SystemExit` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_TypeError` | :exc:`TypeError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_ValueError` | :exc:`ValueError` | |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_WindowsError` | :exc:`WindowsError` | \(3) |
-+-------------------------------------+----------------------------+----------+
-| :c:data:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | |
-+-------------------------------------+----------------------------+----------+
++-----------------------------------------+---------------------------------+----------+
+| C Name | Python Name | Notes |
++=========================================+=================================+==========+
+| :c:data:`PyExc_BaseException` | :exc:`BaseException` | \(1) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_Exception` | :exc:`Exception` | \(1) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_LookupError` | :exc:`LookupError` | \(1) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_AssertionError` | :exc:`AssertionError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_AttributeError` | :exc:`AttributeError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_BlockingIOError` | :exc:`BlockingIOError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_BrokenPipeError` | :exc:`BrokenPipeError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ChildProcessError` | :exc:`ChildProcessError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ConnectionError` | :exc:`ConnectionError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ConnectionAbortedError` | :exc:`ConnectionAbortedError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ConnectionRefusedError` | :exc:`ConnectionRefusedError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ConnectionResetError` | :exc:`ConnectionResetError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_FileExistsError` | :exc:`FileExistsError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_FileNotFoundError` | :exc:`FileNotFoundError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_EOFError` | :exc:`EOFError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ImportError` | :exc:`ImportError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_IndexError` | :exc:`IndexError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_InterruptedError` | :exc:`InterruptedError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_IsADirectoryError` | :exc:`IsADirectoryError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_KeyError` | :exc:`KeyError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_MemoryError` | :exc:`MemoryError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_NameError` | :exc:`NameError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_NotADirectoryError` | :exc:`NotADirectoryError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_OSError` | :exc:`OSError` | \(1) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_OverflowError` | :exc:`OverflowError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_PermissionError` | :exc:`PermissionError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ProcessLookupError` | :exc:`ProcessLookupError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_RuntimeError` | :exc:`RuntimeError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_SyntaxError` | :exc:`SyntaxError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_SystemError` | :exc:`SystemError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_TimeoutError` | :exc:`TimeoutError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_SystemExit` | :exc:`SystemExit` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_TypeError` | :exc:`TypeError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ValueError` | :exc:`ValueError` | |
++-----------------------------------------+---------------------------------+----------+
+| :c:data:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | |
++-----------------------------------------+---------------------------------+----------+
+
+.. versionadded:: 3.3
+ :c:data:`PyExc_BlockingIOError`, :c:data:`PyExc_BrokenPipeError`,
+ :c:data:`PyExc_ChildProcessError`, :c:data:`PyExc_ConnectionError`,
+ :c:data:`PyExc_ConnectionAbortedError`, :c:data:`PyExc_ConnectionRefusedError`,
+ :c:data:`PyExc_ConnectionResetError`, :c:data:`PyExc_FileExistsError`,
+ :c:data:`PyExc_FileNotFoundError`, :c:data:`PyExc_InterruptedError`,
+ :c:data:`PyExc_IsADirectoryError`, :c:data:`PyExc_NotADirectoryError`,
+ :c:data:`PyExc_PermissionError`, :c:data:`PyExc_ProcessLookupError`
+ and :c:data:`PyExc_TimeoutError` were introduced following :pep:`3151`.
+
+
+These are compatibility aliases to :c:data:`PyExc_OSError`:
+
++-------------------------------------+----------+
+| C Name | Notes |
++=====================================+==========+
+| :c:data:`PyExc_EnvironmentError` | |
++-------------------------------------+----------+
+| :c:data:`PyExc_IOError` | |
++-------------------------------------+----------+
+| :c:data:`PyExc_WindowsError` | \(3) |
++-------------------------------------+----------+
+
+.. versionchanged:: 3.3
+ These aliases used to be separate exception types.
+
.. index::
single: PyExc_BaseException
@@ -649,28 +700,42 @@ the variables:
single: PyExc_LookupError
single: PyExc_AssertionError
single: PyExc_AttributeError
+ single: PyExc_BlockingIOError
+ single: PyExc_BrokenPipeError
+ single: PyExc_ConnectionError
+ single: PyExc_ConnectionAbortedError
+ single: PyExc_ConnectionRefusedError
+ single: PyExc_ConnectionResetError
single: PyExc_EOFError
- single: PyExc_EnvironmentError
+ single: PyExc_FileExistsError
+ single: PyExc_FileNotFoundError
single: PyExc_FloatingPointError
- single: PyExc_IOError
single: PyExc_ImportError
single: PyExc_IndexError
+ single: PyExc_InterruptedError
+ single: PyExc_IsADirectoryError
single: PyExc_KeyError
single: PyExc_KeyboardInterrupt
single: PyExc_MemoryError
single: PyExc_NameError
+ single: PyExc_NotADirectoryError
single: PyExc_NotImplementedError
single: PyExc_OSError
single: PyExc_OverflowError
+ single: PyExc_PermissionError
+ single: PyExc_ProcessLookupError
single: PyExc_ReferenceError
single: PyExc_RuntimeError
single: PyExc_SyntaxError
single: PyExc_SystemError
single: PyExc_SystemExit
+ single: PyExc_TimeoutError
single: PyExc_TypeError
single: PyExc_ValueError
- single: PyExc_WindowsError
single: PyExc_ZeroDivisionError
+ single: PyExc_EnvironmentError
+ single: PyExc_IOError
+ single: PyExc_WindowsError
Notes:
diff --git a/Doc/c-api/import.rst b/Doc/c-api/import.rst
index cf48363..b168751 100644
--- a/Doc/c-api/import.rst
+++ b/Doc/c-api/import.rst
@@ -57,7 +57,7 @@ Importing Modules
:c:func:`PyImport_ImportModule`.
-.. c:function:: PyObject* PyImport_ImportModuleLevel(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level)
+.. c:function:: PyObject* PyImport_ImportModuleLevelObject(PyObject *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level)
Import a module. This is best described by referring to the built-in Python
function :func:`__import__`, as the standard :func:`__import__` function calls
@@ -68,6 +68,13 @@ Importing Modules
the return value when a submodule of a package was requested is normally the
top-level package, unless a non-empty *fromlist* was given.
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyImport_ImportModuleLevel(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level)
+
+ Similar to :c:func:`PyImport_ImportModuleLevelObject`, but the name is an
+ UTF-8 encoded string instead of a Unicode object.
.. c:function:: PyObject* PyImport_Import(PyObject *name)
@@ -86,7 +93,7 @@ Importing Modules
an exception set on failure (the module still exists in this case).
-.. c:function:: PyObject* PyImport_AddModule(const char *name)
+.. c:function:: PyObject* PyImport_AddModuleObject(PyObject *name)
Return the module object corresponding to a module name. The *name* argument
may be of the form ``package.module``. First check the modules dictionary if
@@ -100,6 +107,14 @@ Importing Modules
or one of its variants to import a module. Package structures implied by a
dotted name for *name* are not created if not already present.
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyImport_AddModule(const char *name)
+
+ Similar to :c:func:`PyImport_AddModuleObject`, but the name is a UTF-8
+ encoded string instead of a Unicode object.
+
.. c:function:: PyObject* PyImport_ExecCodeModule(char *name, PyObject *co)
@@ -136,14 +151,23 @@ Importing Modules
See also :c:func:`PyImport_ExecCodeModuleWithPathnames`.
-.. c:function:: PyObject* PyImport_ExecCodeModuleWithPathnames(char *name, PyObject *co, char *pathname, char *cpathname)
+.. c:function:: PyObject* PyImport_ExecCodeModuleObject(PyObject *name, PyObject *co, PyObject *pathname, PyObject *cpathname)
Like :c:func:`PyImport_ExecCodeModuleEx`, but the :attr:`__cached__`
attribute of the module object is set to *cpathname* if it is
non-``NULL``. Of the three functions, this is the preferred one to use.
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyImport_ExecCodeModuleWithPathnames(char *name, PyObject *co, char *pathname, char *cpathname)
+
+ Like :c:func:`PyImport_ExecCodeModuleObject`, but *name*, *pathname* and
+ *cpathname* are UTF-8 encoded strings.
+
.. versionadded:: 3.2
+
.. c:function:: long PyImport_GetMagicNumber()
Return the magic number for Python bytecode files (a.k.a. :file:`.pyc` and
@@ -200,7 +224,7 @@ Importing Modules
For internal use only.
-.. c:function:: int PyImport_ImportFrozenModule(char *name)
+.. c:function:: int PyImport_ImportFrozenModuleObject(PyObject *name)
Load a frozen module named *name*. Return ``1`` for success, ``0`` if the
module is not found, and ``-1`` with an exception set if the initialization
@@ -208,6 +232,14 @@ Importing Modules
:c:func:`PyImport_ImportModule`. (Note the misnomer --- this function would
reload the module if it was already imported.)
+ .. versionadded:: 3.3
+
+
+.. c:function:: int PyImport_ImportFrozenModule(char *name)
+
+ Similar to :c:func:`PyImport_ImportFrozenModuleObject`, but the name is a
+ UTF-8 encoded string instead of a Unicode object.
+
.. c:type:: struct _frozen
@@ -247,13 +279,13 @@ Importing Modules
Structure describing a single entry in the list of built-in modules. Each of
these structures gives the name and initialization function for a module built
- into the interpreter. Programs which embed Python may use an array of these
- structures in conjunction with :c:func:`PyImport_ExtendInittab` to provide
- additional built-in modules. The structure is defined in
- :file:`Include/import.h` as::
+ into the interpreter. The name is an ASCII encoded string. Programs which
+ embed Python may use an array of these structures in conjunction with
+ :c:func:`PyImport_ExtendInittab` to provide additional built-in modules.
+ The structure is defined in :file:`Include/import.h` as::
struct _inittab {
- char *name;
+ char *name; /* ASCII encoded string */
PyObject* (*initfunc)(void);
};
diff --git a/Doc/c-api/list.rst b/Doc/c-api/list.rst
index feb9015..5b263a7 100644
--- a/Doc/c-api/list.rst
+++ b/Doc/c-api/list.rst
@@ -142,3 +142,10 @@ List Objects
Return a new tuple object containing the contents of *list*; equivalent to
``tuple(list)``.
+
+
+.. c:function:: int PyList_ClearFreeList()
+
+ Clear the free list. Return the total number of freed items.
+
+ .. versionadded:: 3.3
diff --git a/Doc/c-api/long.rst b/Doc/c-api/long.rst
index b2295e0..4c295fa 100644
--- a/Doc/c-api/long.rst
+++ b/Doc/c-api/long.rst
@@ -100,6 +100,20 @@ All integers are implemented as "long" integer objects of arbitrary size.
string is first encoded to a byte string using :c:func:`PyUnicode_EncodeDecimal`
and then converted using :c:func:`PyLong_FromString`.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyLong_FromUnicodeObject`.
+
+
+.. c:function:: PyObject* PyLong_FromUnicodeObject(PyObject *u, int base)
+
+ Convert a sequence of Unicode digits in the string *u* to a Python integer
+ value. The Unicode string is first encoded to a byte string using
+ :c:func:`PyUnicode_EncodeDecimal` and then converted using
+ :c:func:`PyLong_FromString`.
+
+ .. versionadded:: 3.3
+
.. c:function:: PyObject* PyLong_FromVoidPtr(void *p)
diff --git a/Doc/c-api/module.rst b/Doc/c-api/module.rst
index ffd68e3..32587be 100644
--- a/Doc/c-api/module.rst
+++ b/Doc/c-api/module.rst
@@ -29,7 +29,7 @@ There are only a few functions special to module objects.
:c:data:`PyModule_Type`.
-.. c:function:: PyObject* PyModule_New(const char *name)
+.. c:function:: PyObject* PyModule_NewObject(PyObject *name)
.. index::
single: __name__ (module attribute)
@@ -40,6 +40,14 @@ There are only a few functions special to module objects.
Only the module's :attr:`__doc__` and :attr:`__name__` attributes are filled in;
the caller is responsible for providing a :attr:`__file__` attribute.
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyModule_New(const char *name)
+
+ Similar to :c:func:`PyImport_NewObject`, but the name is an UTF-8 encoded
+ string instead of a Unicode object.
+
.. c:function:: PyObject* PyModule_GetDict(PyObject *module)
@@ -52,7 +60,7 @@ There are only a few functions special to module objects.
manipulate a module's :attr:`__dict__`.
-.. c:function:: char* PyModule_GetName(PyObject *module)
+.. c:function:: PyObject* PyModule_GetNameObject(PyObject *module)
.. index::
single: __name__ (module attribute)
@@ -61,15 +69,13 @@ There are only a few functions special to module objects.
Return *module*'s :attr:`__name__` value. If the module does not provide one,
or if it is not a string, :exc:`SystemError` is raised and *NULL* is returned.
+ .. versionadded:: 3.3
-.. c:function:: char* PyModule_GetFilename(PyObject *module)
- Similar to :c:func:`PyModule_GetFilenameObject` but return the filename
- encoded to 'utf-8'.
+.. c:function:: char* PyModule_GetName(PyObject *module)
- .. deprecated:: 3.2
- :c:func:`PyModule_GetFilename` raises :c:type:`UnicodeEncodeError` on
- unencodable filenames, use :c:func:`PyModule_GetFilenameObject` instead.
+ Similar to :c:func:`PyModule_GetNameObject` but return the name encoded to
+ ``'utf-8'``.
.. c:function:: PyObject* PyModule_GetFilenameObject(PyObject *module)
@@ -81,11 +87,21 @@ There are only a few functions special to module objects.
Return the name of the file from which *module* was loaded using *module*'s
:attr:`__file__` attribute. If this is not defined, or if it is not a
unicode string, raise :exc:`SystemError` and return *NULL*; otherwise return
- a reference to a :c:type:`PyUnicodeObject`.
+ a reference to a Unicode object.
.. versionadded:: 3.2
+.. c:function:: char* PyModule_GetFilename(PyObject *module)
+
+ Similar to :c:func:`PyModule_GetFilenameObject` but return the filename
+ encoded to 'utf-8'.
+
+ .. deprecated:: 3.2
+ :c:func:`PyModule_GetFilename` raises :c:type:`UnicodeEncodeError` on
+ unencodable filenames, use :c:func:`PyModule_GetFilenameObject` instead.
+
+
.. c:function:: void* PyModule_GetState(PyObject *module)
Return the "state" of the module, that is, a pointer to the block of memory
diff --git a/Doc/c-api/object.rst b/Doc/c-api/object.rst
index d0d45ad..88ba5ac 100644
--- a/Doc/c-api/object.rst
+++ b/Doc/c-api/object.rst
@@ -6,6 +6,19 @@ Object Protocol
===============
+.. c:var:: PyObject* Py_NotImplemented
+
+ The ``NotImplemented`` singleton, used to signal that an operation is
+ not implemented for the given type combination.
+
+
+.. c:macro:: Py_RETURN_NOTIMPLEMENTED
+
+ Properly handle returning :c:data:`Py_NotImplemented` from within a C
+ function (that is, increment the reference count of NotImplemented and
+ return it).
+
+
.. c:function:: int PyObject_Print(PyObject *o, FILE *fp, int flags)
Print an object *o*, on file *fp*. Returns ``-1`` on error. The flags argument
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index f48eb73..d87ce0d 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -6,38 +6,72 @@ Unicode Objects and Codecs
--------------------------
.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>
+.. sectionauthor:: Georg Brandl <georg@python.org>
Unicode Objects
^^^^^^^^^^^^^^^
+Since the implementation of :pep:`393` in Python 3.3, Unicode objects internally
+use a variety of representations, in order to allow handling the complete range
+of Unicode characters while staying memory efficient. There are special cases
+for strings where all code points are below 128, 256, or 65536; otherwise, code
+points must be below 1114112 (which is the full Unicode range).
+
+:c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached
+in the Unicode object. The :c:type:`Py_UNICODE*` representation is deprecated
+and inefficient; it should be avoided in performance- or memory-sensitive
+situations.
+
+Due to the transition between the old APIs and the new APIs, unicode objects
+can internally be in two states depending on how they were created:
+
+* "canonical" unicode objects are all objects created by a non-deprecated
+ unicode API. They use the most efficient representation allowed by the
+ implementation.
+
+* "legacy" unicode objects have been created through one of the deprecated
+ APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the
+ :c:type:`Py_UNICODE*` representation; you will have to call
+ :c:func:`PyUnicode_READY` on them before calling any other API.
+
+
Unicode Type
""""""""""""
These are the basic Unicode object types used for the Unicode implementation in
Python:
+.. c:type:: Py_UCS4
+ Py_UCS2
+ Py_UCS1
+
+ These types are typedefs for unsigned integer types wide enough to contain
+ characters of 32 bits, 16 bits and 8 bits, respectively. When dealing with
+ single Unicode characters, use :c:type:`Py_UCS4`.
+
+ .. versionadded:: 3.3
+
.. c:type:: Py_UNICODE
- This type represents the storage type which is used by Python internally as
- basis for holding Unicode ordinals. Python's default builds use a 16-bit type
- for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also
- possible to build a UCS4 version of Python (most recent Linux distributions come
- with UCS4 builds of Python). These builds then use a 32-bit type for
- :c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms
- where :c:type:`wchar_t` is available and compatible with the chosen Python
- Unicode build variant, :c:type:`Py_UNICODE` is a typedef alias for
- :c:type:`wchar_t` to enhance native platform compatibility. On all other
- platforms, :c:type:`Py_UNICODE` is a typedef alias for either :c:type:`unsigned
- short` (UCS2) or :c:type:`unsigned long` (UCS4).
+ This is a typedef of :c:type:`wchar_t`, which is a 16-bit type or 32-bit type
+ depending on the platform.
-Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep
-this in mind when writing extensions or interfaces.
+ .. versionchanged:: 3.3
+ In previous versions, this was a 16-bit type or a 32-bit type depending on
+ whether you selected a "narrow" or "wide" Unicode version of Python at
+ build time.
-.. c:type:: PyUnicodeObject
+.. c:type:: PyASCIIObject
+ PyCompactUnicodeObject
+ PyUnicodeObject
- This subtype of :c:type:`PyObject` represents a Python Unicode object.
+ These subtypes of :c:type:`PyObject` represent a Python Unicode object. In
+ almost all cases, they shouldn't be used directly, since all API functions
+ that deal with Unicode objects take and return :c:type:`PyObject` pointers.
+
+ .. versionadded:: 3.3
.. c:var:: PyTypeObject PyUnicode_Type
@@ -45,10 +79,10 @@ this in mind when writing extensions or interfaces.
This instance of :c:type:`PyTypeObject` represents the Python Unicode type. It
is exposed to Python code as ``str``.
+
The following APIs are really C macros and can be used to do fast checks and to
access internal read-only data of Unicode objects:
-
.. c:function:: int PyUnicode_Check(PyObject *o)
Return true if the object *o* is a Unicode object or an instance of a Unicode
@@ -61,28 +95,106 @@ access internal read-only data of Unicode objects:
subtype.
-.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o)
+.. c:function:: int PyUnicode_READY(PyObject *o)
- Return the size of the object. *o* has to be a :c:type:`PyUnicodeObject` (not
- checked).
+ Ensure the string object *o* is in the "canonical" representation. This is
+ required before using any of the access macros described below.
+ .. XXX expand on when it is not required
-.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o)
+ Returns 0 on success and -1 with an exception set on failure, which in
+ particular happens if memory allocation fails.
- Return the size of the object's internal buffer in bytes. *o* has to be a
- :c:type:`PyUnicodeObject` (not checked).
+ .. versionadded:: 3.3
-.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o)
+.. c:function:: Py_ssize_t PyUnicode_GET_LENGTH(PyObject *o)
+
+ Return the length of the Unicode string, in code points. *o* has to be a
+ Unicode object in the "canonical" representation (not checked).
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_UCS1* PyUnicode_1BYTE_DATA(PyObject *o)
+ Py_UCS2* PyUnicode_2BYTE_DATA(PyObject *o)
+ Py_UCS4* PyUnicode_4BYTE_DATA(PyObject *o)
+
+ Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4
+ integer types for direct character access. No checks are performed if the
+ canonical representation has the correct character size; use
+ :c:func:`PyUnicode_KIND` to select the right macro. Make sure
+ :c:func:`PyUnicode_READY` has been called before accessing this.
+
+ .. versionadded:: 3.3
+
+
+.. c:macro:: PyUnicode_WCHAR_KIND
+ PyUnicode_1BYTE_KIND
+ PyUnicode_2BYTE_KIND
+ PyUnicode_4BYTE_KIND
+
+ Return values of the :c:func:`PyUnicode_KIND` macro.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: int PyUnicode_KIND(PyObject *o)
+
+ Return one of the PyUnicode kind constants (see above) that indicate how many
+ bytes per character this Unicode object uses to store its data. *o* has to
+ be a Unicode object in the "canonical" representation (not checked).
+
+ .. XXX document "0" return value?
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: void* PyUnicode_DATA(PyObject *o)
+
+ Return a void pointer to the raw unicode buffer. *o* has to be a Unicode
+ object in the "canonical" representation (not checked).
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: void PyUnicode_WRITE(int kind, void *data, Py_ssize_t index, \
+ Py_UCS4 value)
+
+ Write into a canonical representation *data* (as obtained with
+ :c:func:`PyUnicode_DATA`). This macro does not do any sanity checks and is
+ intended for usage in loops. The caller should cache the *kind* value and
+ *data* pointer as obtained from other macro calls. *index* is the index in
+ the string (starts at 0) and *value* is the new code point value which should
+ be written to that location.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_UCS4 PyUnicode_READ(int kind, void *data, Py_ssize_t index)
+
+ Read a code point from a canonical representation *data* (as obtained with
+ :c:func:`PyUnicode_DATA`). No checks or ready calls are performed.
- Return a pointer to the internal :c:type:`Py_UNICODE` buffer of the object. *o*
- has to be a :c:type:`PyUnicodeObject` (not checked).
+ .. versionadded:: 3.3
-.. c:function:: const char* PyUnicode_AS_DATA(PyObject *o)
+.. c:function:: Py_UCS4 PyUnicode_READ_CHAR(PyObject *o, Py_ssize_t index)
- Return a pointer to the internal buffer of the object. *o* has to be a
- :c:type:`PyUnicodeObject` (not checked).
+ Read a character from a Unicode object *o*, which must be in the "canonical"
+ representation. This is less efficient than :c:func:`PyUnicode_READ` if you
+ do multiple consecutive reads.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyUnicode_MAX_CHAR_VALUE(PyObject *o)
+
+ Return the maximum code point that is suitable for creating another string
+ based on *o*, which must be in the "canonical" representation. This is
+ always an approximation but more efficient than iterating over the string.
+
+ .. versionadded:: 3.3
.. c:function:: int PyUnicode_ClearFreeList()
@@ -90,6 +202,46 @@ access internal read-only data of Unicode objects:
Clear the free list. Return the total number of freed items.
+.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o)
+
+ Return the size of the deprecated :c:type:`Py_UNICODE` representation, in
+ code units (this includes surrogate pairs as 2 units). *o* has to be a
+ Unicode object (not checked).
+
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style Unicode API, please migrate to using
+ :c:func:`PyUnicode_GET_LENGTH`.
+
+
+.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o)
+
+ Return the size of the deprecated :c:type:`Py_UNICODE` representation in
+ bytes. *o* has to be a Unicode object (not checked).
+
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style Unicode API, please migrate to using
+ :c:func:`PyUnicode_GET_LENGTH`.
+
+
+.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o)
+ const char* PyUnicode_AS_DATA(PyObject *o)
+
+ Return a pointer to a :c:type:`Py_UNICODE` representation of the object. The
+ ``AS_DATA`` form casts the pointer to :c:type:`const char *`. *o* has to be
+ a Unicode object (not checked).
+
+ .. versionchanged:: 3.3
+ This macro is now inefficient -- because in many cases the
+ :c:type:`Py_UNICODE` representation does not exist and needs to be created
+ -- and can fail (return *NULL* with an exception set). Try to port the
+ code to use the new :c:func:`PyUnicode_nBYTE_DATA` macros or use
+ :c:func:`PyUnicode_WRITE` or :c:func:`PyUnicode_READ`.
+
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style Unicode API, please migrate to using the
+ :c:func:`PyUnicode_nBYTE_DATA` family of macros.
+
+
Unicode Character Properties
""""""""""""""""""""""""""""
@@ -195,31 +347,66 @@ These APIs can be used for fast direct character conversions:
possible. This macro does not raise exceptions.
-Plain Py_UNICODE
-""""""""""""""""
+These APIs can be used to work with surrogates:
+
+.. c:macro:: Py_UNICODE_IS_SURROGATE(ch)
+
+ Check if *ch* is a surrogate (``0xD800 <= ch <= 0xDFFF``).
+
+.. c:macro:: Py_UNICODE_IS_HIGH_SURROGATE(ch)
+
+ Check if *ch* is an high surrogate (``0xD800 <= ch <= 0xDBFF``).
+
+.. c:macro:: Py_UNICODE_IS_LOW_SURROGATE(ch)
+
+ Check if *ch* is a low surrogate (``0xDC00 <= ch <= 0xDFFF``).
+
+.. c:macro:: Py_UNICODE_JOIN_SURROGATES(high, low)
+
+ Join two surrogate characters and return a single Py_UCS4 value.
+ *high* and *low* are respectively the leading and trailing surrogates in a
+ surrogate pair.
+
+
+Creating and accessing Unicode strings
+""""""""""""""""""""""""""""""""""""""
To create Unicode objects and access their basic sequence properties, use these
APIs:
+.. c:function:: PyObject* PyUnicode_New(Py_ssize_t size, Py_UCS4 maxchar)
-.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
+ Create a new Unicode object. *maxchar* should be the true maximum code point
+ to be placed in the string. As an approximation, it can be rounded up to the
+ nearest value in the sequence 127, 255, 65535, 1114111.
- Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
- may be *NULL* which causes the contents to be undefined. It is the user's
- responsibility to fill in the needed data. The buffer is copied into the new
- object. If the buffer is not *NULL*, the return value might be a shared object.
- Therefore, modification of the resulting Unicode object is only allowed when *u*
- is *NULL*.
+ This is the recommended way to allocate a new Unicode object. Objects
+ created using this function are not resizable.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyUnicode_FromKindAndData(int kind, const void *buffer, \
+ Py_ssize_t size)
+
+ Create a new Unicode object with the given *kind* (possible values are
+ :c:macro:`PyUnicode_1BYTE_KIND` etc., as returned by
+ :c:func:`PyUnicode_KIND`). The *buffer* must point to an array of *size*
+ units of 1, 2 or 4 bytes per character, as given by the kind.
+
+ .. versionadded:: 3.3
.. c:function:: PyObject* PyUnicode_FromStringAndSize(const char *u, Py_ssize_t size)
- Create a Unicode object from the char buffer *u*. The bytes will be interpreted
- as being UTF-8 encoded. *u* may also be *NULL* which
- causes the contents to be undefined. It is the user's responsibility to fill in
- the needed data. The buffer is copied into the new object. If the buffer is not
- *NULL*, the return value might be a shared object. Therefore, modification of
- the resulting Unicode object is only allowed when *u* is *NULL*.
+ Create a Unicode object from the char buffer *u*. The bytes will be
+ interpreted as being UTF-8 encoded. The buffer is copied into the new
+ object. If the buffer is not *NULL*, the return value might be a shared
+ object, i.e. modification of the data is not allowed.
+
+ If *u* is *NULL*, this function behaves like :c:func:`PyUnicode_FromUnicode`
+ with the buffer set to *NULL*. This usage is deprecated in favor of
+ :c:func:`PyUnicode_New`.
.. c:function:: PyObject *PyUnicode_FromString(const char *u)
@@ -260,18 +447,27 @@ APIs:
| :attr:`%ld` | long | Exactly equivalent to |
| | | ``printf("%ld")``. |
+-------------------+---------------------+--------------------------------+
+ | :attr:`%li` | long | Exactly equivalent to |
+ | | | ``printf("%li")``. |
+ +-------------------+---------------------+--------------------------------+
| :attr:`%lu` | unsigned long | Exactly equivalent to |
| | | ``printf("%lu")``. |
+-------------------+---------------------+--------------------------------+
| :attr:`%lld` | long long | Exactly equivalent to |
| | | ``printf("%lld")``. |
+-------------------+---------------------+--------------------------------+
+ | :attr:`%lli` | long long | Exactly equivalent to |
+ | | | ``printf("%lli")``. |
+ +-------------------+---------------------+--------------------------------+
| :attr:`%llu` | unsigned long long | Exactly equivalent to |
| | | ``printf("%llu")``. |
+-------------------+---------------------+--------------------------------+
| :attr:`%zd` | Py_ssize_t | Exactly equivalent to |
| | | ``printf("%zd")``. |
+-------------------+---------------------+--------------------------------+
+ | :attr:`%zi` | Py_ssize_t | Exactly equivalent to |
+ | | | ``printf("%zi")``. |
+ +-------------------+---------------------+--------------------------------+
| :attr:`%zu` | size_t | Exactly equivalent to |
| | | ``printf("%zu")``. |
+-------------------+---------------------+--------------------------------+
@@ -322,24 +518,159 @@ APIs:
.. versionchanged:: 3.2
Support for ``"%lld"`` and ``"%llu"`` added.
+ .. versionchanged:: 3.3
+ Support for ``"%li"``, ``"%lli"`` and ``"%zi"`` added.
+
.. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs)
Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two
arguments.
+
+.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \
+ const char *encoding, const char *errors)
+
+ Coerce an encoded object *obj* to an Unicode object and return a reference with
+ incremented refcount.
+
+ :class:`bytes`, :class:`bytearray` and other char buffer compatible objects
+ are decoded according to the given *encoding* and using the error handling
+ defined by *errors*. Both can be *NULL* to have the interface use the default
+ values (see the next section for details).
+
+ All other objects, including Unicode objects, cause a :exc:`TypeError` to be
+ set.
+
+ The API returns *NULL* if there was an error. The caller is responsible for
+ decref'ing the returned objects.
+
+
+.. c:function:: Py_ssize_t PyUnicode_GetLength(PyObject *unicode)
+
+ Return the length of the Unicode object, in code points.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: int PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start, \
+ PyObject *to, Py_ssize_t from_start, Py_ssize_t how_many)
+
+ Copy characters from one Unicode object into another. This function performs
+ character conversion when necessary and falls back to :c:func:`memcpy` if
+ possible. Returns ``-1`` and sets an exception on error, otherwise returns
+ ``0``.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: int PyUnicode_WriteChar(PyObject *unicode, Py_ssize_t index, \
+ Py_UCS4 character)
+
+ Write a character to a string. The string must have been created through
+ :c:func:`PyUnicode_New`. Since Unicode strings are supposed to be immutable,
+ the string must not be shared, or have been hashed yet.
+
+ This function checks that *unicode* is a Unicode object, that the index is
+ not out of bounds, and that the object can be modified safely (i.e. that it
+ its reference count is one), in contrast to the macro version
+ :c:func:`PyUnicode_WRITE_CHAR`.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_UCS4 PyUnicode_ReadChar(PyObject *unicode, Py_ssize_t index)
+
+ Read a character from a string. This function checks that *unicode* is a
+ Unicode object and the index is not out of bounds, in contrast to the macro
+ version :c:func:`PyUnicode_READ_CHAR`.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: PyObject* PyUnicode_Substring(PyObject *str, Py_ssize_t start, \
+ Py_ssize_t end)
+
+ Return a substring of *str*, from character index *start* (included) to
+ character index *end* (excluded). Negative indices are not supported.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_UCS4* PyUnicode_AsUCS4(PyObject *u, Py_UCS4 *buffer, \
+ Py_ssize_t buflen, int copy_null)
+
+ Copy the string *u* into a UCS4 buffer, including a null character, if
+ *copy_null* is set. Returns *NULL* and sets an exception on error (in
+ particular, a :exc:`ValueError` if *buflen* is smaller than the length of
+ *u*). *buffer* is returned on success.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_UCS4* PyUnicode_AsUCS4Copy(PyObject *u)
+
+ Copy the string *u* into a new UCS4 buffer that is allocated using
+ :c:func:`PyMem_Malloc`. If this fails, *NULL* is returned with a
+ :exc:`MemoryError` set.
+
+ .. versionadded:: 3.3
+
+
+Deprecated Py_UNICODE APIs
+""""""""""""""""""""""""""
+
+.. deprecated-removed:: 3.3 4.0
+
+These API functions are deprecated with the implementation of :pep:`393`.
+Extension modules can continue using them, as they will not be removed in Python
+3.x, but need to be aware that their use can now cause performance and memory hits.
+
+
+.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size)
+
+ Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u*
+ may be *NULL* which causes the contents to be undefined. It is the user's
+ responsibility to fill in the needed data. The buffer is copied into the new
+ object.
+
+ If the buffer is not *NULL*, the return value might be a shared object.
+ Therefore, modification of the resulting Unicode object is only allowed when
+ *u* is *NULL*.
+
+ If the buffer is *NULL*, :c:func:`PyUnicode_READY` must be called once the
+ string content has been filled before using any of the access macros such as
+ :c:func:`PyUnicode_KIND`.
+
+ Please migrate to using :c:func:`PyUnicode_FromKindAndData` or
+ :c:func:`PyUnicode_New`.
+
+
+.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
+
+ Return a read-only pointer to the Unicode object's internal
+ :c:type:`Py_UNICODE` buffer, *NULL* if *unicode* is not a Unicode object.
+ This will create the :c:type:`Py_UNICODE` representation of the object if it
+ is not yet available.
+
+ Please migrate to using :c:func:`PyUnicode_AsUCS4`,
+ :c:func:`PyUnicode_Substring`, :c:func:`PyUnicode_ReadChar` or similar new
+ APIs.
+
+
.. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size)
Create a Unicode object by replacing all decimal digits in
:c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9
- according to their decimal value. Return *NULL* if an exception
- occurs.
+ according to their decimal value. Return *NULL* if an exception occurs.
-.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode)
+.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size)
- Return a read-only pointer to the Unicode object's internal :c:type:`Py_UNICODE`
- buffer, *NULL* if *unicode* is not a Unicode object.
+ Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:func:`Py_UNICODE`
+ array length in *size*.
+
+ .. versionadded:: 3.3
.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeCopy(PyObject *unicode)
@@ -351,27 +682,15 @@ APIs:
.. versionadded:: 3.2
+ Please migrate to using :c:func:`PyUnicode_AsUCS4Copy` or similar new APIs.
-.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
-
- Return the length of the Unicode object.
-
-
-.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, const char *encoding, const char *errors)
-
- Coerce an encoded object *obj* to an Unicode object and return a reference with
- incremented refcount.
- :class:`bytes`, :class:`bytearray` and other char buffer compatible objects
- are decoded according to the given *encoding* and using the error handling
- defined by *errors*. Both can be *NULL* to have the interface use the default
- values (see the next section for details).
+.. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode)
- All other objects, including Unicode objects, cause a :exc:`TypeError` to be
- set.
+ Return the size of the deprecated :c:type:`Py_UNICODE` representation, in
+ code units (this includes surrogate pairs as 2 units).
- The API returns *NULL* if there was an error. The caller is responsible for
- decref'ing the returned objects.
+ Please migrate to using :c:func:`PyUnicode_GetLength`.
.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj)
@@ -379,11 +698,6 @@ APIs:
Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used
throughout the interpreter whenever coercion to Unicode is needed.
-If the platform supports :c:type:`wchar_t` and provides a header file wchar.h,
-Python can interface directly to this type using the following functions.
-Support is optimized if Python's own :c:type:`Py_UNICODE` type is identical to
-the system's :c:type:`wchar_t`.
-
File System Encoding
""""""""""""""""""""
@@ -493,6 +807,26 @@ wchar_t Support
.. versionadded:: 3.2
+UCS4 Support
+""""""""""""
+
+.. versionadded:: 3.3
+
+.. XXX are these meant to be public?
+
+.. c:function:: size_t Py_UCS4_strlen(const Py_UCS4 *u)
+ Py_UCS4* Py_UCS4_strcpy(Py_UCS4 *s1, const Py_UCS4 *s2)
+ Py_UCS4* Py_UCS4_strncpy(Py_UCS4 *s1, const Py_UCS4 *s2, size_t n)
+ Py_UCS4* Py_UCS4_strcat(Py_UCS4 *s1, const Py_UCS4 *s2)
+ int Py_UCS4_strcmp(const Py_UCS4 *s1, const Py_UCS4 *s2)
+ int Py_UCS4_strncmp(const Py_UCS4 *s1, const Py_UCS4 *s2, size_t n)
+ Py_UCS4* Py_UCS4_strchr(const Py_UCS4 *s, Py_UCS4 c)
+ Py_UCS4* Py_UCS4_strrchr(const Py_UCS4 *s, Py_UCS4 c)
+
+ These utility functions work on strings of :c:type:`Py_UCS4` characters and
+ otherwise behave like the C standard library functions with the same name.
+
+
.. _builtincodecs:
Built-in Codecs
@@ -527,7 +861,8 @@ Generic Codecs
These are the generic codec APIs:
-.. c:function:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)
+.. c:function:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, \
+ const char *encoding, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s*.
*encoding* and *errors* have the same meaning as the parameters of the same name
@@ -536,7 +871,18 @@ These are the generic codec APIs:
the codec.
-.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
+.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, \
+ const char *encoding, const char *errors)
+
+ Encode a Unicode object and return the result as Python bytes object.
+ *encoding* and *errors* have the same meaning as the parameters of the same
+ name in the Unicode :meth:`encode` method. The codec to be used is looked up
+ using the Python codec registry. Return *NULL* if an exception was raised by
+ the codec.
+
+
+.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, \
+ const char *encoding, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python
bytes object. *encoding* and *errors* have the same meaning as the
@@ -544,14 +890,9 @@ These are the generic codec APIs:
to be used is looked up using the Python codec registry. Return *NULL* if an
exception was raised by the codec.
-
-.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, const char *encoding, const char *errors)
-
- Encode a Unicode object and return the result as Python bytes object.
- *encoding* and *errors* have the same meaning as the parameters of the same
- name in the Unicode :meth:`encode` method. The codec to be used is looked up
- using the Python codec registry. Return *NULL* if an exception was raised by
- the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsEncodedString`.
UTF-8 Codecs
@@ -566,7 +907,8 @@ These are the UTF-8 codec APIs:
*s*. Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed)
+.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, \
+ const char *errors, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF8`. If
*consumed* is not *NULL*, trailing incomplete UTF-8 byte sequences will not be
@@ -574,18 +916,45 @@ These are the UTF-8 codec APIs:
that have been decoded will be stored in *consumed*.
+.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode)
+
+ Encode a Unicode object using UTF-8 and return the result as Python bytes
+ object. Error handling is "strict". Return *NULL* if an exception was
+ raised by the codec.
+
+
+.. c:function:: char* PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
+
+ Return a pointer to the default encoding (UTF-8) of the Unicode object, and
+ store the size of the encoded representation (in bytes) in *size*. *size*
+ can be *NULL*, in this case no size will be stored.
+
+ In the case of an error, *NULL* is returned with an exception set and no
+ *size* is stored.
+
+ This caches the UTF-8 representation of the string in the Unicode object, and
+ subsequent calls will return a pointer to the same buffer. The caller is not
+ responsible for deallocating the buffer.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: char* PyUnicode_AsUTF8(PyObject *unicode)
+
+ As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
+
+ .. versionadded:: 3.3
+
+
.. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
-
-.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode)
-
- Encode a Unicode object using UTF-8 and return the result as Python bytes
- object. Error handling is "strict". Return *NULL* if an exception was
- raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsUTF8String` or :c:func:`PyUnicode_AsUTF8AndSize`.
UTF-32 Codecs
@@ -594,7 +963,8 @@ UTF-32 Codecs
These are the UTF-32 codec APIs:
-.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
+.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, \
+ const char *errors, int *byteorder)
Decode *size* bytes from a UTF-32 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
@@ -622,7 +992,8 @@ These are the UTF-32 codec APIs:
Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
+.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, \
+ const char *errors, int *byteorder, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF32`. If
*consumed* is not *NULL*, :c:func:`PyUnicode_DecodeUTF32Stateful` will not treat
@@ -631,7 +1002,15 @@ These are the UTF-32 codec APIs:
that have been decoded will be stored in *consumed*.
-.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
+.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode)
+
+ Return a Python byte string using the UTF-32 encoding in native byte
+ order. The string always starts with a BOM mark. Error handling is "strict".
+ Return *NULL* if an exception was raised by the codec.
+
+
+.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, \
+ const char *errors, int byteorder)
Return a Python bytes object holding the UTF-32 encoded value of the Unicode
data in *s*. Output is written according to the following byte order::
@@ -648,12 +1027,9 @@ These are the UTF-32 codec APIs:
Return *NULL* if an exception was raised by the codec.
-
-.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode)
-
- Return a Python byte string using the UTF-32 encoding in native byte
- order. The string always starts with a BOM mark. Error handling is "strict".
- Return *NULL* if an exception was raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsUTF32String`.
UTF-16 Codecs
@@ -662,7 +1038,8 @@ UTF-16 Codecs
These are the UTF-16 codec APIs:
-.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
+.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, \
+ const char *errors, int *byteorder)
Decode *size* bytes from a UTF-16 encoded buffer string and return the
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
@@ -689,7 +1066,8 @@ These are the UTF-16 codec APIs:
Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed)
+.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, \
+ const char *errors, int *byteorder, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF16`. If
*consumed* is not *NULL*, :c:func:`PyUnicode_DecodeUTF16Stateful` will not treat
@@ -698,7 +1076,15 @@ These are the UTF-16 codec APIs:
number of bytes that have been decoded will be stored in *consumed*.
-.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
+.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode)
+
+ Return a Python byte string using the UTF-16 encoding in native byte
+ order. The string always starts with a BOM mark. Error handling is "strict".
+ Return *NULL* if an exception was raised by the codec.
+
+
+.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, \
+ const char *errors, int byteorder)
Return a Python bytes object holding the UTF-16 encoded value of the Unicode
data in *s*. Output is written according to the following byte order::
@@ -716,12 +1102,9 @@ These are the UTF-16 codec APIs:
Return *NULL* if an exception was raised by the codec.
-
-.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode)
-
- Return a Python byte string using the UTF-16 encoding in native byte
- order. The string always starts with a BOM mark. Error handling is "strict".
- Return *NULL* if an exception was raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsUTF16String`.
UTF-7 Codecs
@@ -736,7 +1119,8 @@ These are the UTF-7 codec APIs:
*s*. Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed)
+.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *s, Py_ssize_t size, \
+ const char *errors, Py_ssize_t *consumed)
If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF7`. If
*consumed* is not *NULL*, trailing incomplete UTF-7 base-64 sections will not
@@ -744,7 +1128,8 @@ These are the UTF-7 codec APIs:
bytes that have been decoded will be stored in *consumed*.
-.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, int base64SetO, int base64WhiteSpace, const char *errors)
+.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, \
+ int base64SetO, int base64WhiteSpace, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-7 and
return a Python bytes object. Return *NULL* if an exception was raised by
@@ -755,6 +1140,11 @@ These are the UTF-7 codec APIs:
nonzero, whitespace will be encoded in base-64. Both are set to zero for the
Python "utf-7" codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API.
+
+ .. XXX replace with what?
+
Unicode-Escape Codecs
"""""""""""""""""""""
@@ -762,24 +1152,29 @@ Unicode-Escape Codecs
These are the "Unicode Escape" codec APIs:
-.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, \
+ Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded
string *s*. Return *NULL* if an exception was raised by the codec.
+.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
+
+ Encode a Unicode object using Unicode-Escape and return the result as Python
+ string object. Error handling is "strict". Return *NULL* if an exception was
+ raised by the codec.
+
+
.. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
return a Python string object. Return *NULL* if an exception was raised by the
codec.
-
-.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode)
-
- Encode a Unicode object using Unicode-Escape and return the result as Python
- string object. Error handling is "strict". Return *NULL* if an exception was
- raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsUnicodeEscapeString`.
Raw-Unicode-Escape Codecs
@@ -788,19 +1183,13 @@ Raw-Unicode-Escape Codecs
These are the "Raw Unicode Escape" codec APIs:
-.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, Py_ssize_t size, const char *errors)
+.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, \
+ Py_ssize_t size, const char *errors)
Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape
encoded string *s*. Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
- and return a Python string object. Return *NULL* if an exception was raised by
- the codec.
-
-
.. c:function:: PyObject* PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode)
Encode a Unicode object using Raw-Unicode-Escape and return the result as
@@ -808,6 +1197,18 @@ These are the "Raw Unicode Escape" codec APIs:
was raised by the codec.
+.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, \
+ Py_ssize_t size, const char *errors)
+
+ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
+ and return a Python string object. Return *NULL* if an exception was raised by
+ the codec.
+
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsRawUnicodeEscapeString`.
+
+
Latin-1 Codecs
""""""""""""""
@@ -821,18 +1222,22 @@ ordinals and only these are accepted by the codecs during encoding.
*s*. Return *NULL* if an exception was raised by the codec.
+.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode)
+
+ Encode a Unicode object using Latin-1 and return the result as Python bytes
+ object. Error handling is "strict". Return *NULL* if an exception was
+ raised by the codec.
+
+
.. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
-
-.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode)
-
- Encode a Unicode object using Latin-1 and return the result as Python bytes
- object. Error handling is "strict". Return *NULL* if an exception was
- raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsLatin1String`.
ASCII Codecs
@@ -848,18 +1253,22 @@ codes generate errors.
*s*. Return *NULL* if an exception was raised by the codec.
+.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode)
+
+ Encode a Unicode object using ASCII and return the result as Python bytes
+ object. Error handling is "strict". Return *NULL* if an exception was
+ raised by the codec.
+
+
.. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and
return a Python bytes object. Return *NULL* if an exception was raised by
the codec.
-
-.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode)
-
- Encode a Unicode object using ASCII and return the result as Python bytes
- object. Error handling is "strict". Return *NULL* if an exception was
- raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsASCIIString`.
Character Map Codecs
@@ -888,7 +1297,8 @@ characters to different code points.
These are the mapping codec APIs:
-.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
+.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, \
+ PyObject *mapping, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s* using
the given *mapping* object. Return *NULL* if an exception was raised by the
@@ -898,13 +1308,6 @@ These are the mapping codec APIs:
treated as "undefined mapping".
-.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
-
- Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
- *mapping* object and return a Python string object. Return *NULL* if an
- exception was raised by the codec.
-
-
.. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)
Encode a Unicode object using the given *mapping* object and return the result
@@ -914,7 +1317,8 @@ These are the mapping codec APIs:
The following codec API is special in that maps Unicode to Unicode.
-.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
+.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \
+ PyObject *table, const char *errors)
Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a
character mapping *table* to it and return the resulting Unicode object. Return
@@ -927,6 +1331,22 @@ The following codec API is special in that maps Unicode to Unicode.
and sequences work well. Unmapped character ordinals (ones which cause a
:exc:`LookupError`) are left untouched and are copied as-is.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API.
+
+ .. XXX replace with what?
+
+
+.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \
+ PyObject *mapping, const char *errors)
+
+ Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given
+ *mapping* object and return a Python string object. Return *NULL* if an
+ exception was raised by the codec.
+
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsCharmapString`.
MBCS codecs for Windows
@@ -943,7 +1363,8 @@ the user settings on the machine running the codec.
Return *NULL* if an exception was raised by the codec.
-.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, const char *errors, int *consumed)
+.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, \
+ const char *errors, int *consumed)
If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeMBCS`. If
*consumed* is not *NULL*, :c:func:`PyUnicode_DecodeMBCSStateful` will not decode
@@ -951,18 +1372,22 @@ the user settings on the machine running the codec.
in *consumed*.
+.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode)
+
+ Encode a Unicode object using MBCS and return the result as Python bytes
+ object. Error handling is "strict". Return *NULL* if an exception was
+ raised by the codec.
+
+
.. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return
a Python bytes object. Return *NULL* if an exception was raised by the
codec.
-
-.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode)
-
- Encode a Unicode object using MBCS and return the result as Python bytes
- object. Error handling is "strict". Return *NULL* if an exception was
- raised by the codec.
+ .. deprecated-removed:: 3.3 4.0
+ Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using
+ :c:func:`PyUnicode_AsMBCSString`.
Methods & Slots
@@ -1001,7 +1426,8 @@ They all return *NULL* or ``-1`` if an exception occurs.
characters are not included in the resulting strings.
-.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors)
+.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, \
+ const char *errors)
Translate a string by applying a character mapping table to it and return the
resulting Unicode object.
@@ -1023,14 +1449,16 @@ They all return *NULL* or ``-1`` if an exception occurs.
Unicode string.
-.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
+.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, \
+ Py_ssize_t start, Py_ssize_t end, int direction)
Return 1 if *substr* matches ``str[start:end]`` at the given tail end
(*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
0 otherwise. Return ``-1`` if an error occurred.
-.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
+.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, \
+ Py_ssize_t start, Py_ssize_t end, int direction)
Return the first position of *substr* in ``str[start:end]`` using the given
*direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
@@ -1039,13 +1467,27 @@ They all return *NULL* or ``-1`` if an exception occurs.
occurred and an exception has been set.
-.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end)
+.. c:function:: Py_ssize_t PyUnicode_FindChar(PyObject *str, Py_UCS4 ch, \
+ Py_ssize_t start, Py_ssize_t end, int direction)
+
+ Return the first position of the character *ch* in ``str[start:end]`` using
+ the given *direction* (*direction* == 1 means to do a forward search,
+ *direction* == -1 a backward search). The return value is the index of the
+ first match; a value of ``-1`` indicates that no match was found, and ``-2``
+ indicates that an error occurred and an exception has been set.
+
+ .. versionadded:: 3.3
+
+
+.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, \
+ Py_ssize_t start, Py_ssize_t end)
Return the number of non-overlapping occurrences of *substr* in
``str[start:end]``. Return ``-1`` if an error occurred.
-.. c:function:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, PyObject *replstr, Py_ssize_t maxcount)
+.. c:function:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, \
+ PyObject *replstr, Py_ssize_t maxcount)
Replace at most *maxcount* occurrences of *substr* in *str* with *replstr* and
return the resulting Unicode object. *maxcount* == -1 means replace all
@@ -1093,8 +1535,8 @@ They all return *NULL* or ``-1`` if an exception occurs.
Check whether *element* is contained in *container* and return true or false
accordingly.
- *element* has to coerce to a one element Unicode string. ``-1`` is returned if
- there was an error.
+ *element* has to coerce to a one element Unicode string. ``-1`` is returned
+ if there was an error.
.. c:function:: void PyUnicode_InternInPlace(PyObject **string)
@@ -1113,7 +1555,6 @@ They all return *NULL* or ``-1`` if an exception occurs.
.. c:function:: PyObject* PyUnicode_InternFromString(const char *v)
A combination of :c:func:`PyUnicode_FromString` and
- :c:func:`PyUnicode_InternInPlace`, returning either a new unicode string object
- that has been interned, or a new ("owned") reference to an earlier interned
- string object with the same value.
-
+ :c:func:`PyUnicode_InternInPlace`, returning either a new unicode string
+ object that has been interned, or a new ("owned") reference to an earlier
+ interned string object with the same value.