author Serhiy Storchaka <storchaka@gmail.com> 2023-05-21 21:32:39 (GMT)
committer GitHub <noreply@github.com> 2023-05-21 21:32:39 (GMT)
commit f3466bc04008660c4a5c3ed6f70144f138ae2e7f
tree 3aada373c1a064f47e8273f30439c5fcea1d7e3a /Doc/c-api
parent 6ba8406cb6e656e47e908f8c7354e07ed0f2d774
gh-98836: Extend PyUnicode_FromFormat() (GH-98838)
* Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
* Support for length modifiers j (intmax_t) and t (ptrdiff_t).
* Length modifiers are now applied to all integer conversions.
* Support for wchar_t C strings (%ls and %lV).
* Support for variable width and precision (*).
* Support for flag - (left alignment).
Diffstat (limited to 'Doc/c-api')
-rw-r--r-- Doc/c-api/unicode.rst 228
1 files changed, 143 insertions, 85 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index ab3a2e2..6771f37 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -394,98 +394,149 @@ APIs:
arguments, calculate the size of the resulting Python Unicode string and return
a string with the values formatted into it. The variable arguments must be C
types and must correspond exactly to the format characters in the *format*
- ASCII-encoded string. The following format characters are allowed:
-
- .. % This should be exactly the same as the table in PyErr_Format.
-
- .. tabularcolumns:: |l|l|L|
-
- +-------------------+---------------------+----------------------------------+
- | Format Characters | Type | Comment |
- +===================+=====================+==================================+
- | :attr:`%%` | *n/a* | The literal % character. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%c` | int | A single character, |
- | | | represented as a C int. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%d` | int | Equivalent to |
- | | | ``printf("%d")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%u` | unsigned int | Equivalent to |
- | | | ``printf("%u")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%ld` | long | Equivalent to |
- | | | ``printf("%ld")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%li` | long | Equivalent to |
- | | | ``printf("%li")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%lu` | unsigned long | Equivalent to |
- | | | ``printf("%lu")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%lld` | long long | Equivalent to |
- | | | ``printf("%lld")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%lli` | long long | Equivalent to |
- | | | ``printf("%lli")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%llu` | unsigned long long | Equivalent to |
- | | | ``printf("%llu")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%zd` | :c:type:`\ | Equivalent to |
- | | Py_ssize_t` | ``printf("%zd")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%zi` | :c:type:`\ | Equivalent to |
- | | Py_ssize_t` | ``printf("%zi")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%zu` | size_t | Equivalent to |
- | | | ``printf("%zu")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%i` | int | Equivalent to |
- | | | ``printf("%i")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%x` | int | Equivalent to |
- | | | ``printf("%x")``. [1]_ |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%s` | const char\* | A null-terminated C character |
- | | | array. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%p` | const void\* | The hex representation of a C |
- | | | pointer. Mostly equivalent to |
- | | | ``printf("%p")`` except that |
- | | | it is guaranteed to start with |
- | | | the literal ``0x`` regardless |
- | | | of what the platform's |
- | | | ``printf`` yields. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%A` | PyObject\* | The result of calling |
- | | | :func:`ascii`. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%U` | PyObject\* | A Unicode object. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%V` | PyObject\*, | A Unicode object (which may be |
- | | const char\* | ``NULL``) and a null-terminated |
- | | | C character array as a second |
- | | | parameter (which will be used, |
- | | | if the first parameter is |
- | | | ``NULL``). |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%S` | PyObject\* | The result of calling |
- | | | :c:func:`PyObject_Str`. |
- +-------------------+---------------------+----------------------------------+
- | :attr:`%R` | PyObject\* | The result of calling |
- | | | :c:func:`PyObject_Repr`. |
- +-------------------+---------------------+----------------------------------+
+ ASCII-encoded string.
+
+ A conversion specifier contains two or more characters and has the following
+ components, which must occur in this order:
+
+ #. The ``'%'`` character, which marks the start of the specifier.
+
+ #. Conversion flags (optional), which affect the result of some conversion
+ types.
+
+ #. Minimum field width (optional).
+ If specified as an ``'*'`` (asterisk), the actual width is given in the
+ next argument, which must be of type :c:expr:`int`, and the object to
+ convert comes after the minimum field width and optional precision.
+
+ #. Precision (optional), given as a ``'.'`` (dot) followed by the precision.
+ If specified as ``'*'`` (an asterisk), the actual precision is given in
+ the next argument, which must be of type :c:expr:`int`, and the value to
+ convert comes after the precision.
+
+ #. Length modifier (optional).
+
+ #. Conversion type.
+
+ The conversion flag characters are:
+
+ .. tabularcolumns:: |l|L|
+
+ +-------+-------------------------------------------------------------+
+ | Flag | Meaning |
+ +=======+=============================================================+
+ | ``0`` | The conversion will be zero padded for numeric values. |
+ +-------+-------------------------------------------------------------+
+ | ``-`` | The converted value is left adjusted (overrides the ``0`` |
+ | | flag if both are given). |
+ +-------+-------------------------------------------------------------+
+
+ The length modifiers for the following integer conversions (``d``, ``i``,
+ ``o``, ``u``, ``x``, or ``X``) specify the type of the argument
+ (:c:expr:`int` by default):
+
+ .. tabularcolumns:: |l|L|
+
+ +----------+-----------------------------------------------------+
+ | Modifier | Types |
+ +==========+=====================================================+
+ | ``l`` | :c:expr:`long` or :c:expr:`unsigned long` |
+ +----------+-----------------------------------------------------+
+ | ``ll`` | :c:expr:`long long` or :c:expr:`unsigned long long` |
+ +----------+-----------------------------------------------------+
+ | ``j`` | :c:expr:`intmax_t` or :c:expr:`uintmax_t` |
+ +----------+-----------------------------------------------------+
+ | ``z`` | :c:expr:`size_t` or :c:expr:`ssize_t` |
+ +----------+-----------------------------------------------------+
+ | ``t`` | :c:expr:`ptrdiff_t` |
+ +----------+-----------------------------------------------------+
+
+ The length modifier ``l`` for the conversions ``s`` and ``V`` specifies
+ that the type of the argument is :c:expr:`const wchar_t*`.
+
+ The conversion specifiers are:
+
+ .. list-table::
+ :widths: auto
+ :header-rows: 1
+
+ * - Conversion Specifier
+ - Type
+ - Comment
+
+ * - ``%``
+ - *n/a*
+ - The literal ``%`` character.
+
+ * - ``d``, ``i``
+ - Specified by the length modifier
+ - The decimal representation of a signed C integer.
+
+ * - ``u``
+ - Specified by the length modifier
+ - The decimal representation of an unsigned C integer.
+
+ * - ``o``
+ - Specified by the length modifier
+ - The octal representation of an unsigned C integer.
+
+ * - ``x``
+ - Specified by the length modifier
+ - The hexadecimal representation of an unsigned C integer (lowercase).
+
+ * - ``X``
+ - Specified by the length modifier
+ - The hexadecimal representation of an unsigned C integer (uppercase).
+
+ * - ``c``
+ - :c:expr:`int`
+ - A single character.
+
+ * - ``s``
+ - :c:expr:`const char*` or :c:expr:`const wchar_t*`
+ - A null-terminated C character array.
+
+ * - ``p``
+ - :c:expr:`const void*`
+ - The hex representation of a C pointer.
+ Mostly equivalent to ``printf("%p")`` except that it is guaranteed to
+ start with the literal ``0x`` regardless of what the platform's
+ ``printf`` yields.
+
+ * - ``A``
+ - :c:expr:`PyObject*`
+ - The result of calling :func:`ascii`.
+
+ * - ``U``
+ - :c:expr:`PyObject*`
+ - A Unicode object.
+
+ * - ``V``
+ - :c:expr:`PyObject*`, :c:expr:`const char*` or :c:expr:`const wchar_t*`
+ - A Unicode object (which may be ``NULL``) and a null-terminated
+ C character array as a second parameter (which will be used,
+ if the first parameter is ``NULL``).
+
+ * - ``S``
+ - :c:expr:`PyObject*`
+ - The result of calling :c:func:`PyObject_Str`.
+
+ * - ``R``
+ - :c:expr:`PyObject*`
+ - The result of calling :c:func:`PyObject_Repr`.
.. note::
The width formatter unit is number of characters rather than bytes.
- The precision formatter unit is number of bytes for ``"%s"`` and
+ The precision formatter unit is number of bytes or :c:expr:`wchar_t`
+ items (if the length modifier ``l`` is used) for ``"%s"`` and
``"%V"`` (if the ``PyObject*`` argument is ``NULL``), and a number of
characters for ``"%A"``, ``"%U"``, ``"%S"``, ``"%R"`` and ``"%V"``
(if the ``PyObject*`` argument is not ``NULL``).
- .. [1] For integer specifiers (d, u, ld, li, lu, lld, lli, llu, zd, zi,
- zu, i, x): the 0-conversion flag has effect even when a precision is given.
+ .. note::
+ Unlike C :c:func:`printf`, the ``0`` flag has an effect even when
+ a precision is given for integer conversions (``d``, ``i``, ``u``, ``o``,
+ ``x``, or ``X``).
.. versionchanged:: 3.2
Support for ``"%lld"`` and ``"%llu"`` added.
@@ -498,6 +549,13 @@ APIs:
``"%V"``, ``"%S"``, ``"%R"`` added.
.. versionchanged:: 3.12
+ Support for conversion specifiers ``o`` and ``X``.
+ Support for length modifiers ``j`` and ``t``.
+ Length modifiers are now applied to all integer conversions.
+ Length modifier ``l`` is now applied to conversion specifiers ``s`` and ``V``.
+ Support for variable width and precision ``*``.
+ Support for flag ``-``.
+
An unrecognized format character now sets a :exc:`SystemError`.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
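
Several of the additions in this commit (the ``o`` and ``X`` conversions, the ``j`` and ``t`` length modifiers, the ``-`` flag, and ``*`` for width and precision) share their spelling with standard C ``printf``. The sketch below uses plain ``snprintf`` to show what those specifiers mean in standard C. It does not call ``PyUnicode_FromFormat()`` itself, which needs an embedded interpreter, and it assumes the two agree for these particular cases. The note in the diff about the ``0`` flag combined with a precision is one place where they are documented to differ. All helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: '-' left-adjusts the value, '*' takes the minimum
   field width from the int argument that precedes the value. */
static int format_left_aligned(char *buf, size_t size, int width, int value)
{
    return snprintf(buf, size, "[%-*d]", width, value);
}

/* Hypothetical helper: 'o' is octal, 'X' is uppercase hexadecimal. */
static int format_octal_and_hex(char *buf, size_t size, unsigned int value)
{
    return snprintf(buf, size, "%o %X", value, value);
}

/* Hypothetical helper: 'j' selects intmax_t, 't' selects ptrdiff_t. */
static int format_wide_types(char *buf, size_t size,
                             intmax_t big, ptrdiff_t diff)
{
    return snprintf(buf, size, "%jd %td", big, diff);
}

/* Hypothetical helper: ".*" takes the precision from an int argument;
   for 's' the precision caps how many bytes of the string are copied. */
static int format_truncated(char *buf, size_t size,
                            int precision, const char *text)
{
    return snprintf(buf, size, "%.*s", precision, text);
}

/* Runs each helper once and checks the result with assert(). */
int run_examples(void)
{
    char buf[64];

    format_left_aligned(buf, sizeof buf, 5, 42);
    assert(strcmp(buf, "[42   ]") == 0);

    format_octal_and_hex(buf, sizeof buf, 255u);
    assert(strcmp(buf, "377 FF") == 0);

    format_wide_types(buf, sizeof buf, (intmax_t)-7, (ptrdiff_t)3);
    assert(strcmp(buf, "-7 3") == 0);

    format_truncated(buf, sizeof buf, 3, "unicode");
    assert(strcmp(buf, "uni") == 0);

    return 0;
}
```

Calling ``run_examples()`` from a ``main`` function exercises all four cases. In an extension module, the same format strings would be passed to ``PyUnicode_FromFormat()`` with arguments of the same C types; the wide-string forms ``%ls`` and ``%lV`` are left out of this sketch.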