path: root/Doc/c-api/unicode.rst
Commit message | Author | Age | Files | Lines
* [3.13] Docs C API: Clarify what happens when null bytes are passed to `PyUnicode_AsUTF8` (GH-127458) (#129080) | Miss Islington (bot) | 2025-01-20 | 1 | -0/+9
  (cherry picked from commit e792f4bc2e712bb6e2143599d2b88dd339de83e6)
  Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
  Co-authored-by: Stan U. <89152624+StanFromIreland@users.noreply.github.com>
  Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
  Co-authored-by: Victor Stinner <vstinner@python.org>
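A minimal sketch of the pitfall this documentation change addresses, assuming the clarified behavior is that embedded null characters stay in the buffer returned by `PyUnicode_AsUTF8`, so C functions that expect a NUL-terminated string may see a truncated value. The buffer below is a hypothetical stand-in for the UTF-8 data of the Python string `"ab\0cd"`; it does not call the C API itself. Carrying an explicit size, as `PyUnicode_AsUTF8AndSize` allows, avoids the truncation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the UTF-8 buffer of the Python string "ab\0cd".
   sizeof counts the trailing terminator, so the payload is 5 bytes. */
static const char utf8_buf[] = "ab\0cd";
static const size_t utf8_size = sizeof(utf8_buf) - 1;

/* Length as seen by NUL-terminated string functions: stops at the first NUL. */
size_t length_seen_by_c_string_api(void)
{
    return strlen(utf8_buf);
}

/* Length when the explicit size travels alongside the pointer. */
size_t length_with_explicit_size(void)
{
    return utf8_size;
}
```

Here `length_seen_by_c_string_api()` reports only the bytes before the embedded null, while `length_with_explicit_size()` reports the full payload.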
* [3.13] gh-90241: Clarify documentation for PyUnicode_FSConverter and PyUnicode_FSDecoder (GH-128451) (GH-128542) | Miss Islington (bot) | 2025-01-06 | 1 | -8/+27
  (cherry picked from commit 657d7b77e5c69967e9c0000b986fa4872d13958c)
  Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
  Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
  Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
* [3.13] doc: PyUnicode_AsUTF8String() fails if string contains surrogates (GH-124605) (#124707) | Miss Islington (bot) | 2024-09-27 | 1 | -3/+10
  (cherry picked from commit d8cf587dc749cf21eafc1064237970ee7460634f)
  Co-authored-by: Victor Stinner <vstinner@python.org>
* [3.13] GH-95079: document error behaviour for some unicode C APIs (GH-95080) (#124661) | Miss Islington (bot) | 2024-09-27 | 1 | -0/+9
  (cherry picked from commit b79a21ea429844e84509430e636d808ea9cff244)
  Co-authored-by: Max Bachmann <kontakt@maxbachmann.de>
* [3.13] gh-113993: Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (GH-121364) (GH-121854) | Petr Viktorin | 2024-07-17 | 1 | -6/+32
  * Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs
  * Document immortality in some functions that take `const char *`
    This is PyUnicode_InternFromString; PyDict_SetItemString, PyObject_SetAttrString; PyObject_DelAttrString; PyUnicode_InternFromString; and the PyModule_Add convenience functions. Always point out a non-immortalizing alternative.
  * Don't immortalize user-provided attr names in _ctypes
  (cherry picked from commit b4aedb23ae7954fb58084dda16cd41786819a8cf)
* gh-117642: Fix PEP 737 implementation (GH-117643) | Serhiy Storchaka | 2024-04-08 | 1 | -3/+3
  * Fix implementation of %#T and %#N (they were implemented as %T# and %N#).
  * Restore tests removed in gh-116417.
* gh-111696, PEP 737: Add %T and %N to PyUnicode_FromFormat() (#116839) | Victor Stinner | 2024-03-14 | 1 | -0/+23
* gh-113437: Update documentation about PyUnicode_AsWideChar() function (GH-113455) | qqwqqw689 | 2024-02-13 | 1 | -1/+6
* gh-62897: Update PyUnicode C API parameter names (GH-12680) | Rune Tynan | 2023-12-05 | 1 | -91/+91
  Standardize PyUnicode C API parameter names across the documentation.
  Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-111089: Revert PyUnicode_AsUTF8() changes (#111833) | Victor Stinner | 2023-11-07 | 1 | -8/+0
  * Revert "gh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (#111585)"
    This reverts commit d9b606b3d04fc56fb0bcc479d7d6c14562edb5e2.
  * Revert "gh-111089: Use PyUnicode_AsUTF8() in getargs.c (#111620)"
    This reverts commit cde1071b2a72e8261ca66053ef61431b7f3a81fd.
  * Revert "gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)"
    This reverts commit d731579bfb9a497cfb0076cb6b221058a20088fe.
  * Revert "gh-111089: Add PyUnicode_AsUTF8() to the limited C API (#111121)"
    This reverts commit d8f32be5b6a736dc2fc9dca3f1bf176c82fc9b44.
  * Revert "gh-111089: Use PyUnicode_AsUTF8() in sqlite3 (#111122)"
    This reverts commit 37e4e20eaa8f27ada926d49e5971fecf0477ad26.
* gh-111089: PyUnicode_AsUTF8AndSize() sets size on error (#111106) | Victor Stinner | 2023-10-20 | 1 | -2/+2
  On error, PyUnicode_AsUTF8AndSize() now sets the size argument to -1, to avoid undefined value.
* gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091) | Victor Stinner | 2023-10-20 | 1 | -0/+8
  * PyUnicode_AsUTF8() now raises an exception if the string contains embedded null characters.
  * Update related C API tests (test_capi.test_unicode).
  * type_new_set_doc() uses PyUnicode_AsUTF8AndSize() to silently truncate doc containing null bytes.
  Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* gh-110289: C API: Add PyUnicode_EqualToUTF8() and PyUnicode_EqualToUTF8AndSize() functions (GH-110297) | Serhiy Storchaka | 2023-10-11 | 1 | -0/+22
* gh-107298: Fix numerous ref errors and typos in the C API docs (GH-108258) | Serhiy Storchaka | 2023-08-22 | 1 | -1/+1
* gh-98154: Clarify Usage of "Reference Count" In the Docs (gh-107552) | Eric Snow | 2023-08-07 | 1 | -6/+6
  PEP 683 (immortal objects) revealed some ways in which the Python documentation has been unnecessarily coupled to the implementation details of reference counts. In the end users should focus on reference ownership, including taking references and releasing them, rather than on how many reference counts an object has. This change updates the documentation to reflect that perspective. It also updates the docs relative to immortal objects in a handful of places.
* gh-107298: Fix more Sphinx warnings in the C API doc (#107329) | Victor Stinner | 2023-07-27 | 1 | -3/+3
  Declare the following functions as macros, since they are actually macros. It avoids a warning on "TYPE" or "macro" argument.
  * PyMem_New()
  * PyMem_Resize()
  * PyModule_AddIntMacro()
  * PyModule_AddStringMacro()
  * PyObject_GC_New()
  * PyObject_GC_NewVar()
  * PyObject_New()
  * PyObject_NewVar()
  Add C standard C types to nitpick_ignore in Doc/conf.py:
  * int64_t
  * uint64_t
  * uintptr_t
  No longer ignore non existing "__int" type in nitpick_ignore.
  Update Doc/tools/.nitignore
* gh-107298: Fix Sphinx warnings in the C API doc (#107302) | Victor Stinner | 2023-07-26 | 1 | -2/+2
  * Update Doc/tools/.nitignore
  * Fix BufferedIOBase.write() link in buffer.rst
* gh-107298: Fix doc references to undocumented modules (#107300) | Victor Stinner | 2023-07-26 | 1 | -1/+1
  Update also Doc/tools/.nitignore.
* gh-106948: Add standard external names to nitpick_ignore (GH-106949) | Serhiy Storchaka | 2023-07-22 | 1 | -11/+11
  It includes standard C types, macros and variables like "size_t", "LONG_MAX" and "errno", and standard environment variables like "PATH".
* gh-106919: Use role :c:macro: for referencing the C "constants" (GH-106920) | Serhiy Storchaka | 2023-07-21 | 1 | -5/+5
* gh-105156: Update Unicode C API: remove deprecation (#105379) | Victor Stinner | 2023-06-06 | 1 | -9/+0
  _PyUnicode_ToLowercase(), _PyUnicode_ToUppercase(), _PyUnicode_ToTitlecase() are no longer deprecated in the documentation. It's no longer needed since they now use Py_UCS4 type, rather than the deprecated Py_UNICODE type.
* gh-102304: doc: Add links to Stable ABI and Limited C API (#105345) | Victor Stinner | 2023-06-06 | 1 | -1/+1
  * Add "limited-c-api" and "stable-api" references.
  * Rename "stable-abi-list" reference to "limited-api-list".
  * Makefile: Document files regenerated by "make regen-limited-abi"
  * Remove first empty line in generated files:
    - Lib/test/test_stable_abi_ctypes.py
    - PC/python3dll.c
* gh-105156: Deprecate the old Py_UNICODE type in C API (#105157) | Victor Stinner | 2023-06-01 | 1 | -0/+2
  Deprecate the old Py_UNICODE and PY_UNICODE_TYPE types in the C API: use wchar_t instead.
  Replace Py_UNICODE with wchar_t in multiple C files.
  Co-authored-by: Inada Naoki <songofacandy@gmail.com>
* gh-98836: Extend PyUnicode_FromFormat() (GH-98838) | Serhiy Storchaka | 2023-05-21 | 1 | -85/+143
  * Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
  * Support for length modifiers j (intmax_t) and t (ptrdiff_t).
  * Length modifiers are now applied to all integer conversions.
  * Support for wchar_t C strings (%ls and %lV).
  * Support for variable width and precision (*).
  * Support for flag - (left alignment).
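The specifiers listed in this commit mirror ones from standard C `printf`. The sketch below uses plain `snprintf`, not `PyUnicode_FromFormat()`, to show what each one does in standard C; that `PyUnicode_FromFormat()` produces the same text for these inputs is an assumption here, not something the commit message states. The function name `format_samples` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats one sample per feature with standard snprintf:
   %o (octal), %X (uppercase hex), %jd (intmax_t), %td (ptrdiff_t),
   %*d (width taken from an argument), %-4d (left alignment). */
const char *format_samples(void)
{
    static char buf[64];
    intmax_t big = 255;
    ptrdiff_t diff = -3;

    snprintf(buf, sizeof buf, "%o|%X|%jd|%td|%*d|%-4d|",
             8u, 255u, big, diff, 5, 42, 7);
    return buf;
}
```

With standard `snprintf`, the call yields `10|FF|255|-3|   42|7   |`: the width argument `5` right-aligns `42`, and the `-` flag pads `7` on the right.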
* gh-103883: Doc: Move PyUnicode_FromObject doc (#103913) | Inada Naoki | 2023-04-27 | 1 | -9/+9
  This API is one of Unicode creator APIs.
* gh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (#97768) | Adam Turner | 2022-10-05 | 1 | -12/+12
  :c:type:`<C type>` -> :c:expr:`<C type>`
  Co-authored-by: Łukasz Langa <lukasz@langa.pl>
* gh-93738: Documentation C syntax (:c:type:`PyBytesObject*` -> :c:expr:`PyBytesObject*`) (#97782) | Adam Turner | 2022-10-04 | 1 | -1/+1
* gh-93738: Documentation C syntax (:c:type:`PyUnicodeObject*` -> :c:expr:`PyUnicodeObject*`) (#97783) | Adam Turner | 2022-10-04 | 1 | -1/+1
* gh-95781: More strict format string checking in PyUnicode_FromFormatV() (GH-95784) | Serhiy Storchaka | 2022-08-08 | 1 | -3/+5
  An unrecognized format character in PyUnicode_FromFormat() and PyUnicode_FromFormatV() now sets a SystemError. In previous versions it caused all the rest of the format string to be copied as-is to the result string, and any extra arguments discarded.
* Fix Unicode doc and replace use of macro with PyMem_New function (GH-94088) | Pamela Fox | 2022-07-28 | 1 | -1/+1
* gh-93202: Always use %zd printf formatter (#93201) | Victor Stinner | 2022-05-25 | 1 | -4/+0
  Python now always use the ``%zu`` and ``%zd`` printf formats to format a size_t or Py_ssize_t number. Building Python 3.12 requires a C11 compiler, so these printf formats are now always supported.
  * PyObject_Print() and _PyObject_Dump() now use the printf %zd format to display an object reference count.
  * Update PY_FORMAT_SIZE_T comment.
  * Remove outdated notes about the %zd format in PyBytes_FromFormat() and PyUnicode_FromFormat() documentations.
  * configure no longer checks for the %zd format and no longer defines PY_FORMAT_SIZE_T macro in pyconfig.h.
  * pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is no longer supported. Python 3.12 now requires macOS 10.6 (Snow Leopard) or newer.
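A small sketch of the two formats this commit relies on, written in plain C rather than against the Python headers. POSIX `ssize_t` stands in for `Py_ssize_t` here, which is an assumption for illustration, and the function name `format_sizes` is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>  /* ssize_t (POSIX), used as a stand-in for Py_ssize_t */

/* %zu formats a size_t; %zd formats the corresponding signed type. */
const char *format_sizes(size_t count, ssize_t offset)
{
    static char buf[64];

    snprintf(buf, sizeof buf, "count=%zu offset=%zd", count, offset);
    return buf;
}
```

For example, `format_sizes(3, -1)` produces `count=3 offset=-1` without any casts to `long` or a platform-specific format macro.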
* gh-93103: Update PyUnicode_DecodeFSDefault() doc (#93105) | Victor Stinner | 2022-05-23 | 1 | -38/+24
  Update documentation of PyUnicode_DecodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and PyUnicode_EncodeFSDefault(): they now use the filesystem encoding and error handler of PyConfig, Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors variables are no longer used.
* Document Py_ssize_t. (GH-92512) | Julien Palard | 2022-05-13 | 1 | -4/+4
  It fixes 252 errors from a Sphinx nitpicky run (sphinx-build -n). But there's 8182 errors left.
  Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
* gh-92536: Doc update about Py_UNICODE removal (GH-92756) | Inada Naoki | 2022-05-13 | 1 | -1/+1
* gh-92536: PEP 623: Remove wstr and legacy APIs from Unicode (GH-92537) | Inada Naoki | 2022-05-12 | 1 | -156/+21
* gh-89653: PEP 670: Fix Sphinx syntax in Unicode doc (#92707) | Victor Stinner | 2022-05-12 | 1 | -4/+4
* gh-89653: PEP 670: unicodeobject.h uses _Py_CAST() (#92696) | Victor Stinner | 2022-05-11 | 1 | -3/+3
  Use _Py_CAST() and _Py_STATIC_CAST() in macros wrapping static inline functions of unicodeobject.h.
  Change also the kind type from unsigned int to int: same parameter type than PyUnicode_FromKindAndData().
  The limited API version 3.11 no longer casts arguments to expected types.
* gh-89653: PEP 670: Update C API unicode documentation (#92702) | Victor Stinner | 2022-05-11 | 1 | -10/+11
* Document the lifetime of `PyUnicode_AsUTF8String` (#92325) | Matt Wozniski | 2022-05-06 | 1 | -1/+2
  The current wording implied this, but didn't state it explicitly.
* gh-89653: PEP 670: Amend docs (GH-91813) | Erlend Egeberg Aasland | 2022-04-22 | 1 | -11/+12
* Minor fixes to C API docs (GH-31501) | Jelle Zijlstra | 2022-02-23 | 1 | -4/+4
  * C API docs: move PyErr_SetImportErrorSubclass docs
    It was in the section about warnings, but it makes more sense to put it with PyErr_SetImportError.
  * C API docs: document closeit argument to PyRun_AnyFileExFlags
    It was already documented for PyRun_SimpleFileExFlags.
  * textual fixes to unicode docs
  * Move paragraph about tp_dealloc into tp_dealloc section
  * __aiter__ returns an async iterator, not an awaitable
* closes bpo-46253: Change Py_UNICODE to Py_UCS4 in the C API docs to match the current source code (GH-30387) | Julian Gilbey | 2022-01-11 | 1 | -17/+17
* bpo-43565: Document PyUnicode_KIND's return type as an unsigned int (GH-25724) | Ammar Askar | 2021-07-29 | 1 | -1/+1
* bpo-39560: Document PyUnicode_FromKindAndData() kind transformation (GH-23848) | Zackery Spytz | 2021-06-03 | 1 | -0/+6
* bpo-44029: Remove Py_UNICODE APIs (GH-25881) | Inada Naoki | 2021-05-07 | 1 | -185/+0
  Remove deprecated `Py_UNICODE` APIs: `PyUnicode_Encode`, `PyUnicode_EncodeUTF7`, `PyUnicode_EncodeUTF8`, `PyUnicode_EncodeUTF16`, `PyUnicode_EncodeUTF32`, `PyUnicode_EncodeLatin1`, `PyUnicode_EncodeMBCS`, `PyUnicode_EncodeDecimal`, `PyUnicode_EncodeRawUnicodeEscape`, `PyUnicode_EncodeCharmap`, `PyUnicode_EncodeUnicodeEscape`, `PyUnicode_TransformDecimalToASCII`, `PyUnicode_TranslateCharmap`, `PyUnicodeEncodeError_Create`, `PyUnicodeTranslateError_Create`.
  See :pep:`393` and :pep:`624` for reference.
* bpo-43506: Doc: Update removal schedule for Py_UNICODE encoder APIs (GH-24885) | Inada Naoki | 2021-03-16 | 1 | -11/+15
  See PEP 624.
* bpo-36346: Document removal schedule of deprecate APIs (GH-20879) | Inada Naoki | 2021-02-22 | 1 | -4/+3
  We will remove wstr cache in Python 3.12. See PEP 623.
* bpo-42528: Improve the docs of most Py*_Check{,Exact} API calls (GH-23602) | Antonio Cuni | 2021-01-06 | 1 | -2/+2
  I think that none of these API calls can fail, but only few of them are documented as such. Add the sentence "This function always succeeds" (which is the same already used e.g. by PyNumber_Check) to all of them.
* bpo-42236: Enhance init and encoding documentation (GH-23109) | Victor Stinner | 2020-11-02 | 1 | -6/+5
  Enhance the documentation of the Python startup, filesystem encoding and error handling, locale encoding. Add a new "Python UTF-8 Mode" section.
  * Add "locale encoding" and "filesystem encoding and error handler" to the glossary
  * Remove documentation from Include/cpython/initconfig.h: move it to Doc/c-api/init_config.rst.
  * Doc/c-api/init_config.rst:
    * Document command line options and environment variables
    * Document default values.
  * Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
  * Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
  * Document how Python selects the filesystem encoding and error handler at a single place: PyConfig.filesystem_encoding and PyConfig.filesystem_errors.
  * PyConfig: move orig_argv member at the right place.
* bpo-41784: make PyUnicode_AsUTF8AndSize part of the limited API (GH-22252) | Alex Gaynor | 2020-10-19 | 1 | -0/+3