path: root/Objects/unicodeobject.c
Commit message | Author | Age | Files | Lines
* bpo-40998: Address compiler warnings found by ubsan (GH-20929) | Miss Islington (bot) | 2020-11-18 | 1 | -1/+5
    Signed-off-by: Christian Heimes <christian@python.org>
    Automerge-Triggered-By: GH:tiran
    (cherry picked from commit 07f2adedf0940b06d136208ec386d69b7d2d5b43)
    Co-authored-by: Christian Heimes <christian@python.org>
* Fix typo in unicodeobject.c (GH-23180) | Miss Islington (bot) | 2020-11-10 | 1 | -1/+1
    exeeds -> exceeds
    Automerge-Triggered-By: GH:Mariatta
    (cherry picked from commit 38811d68caf9b782ea7168479acb09557e126efe)
    Co-authored-by: Ikko Ashimine <eltociear@gmail.com>
* bpo-42065: Fix incorrectly formatted _codecs.charmap_decode error message (GH-19940) | Miss Skeleton (bot) | 2020-10-18 | 1 | -1/+1
    (cherry picked from commit 3635388f52b42e5280229104747962117104c453)
    Co-authored-by: Max Bernstein <tekknolagi@users.noreply.github.com>
* [3.9] bpo-41909: Enable previously disabled recursion checks. (GH-22536) (GH-22550) | Serhiy Storchaka | 2020-10-04 | 1 | -2/+0
    Enable recursion checks which were disabled when getting __bases__ of non-type objects in issubclass() and isinstance() and when interning strings. It fixes a stack overflow when getting __bases__ leads to infinite recursion.

    Originally recursion checks were disabled for PyDict_GetItem(), which silences all errors including the one raised in case of detected recursion and can return an incorrect result. But now the code uses PyDict_GetItemWithError() and PyDict_SetDefault() instead.

    (cherry picked from commit 9ece9cd65cdeb0a1f6e60475bbd0219161c348ac)
* bpo-36346: Add Py_DEPRECATED to deprecated unicode APIs (GH-20878) | Inada Naoki | 2020-06-18 | 1 | -0/+23
    Co-authored-by: Kyle Stanley <aeros167@gmail.com>
    Co-authored-by: Victor Stinner <vstinner@python.org>
    (cherry picked from commit 2c4928d37edc5e4aeec3c0b79fa3460b1ec9b60d)
* [3.9] bpo-40514: Remove --with-experimental-isolated-subinterpreters in 3.9 (GH-20228) | Victor Stinner | 2020-05-19 | 1 | -8/+2
    Remove --with-experimental-isolated-subinterpreters configure option in Python 3.9: the experiment continues in the master branch, but it's no longer needed in 3.9.
* bpo-40521: Add PyInterpreterState.unicode (GH-20081) | Victor Stinner | 2020-05-13 | 1 | -31/+33
    Move PyInterpreterState.fs_codec into a new PyInterpreterState.unicode structure. Give a name to the fs_codec structure and use this structure in unicodeobject.c.
* bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078) | Victor Stinner | 2020-05-13 | 1 | -3/+3
    Remove the _PyUnicode_ClearStaticStrings() function from the C API. Make the function fully private (declare it with "static").
* bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053) | Serhiy Storchaka | 2020-05-12 | 1 | -4/+22
* bpo-40593: Improve syntax errors for invalid characters in source code. (GH-20033) | Serhiy Storchaka | 2020-05-12 | 1 | -23/+41
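A minimal Python-level sketch of how the diagnostic from bpo-40593 can be observed, assuming an interpreter that includes this change (3.9 or later). The helper name `syntax_error_message` is hypothetical, and the exact message wording is deliberately not asserted because it may differ between versions.

```python
def syntax_error_message(source: str) -> str:
    """Compile `source`; return the SyntaxError message, or "" if it compiles."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return ""


# U+00A4 (CURRENCY SIGN) is not a valid character in Python source code.
print(syntax_error_message("price = 10 \u00a4 2"))
```

On an interpreter with the change, the printed message should name the offending character rather than report a generic error.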
* bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933) | Victor Stinner | 2020-05-05 | 1 | -15/+63
    When Python is built in the experimental isolated subinterpreters mode, disable Unicode singletons and Unicode interned strings since they are shared by all interpreters.

    Temporary workaround until these caches are made per-interpreter.
* bpo-39939: Add str.removeprefix and str.removesuffix (GH-18939) | sweeneyde | 2020-04-22 | 1 | -0/+57
    Added str.removeprefix and str.removesuffix methods and corresponding bytes, bytearray, and collections.UserString methods to remove affixes from a string if present. See PEP 616 for a full description.
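A short sketch of the methods added by bpo-39939, assuming Python 3.9 or later. The helper `strip_affixes` is a hypothetical name used only for illustration.

```python
def strip_affixes(name: str, prefix: str, suffix: str) -> str:
    """Remove `prefix` and `suffix` from `name` when they are present."""
    return name.removeprefix(prefix).removesuffix(suffix)


print(strip_affixes("test_unicode.py", "test_", ".py"))  # unicode
# A string without the affix is returned unchanged.
print("unicode.py".removeprefix("test_"))  # unicode.py
# The same methods exist on bytes and bytearray.
print(b"test_bytes".removeprefix(b"test_"))  # b'bytes'
```

Unlike str.lstrip() and str.rstrip(), which treat their argument as a set of characters, these methods remove the affix as a whole substring and at most once.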
* bpo-40268: Remove a few pycore_pystate.h includes (GH-19510) | Victor Stinner | 2020-04-14 | 1 | -2/+3
* bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509) | Victor Stinner | 2020-04-14 | 1 | -4/+4
    Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET() for consistency with _PyThreadState_GET() and to have a shorter name (help to fit into 80 columns).

    Add also "assert(tstate != NULL);" to the function.
* bpo-40268: Add _PyInterpreterState_GetConfig() (GH-19492) | Victor Stinner | 2020-04-13 | 1 | -7/+9
    Don't access PyInterpreterState.config member directly anymore, but use new functions:

    * _PyInterpreterState_GetConfig()
    * _PyInterpreterState_SetConfig()
    * _Py_GetConfig()
* bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. (GH-19472) | Serhiy Storchaka | 2020-04-12 | 1 | -3/+3
* bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode data. (GH-19345) | Serhiy Storchaka | 2020-04-11 | 1 | -133/+162
* bpo-40170: Add _PyIndex_Check() internal function (GH-19426) | Victor Stinner | 2020-04-08 | 1 | -1/+2
    Add _PyIndex_Check() function to the internal C API: fast inlined version of PyIndex_Check().

    Add Include/internal/pycore_abstract.h header file.

    Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects and Python subdirectories.
* bpo-37388: Don't check encoding/errors during finalization (GH-19409) | Victor Stinner | 2020-04-07 | 1 | -0/+6
    str.encode() and str.decode() no longer check the encoding and errors in development mode or in debug mode during Python finalization. The codecs machinery can no longer work on very late calls to str.encode() and str.decode().

    This change should help to call _PyObject_Dump() to debug during late Python finalization.
* bpo-40130: _PyUnicode_AsKind() should not be exported. (GH-19265) | Serhiy Storchaka | 2020-04-01 | 1 | -49/+46
    Make it a static function, and pass known attributes (kind, data, length) instead of the PyUnicode object.
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985) | Inada Naoki | 2020-03-14 | 1 | -35/+0
    * Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)"
      This reverts commit c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b.
    * Update unicodeobject.h
* bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659) | Inada Naoki | 2020-03-14 | 1 | -0/+35
    Co-authored-by: Victor Stinner <vstinner@python.org>
* bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601) | Andy Lester | 2020-03-04 | 1 | -2/+2
* bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327) | Inada Naoki | 2020-02-27 | 1 | -25/+73
    Avoid using temporary bytes object.
* closes bpo-39684: Combine two if/thens and squash uninit var warning. (GH-18565) | Andy Lester | 2020-02-21 | 1 | -8/+3
* bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519) | Hai Shi | 2020-02-17 | 1 | -2/+2
* bpo-35081: Move bytes_methods.h to the internal C API (GH-18492) | Victor Stinner | 2020-02-12 | 1 | -1/+1
    Move the bytes_methods.h header file to the internal C API as pycore_bytes_methods.h: it only contains private symbols (prefixed by "_Py"), except for the PyDoc_STRVAR_shared() macro.
* bpo-39605: Remove a cast that causes a warning. (GH-18473) | Benjamin Peterson | 2020-02-12 | 1 | -1/+1
* closes bpo-39605: Fix some casts to not cast away const. (GH-18453) | Andy Lester | 2020-02-12 | 1 | -15/+15
    gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either:

    Adding the const to the type cast, as in:

    -    return _PyUnicode_FromUCS1((unsigned char*)s, size);
    +    return _PyUnicode_FromUCS1((const unsigned char*)s, size);

    or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in:

    -    PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno);
    +    PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno);

    These changes will not change code, but they will make it much easier to check for errors in consts
* bpo-39245: Switch to public API for Vectorcall (GH-18460) | Petr Viktorin | 2020-02-11 | 1 | -4/+4
    The bulk of this patch was generated automatically with:

        for name in \
            PyObject_Vectorcall \
            Py_TPFLAGS_HAVE_VECTORCALL \
            PyObject_VectorcallMethod \
            PyVectorcall_Function \
            PyObject_CallOneArg \
            PyObject_CallMethodNoArgs \
            PyObject_CallMethodOneArg \
        ; do
            echo $name
            git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g"
        done

        old=_PyObject_FastCallDict
        new=PyObject_VectorcallDict
        git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g"

    and then cleaned up:

    - Revert changes to in docs & news
    - Revert changes to backcompat defines in headers
    - Nudge misaligned comments
* bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397) | Victor Stinner | 2020-02-11 | 1 | -14/+33
    PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the string is not ready.
* bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392) | Victor Stinner | 2020-02-07 | 1 | -4/+4
    Replace direct access to PyObject.ob_type with Py_TYPE().
* bpo-39573: Add Py_SET_REFCNT() function (GH-18389) | Victor Stinner | 2020-02-07 | 1 | -2/+2
    Add a Py_SET_REFCNT() function to set the reference counter of an object.
* Add PyInterpreterState.fs_codec.utf8 (GH-18367) | Victor Stinner | 2020-02-05 | 1 | -46/+47
    Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize().

    Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().
* bpo-39542: Simplify _Py_NewReference() (GH-18332) | Victor Stinner | 2020-02-03 | 1 | -1/+5
    * Remove _Py_INC_REFTOTAL and _Py_DEC_REFTOTAL macros: modify directly _Py_RefTotal.
    * _Py_ForgetReference() is no longer defined if the Py_TRACE_REFS macro is not defined.
    * Remove _Py_NewReference() implementation from object.c: unify the two implementations in object.h inline function.
    * Fix Py_TRACE_REFS build: _Py_INC_TPALLOCS() macro has been removed.
* bpo-38631: Avoid Py_FatalError() in unicodeobject.c (GH-18281) | Victor Stinner | 2020-01-30 | 1 | -23/+28
    Replace Py_FatalError() calls with _PyErr_WriteUnraisableMsg(), _PyObject_ASSERT_FAILED_MSG() or Py_UNREACHABLE() in unicode_dealloc() and unicode_release_interned().
* Fix compiler warning in Objects/unicodeobject.c (GH-17440) | Pablo Galindo | 2019-12-02 | 1 | -1/+1
* bpo-38896: Remove PyUnicode_ClearFreeList() function (GH-17354) | Victor Stinner | 2019-11-23 | 1 | -9/+0
    Remove PyUnicode_ClearFreeList() function: the Unicode free list has been removed in Python 3.3.
* bpo-38858: Call _PyUnicode_Fini() in Py_EndInterpreter() (GH-17330) | Victor Stinner | 2019-11-22 | 1 | -16/+19
    Py_EndInterpreter() now clears the filesystem codec.
* bpo-28029: Make "".replace("", s, n) return s for any n != 0. (GH-16981) | Serhiy Storchaka | 2019-10-30 | 1 | -1/+4
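A small sketch of the behaviour described by bpo-28029, assuming an interpreter that includes this change (3.9 or later). The helper name `replace_in_empty` is hypothetical.

```python
def replace_in_empty(replacement: str, count: int) -> str:
    """Replace the empty pattern in the empty string, at most `count` times."""
    return "".replace("", replacement, count)


# Any non-zero count, including the default of -1, yields the replacement.
for count in (-1, 1, 5):
    print(count, repr(replace_in_empty("s", count)))

# A count of zero performs no replacement at all.
print(0, repr(replace_in_empty("s", 0)))
```

The point of the change is consistency: a positive count now gives the same result as the unlimited default.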
* bpo-38409: Fix grammar in str.strip() docstring (GH-16682) | Zachary Ware | 2019-10-09 | 1 | -2/+2
* bpo-36389: _PyObject_CheckConsistency() available in release mode (GH-16612) | Victor Stinner | 2019-10-07 | 1 | -42/+42
    bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is now also available in release mode. For example, it can be used to debug a crash in the visit_decref() function of the GC.

    Modify the following functions to also work in release mode:

    * _PyDict_CheckConsistency()
    * _PyObject_CheckConsistency()
    * _PyType_CheckConsistency()
    * _PyUnicode_CheckConsistency()

    Other changes:

    * _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL (equals to 0).
    * _PyBytesWriter_CheckConsistency() now returns 1 and is only used with assert().
    * Reorder _PyObject_Dump() to write safe fields first, and only attempt to render repr() at the end.
* bpo-38353: Cleanup includes in the internal C API (GH-16548) | Victor Stinner | 2019-10-02 | 1 | -1/+2
    Use forward declaration of types to avoid includes in the internal C API. Add also comment to justify other includes.
* bpo-38236: Dump path config at first import error (GH-16300) | Victor Stinner | 2019-09-23 | 1 | -9/+10
    Python now dumps path configuration if it fails to import the Python codecs of the filesystem and stdio encodings.
* bpo-37206: Unrepresentable default values no longer represented as None. (GH-13933) | Serhiy Storchaka | 2019-09-14 | 1 | -5/+5
    In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values (like in the optional third parameter of getattr). "None" should be used if None is accepted as argument and passing None has the same effect as not passing the argument at all.
* Fix unused variable and signed/unsigned warnings (GH-15537) | Raymond Hettinger | 2019-08-27 | 1 | -0/+6
* bpo-36311: Fixes decoding multibyte characters around chunk boundaries and improves decoding performance (GH-15083) | Steve Dower | 2019-08-21 | 1 | -6/+10
* bpo-37483: add _PyObject_CallOneArg() function (#14558) | Jeroen Demeyer | 2019-07-04 | 1 | -6/+4
* bpo-37388: Add PyUnicode_Decode(str, 0) fast-path (GH-14385) | Victor Stinner | 2019-06-25 | 1 | -0/+4
    Add a fast-path to PyUnicode_Decode() for size equal to 0.
* bpo-37388: Development mode check encoding and errors (GH-14341) | Victor Stinner | 2019-06-25 | 1 | -5/+63
    In development mode and in debug build, encoding and errors arguments are now checked on string encoding and decoding operations. Examples: open(), str.encode() and bytes.decode().

    By default, for best performance, the errors argument is only checked at the first encoding/decoding error, and the encoding argument is sometimes ignored for empty strings.
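A hedged sketch of the lazy check on the errors argument described in bpo-37388. The helper name `encode_with_unknown_handler` and the handler name "no-such-handler" are made up for illustration. The outcome for pure-ASCII input depends on how the interpreter is run, so the code reports it instead of assuming it.

```python
import sys


def encode_with_unknown_handler(text: str) -> str:
    """Encode `text` to ASCII with a bogus error handler and report the outcome."""
    try:
        text.encode("ascii", errors="no-such-handler")
    except LookupError:
        return "rejected"
    return "accepted"


print("development mode:", sys.flags.dev_mode)
# No encoding error occurs here, so by default the handler name is never looked up.
# Under `python -X dev` or in a debug build the bogus name is expected to be rejected.
print(encode_with_unknown_handler("abc"))
# A non-ASCII character forces the handler lookup, which fails in every mode.
print(encode_with_unknown_handler("caf\u00e9"))
```

Running the same snippet with and without `-X dev` shows the difference for the first call, while the second call is rejected either way.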