summaryrefslogtreecommitdiffstats
path: root/Objects/unicodeobject.c
Commit message (Collapse)AuthorAgeFilesLines
* bpo-41123: Remove Py_UNICODE_str* functions (GH-21164)Inada Naoki2020-06-271-88/+0
| | | They are undocumented and deprecated since Python 3.3.
* bpo-40521: Optimize PyBytes_FromStringAndSize(str, 0) (GH-21142)Victor Stinner2020-06-251-24/+35
| | | | | | | | | | | | | | | | Always create the empty bytes string singleton. Optimize PyBytes_FromStringAndSize(str, 0): it no longer has to check if the empty string singleton was created or not, it is always available. Add functions: * _PyBytes_Init() * bytes_get_empty(), bytes_new_empty() * bytes_create_empty_string_singleton() * unicode_create_empty_string_singleton() _Py_unicode_state: rename empty structure member to empty_string.
* bpo-40521: Make Unicode latin1 singletons per interpreter (GH-21101)Victor Stinner2020-06-241-42/+32
| | | | | | | | | Each interpreter now has its own Unicode latin1 singletons. Remove "ifdef EXPERIMENTAL_ISOLATED_SUBINTERPRETERS" and "ifdef LATIN1_SINGLETONS": always enable latin1 singletons. Optimize unicode_result_ready(): only attempt to get a latin1 singleton for PyUnicode_1BYTE_KIND.
* bpo-40521: Optimize PyUnicode_New(0, maxchar) (GH-21099)Victor Stinner2020-06-231-55/+25
| | | | | | Functions of unicodeobject.c, like PyUnicode_New(), no longer check if the empty Unicode singleton has been initialized or not. Consider that it is always initialized. The Unicode API must not be used before _PyUnicode_Init() or after _PyUnicode_Fini().
* bpo-40521: Make empty Unicode string per interpreter (GH-21096)Victor Stinner2020-06-231-73/+117
| | | Each interpreter now has its own empty Unicode string singleton.
* bpo-36346: Add Py_DEPRECATED to deprecated unicode APIs (GH-20878)Inada Naoki2020-06-171-0/+23
| | | | Co-authored-by: Kyle Stanley <aeros167@gmail.com> Co-authored-by: Victor Stinner <vstinner@python.org>
* bpo-40989: PyObject_INIT() becomes an alias to PyObject_Init() (GH-20901)Victor Stinner2020-06-151-6/+7
| | | | | | | | | | | | | | The PyObject_INIT() and PyObject_INIT_VAR() macros become aliases to, respectively, PyObject_Init() and PyObject_InitVar() functions. Rename _PyObject_INIT() and _PyObject_INIT_VAR() static inline functions to, respectively, _PyObject_Init() and _PyObject_InitVar(), and move them to pycore_object.h. Remove their return value: their return type becomes void. The _datetime module is now built with the Py_BUILD_CORE_MODULE macro defined. Remove an outdated comment on _Py_tracemalloc_config.
* bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781)Victor Stinner2020-06-101-35/+33
| | | | | | | The PEP 353, written in 2005, introduced PY_FORMAT_SIZE_T. Python no longer supports macOS 10.4 and Visual Studio 2010, but requires more recent macOS and Visual Studio versions. In 2020 with Python 3.10, it is now safe to use directly "%zu" to format size_t and "%zi" to format Py_ssize_t.
* bpo-40881: Fix unicode_release_interned() (GH-20699)Victor Stinner2020-06-071-2/+2
| | | Use Py_SET_REFCNT() in unicode_release_interned().
* bpo-39465: Cleanup _PyUnicode_FromId() code (GH-20595)Victor Stinner2020-06-021-10/+16
| | | Work on a local variable before filling _Py_Identifier members.
* bpo-40792: Make the result of PyNumber_Index() always having exact type int. ↵Serhiy Storchaka2020-05-281-1/+1
| | | | | | | | | | | | (GH-20443) Previously, the result could have been an instance of a subclass of int. Also revert bpo-26202 and make attributes start, stop and step of the range object having exact type int. Add private function _PyNumber_Index() which preserves the old behavior of PyNumber_Index() for performance to use it in the conversion functions like PyLong_AsLong().
* bpo-40521: Add PyInterpreterState.unicode (GH-20081)Victor Stinner2020-05-131-31/+33
| | | | | | | Move PyInterpreterState.fs_codec into a new PyInterpreterState.unicode structure. Give a name to the fs_codec structure and use this structure in unicodeobject.c.
* bpo-39465: Remove _PyUnicode_ClearStaticStrings() from C API (GH-20078)Victor Stinner2020-05-131-3/+3
| | | | Remove the _PyUnicode_ClearStaticStrings() function from the C API. Make the function fully private (declare it with "static").
* bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing ↵Serhiy Storchaka2020-05-121-4/+22
| | | | non-BMP characters on Windows. (GH-20053)
* bpo-40593: Improve syntax errors for invalid characters in source code. ↵Serhiy Storchaka2020-05-121-23/+41
| | | | (GH-20033)
* bpo-40521: Disable Unicode caches in isolated subinterpreters (GH-19933)Victor Stinner2020-05-051-15/+63
| | | | | | | When Python is built in the experimental isolated subinterpreters mode, disable Unicode singletons and Unicode interned strings since they are shared by all interpreters. Temporary workaround until these caches are made per-interpreter.
* bpo-39939: Add str.removeprefix and str.removesuffix (GH-18939)sweeneyde2020-04-221-0/+57
| | | | | Added str.removeprefix and str.removesuffix methods and corresponding bytes, bytearray, and collections.UserString methods to remove affixes from a string if present. See PEP 616 for a full description.
* bpo-40268: Remove a few pycore_pystate.h includes (GH-19510)Victor Stinner2020-04-141-2/+3
|
* bpo-40268: Rename _PyInterpreterState_GET_UNSAFE() (GH-19509)Victor Stinner2020-04-141-4/+4
| | | | | | | Rename _PyInterpreterState_GET_UNSAFE() to _PyInterpreterState_GET() for consistency with _PyThreadState_GET() and to have a shorter name (help to fit into 80 columns). Add also "assert(tstate != NULL);" to the function.
* bpo-40268: Add _PyInterpreterState_GetConfig() (GH-19492)Victor Stinner2020-04-131-7/+9
| | | | | | | | Don't access PyInterpreterState.config member directly anymore, but use new functions: * _PyInterpreterState_GetConfig() * _PyInterpreterState_SetConfig() * _Py_GetConfig()
* bpo-39943: Add the const qualifier to pointers on non-mutable PyBytes data. ↵Serhiy Storchaka2020-04-121-3/+3
| | | | (GH-19472)
* bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode ↵Serhiy Storchaka2020-04-111-133/+162
| | | | data. (GH-19345)
* bpo-40170: Add _PyIndex_Check() internal function (GH-19426)Victor Stinner2020-04-081-1/+2
| | | | | | | | | Add _PyIndex_Check() function to the internal C API: fast inlined verson of PyIndex_Check(). Add Include/internal/pycore_abstract.h header file. Replace PyIndex_Check() with _PyIndex_Check() in C files of Objects and Python subdirectories.
* bpo-37388: Don't check encoding/errors during finalization (GH-19409)Victor Stinner2020-04-071-0/+6
| | | | | | | | | str.encode() and str.decode() no longer check the encoding and errors in development mode or in debug mode during Python finalization. The codecs machinery can no longer work on very late calls to str.encode() and str.decode(). This change should help to call _PyObject_Dump() to debug during late Python finalization.
* bpo-40130: _PyUnicode_AsKind() should not be exported. (GH-19265)Serhiy Storchaka2020-04-011-49/+46
| | | | | Make it a static function, and pass known attributes (kind, data, length) instead of the PyUnicode object.
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985)Inada Naoki2020-03-141-35/+0
| | | | | | | * Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)" This reverts commit c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b. * Update unicodeobject.h
* bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)Inada Naoki2020-03-141-0/+35
| | | Co-authored-by: Victor Stinner <vstinner@python.org>
* bpo-39573: Finish converting to new Py_IS_TYPE() macro (GH-18601)Andy Lester2020-03-041-2/+2
|
* bpo-39087: Optimize PyUnicode_AsUTF8AndSize() (GH-18327)Inada Naoki2020-02-271-25/+73
| | | Avoid using temporary bytes object.
* closes bpo-39684: Combine two if/thens and squash uninit var warning. (GH-18565)Andy Lester2020-02-211-8/+3
|
* bpo-39500: Fix compile warnings in unicodeobject.c (GH-18519)Hai Shi2020-02-171-2/+2
|
* bpo-35081: Move bytes_methods.h to the internal C API (GH-18492)Victor Stinner2020-02-121-1/+1
| | | | | Move the bytes_methods.h header file to the internal C API as pycore_bytes_methods.h: it only contains private symbols (prefixed by "_Py"), except of the PyDoc_STRVAR_shared() macro.
* bpo-39605: Remove a cast that causes a warning. (GH-18473)Benjamin Peterson2020-02-121-1/+1
|
* closes bpo-39605: Fix some casts to not cast away const. (GH-18453)Andy Lester2020-02-121-15/+15
| | | | | | | | | | | | | | | gcc -Wcast-qual turns up a number of instances of casting away constness of pointers. Some of these can be safely modified, by either: Adding the const to the type cast, as in: - return _PyUnicode_FromUCS1((unsigned char*)s, size); + return _PyUnicode_FromUCS1((const unsigned char*)s, size); or, Removing the cast entirely, because it's not necessary (but probably was at one time), as in: - PyDTrace_FUNCTION_ENTRY((char *)filename, (char *)funcname, lineno); + PyDTrace_FUNCTION_ENTRY(filename, funcname, lineno); These changes will not change code, but they will make it much easier to check for errors in consts
* bpo-39245: Switch to public API for Vectorcall (GH-18460)Petr Viktorin2020-02-111-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The bulk of this patch was generated automatically with: for name in \ PyObject_Vectorcall \ Py_TPFLAGS_HAVE_VECTORCALL \ PyObject_VectorcallMethod \ PyVectorcall_Function \ PyObject_CallOneArg \ PyObject_CallMethodNoArgs \ PyObject_CallMethodOneArg \ ; do echo $name git grep -lwz _$name | xargs -0 sed -i "s/\b_$name\b/$name/g" done old=_PyObject_FastCallDict new=PyObject_VectorcallDict git grep -lwz $old | xargs -0 sed -i "s/\b$old\b/$new/g" and then cleaned up: - Revert changes to in docs & news - Revert changes to backcompat defines in headers - Nudge misaligned comments
* bpo-39500: Document PyUnicode_IsIdentifier() function (GH-18397)Victor Stinner2020-02-111-14/+33
| | | | PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the string is not ready.
* bpo-39573: Use Py_TYPE() macro in Objects directory (GH-18392)Victor Stinner2020-02-071-4/+4
| | | Replace direct access to PyObject.ob_type with Py_TYPE().
* bpo-39573: Add Py_SET_REFCNT() function (GH-18389)Victor Stinner2020-02-071-2/+2
| | | | Add a Py_SET_REFCNT() function to set the reference counter of an object.
* Add PyInterpreterState.fs_codec.utf8 (GH-18367)Victor Stinner2020-02-051-46/+47
| | | | | | Add a fast-path for UTF-8 encoding in PyUnicode_EncodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize(). Add _PyUnicode_FiniEncodings() helper function for _PyUnicode_Fini().
* bpo-39542: Simplify _Py_NewReference() (GH-18332)Victor Stinner2020-02-031-1/+5
| | | | | | | | | * Remove _Py_INC_REFTOTAL and _Py_DEC_REFTOTAL macros: modify directly _Py_RefTotal. * _Py_ForgetReference() is no longer defined if the Py_TRACE_REFS macro is not defined. * Remove _Py_NewReference() implementation from object.c: unify the two implementations in object.h inline function. * Fix Py_TRACE_REFS build: _Py_INC_TPALLOCS() macro has been removed.
* bpo-38631: Avoid Py_FatalError() in unicodeobject.c (GH-18281)Victor Stinner2020-01-301-23/+28
| | | | | Replace Py_FatalError() calls with _PyErr_WriteUnraisableMsg(), _PyObject_ASSERT_FAILED_MSG() or Py_UNREACHABLE() in unicode_dealloc() and unicode_release_interned().
* Fix compiler warning in Objects/unicodeobject.c (GH-17440)Pablo Galindo2019-12-021-1/+1
|
* bpo-38896: Remove PyUnicode_ClearFreeList() function (GH-17354)Victor Stinner2019-11-231-9/+0
| | | | Remove PyUnicode_ClearFreeList() function: the Unicode free list has been removed in Python 3.3.
* bpo-38858: Call _PyUnicode_Fini() in Py_EndInterpreter() (GH-17330)Victor Stinner2019-11-221-16/+19
| | | Py_EndInterpreter() now clears the filesystem codec.
* bpo-28029: Make "".replace("", s, n) returning s for any n != 0. (GH-16981)Serhiy Storchaka2019-10-301-1/+4
|
* bpo-38409: Fix grammar in str.strip() docstring (GH-16682)Zachary Ware2019-10-091-2/+2
|
* bpo-36389: _PyObject_CheckConsistency() available in release mode (GH-16612)Victor Stinner2019-10-071-42/+42
| | | | | | | | | | | | | | | | | | | | | bpo-36389, bpo-38376: The _PyObject_CheckConsistency() function is now also available in release mode. For example, it can be used to debug a crash in the visit_decref() function of the GC. Modify the following functions to also work in release mode: * _PyDict_CheckConsistency() * _PyObject_CheckConsistency() * _PyType_CheckConsistency() * _PyUnicode_CheckConsistency() Other changes: * _PyMem_IsPtrFreed(ptr) now also returns 1 if ptr is NULL (equals to 0). * _PyBytesWriter_CheckConsistency() now returns 1 and is only used with assert(). * Reorder _PyObject_Dump() to write safe fields first, and only attempt to render repr() at the end.
* bpo-38353: Cleanup includes in the internal C API (GH-16548)Victor Stinner2019-10-021-1/+2
| | | | Use forward declaration of types to avoid includes in the internal C API. Add also comment to justify other includes.
* bpo-38236: Dump path config at first import error (GH-16300)Victor Stinner2019-09-231-9/+10
| | | | Python now dumps path configuration if it fails to import the Python codecs of the filesystem and stdio encodings.
* bpo-37206: Unrepresentable default values no longer represented as None. ↵Serhiy Storchaka2019-09-141-5/+5
| | | | | | | (GH-13933) In ArgumentClinic, value "NULL" should now be used only for unrepresentable default values (like in the optional third parameter of getattr). "None" should be used if None is accepted as argument and passing None has the same effect as not passing the argument at all.