cpython.git - https://github.com/python/cpython.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	gh-92536: Remove PyUnicode_READY() calls (#105210)	Victor Stinner	2023-06-01	1	-4/+0
\| \| \| \|	Since Python 3.12, PyUnicode_READY() does nothing and always returns 0.
*	gh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (gh-104205)	Eric Snow	2023-05-05	1	-0/+1
\| \| \|	Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules. We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
*	gh-101372: Fix unicodedata.is_normalized to properly handle the UCD 3… ↵	Dong-hee Na	2023-02-06	1	-1/+1
\| \| \| \|	(gh-101388)
*	gh-99300: Use Py_NewRef() in Modules/ directory (#99473)	Victor Stinner	2022-11-14	1	-20/+10
\| \| \| \|	Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in test C files of the Modules/ directory.
*	closes gh-96734: Update to Unicode 15.0.0. (GH-96809)	Benjamin Peterson	2022-09-13	1	-2/+3
\|
*	Remove usage of _Py_IDENTIFIER from unicodedata module. (GH-91532)	Dong-hee Na	2022-04-15	1	-14/+8
\|
*	bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized ↵	Eric Snow	2022-02-08	1	-0/+1
\|	global objects. (gh-30928)
\|
\|	We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules.
\|
\|	The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings).
\|
\|	https://bugs.python.org/issue46541#msg411799 explains the rationale for this change.
\|
\|	The core of the change is in:
\|
\|	* (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros
\|	* Include/internal/pycore_runtime_init.h - added the static initializers for the global strings
\|	* Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState
\|	* Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers
\|
\|	I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config.
\|
\|	The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _PyId functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _PyId(), replacing the _Py_Identifier * parameter with PyObject *.
\|
\|	The following are not changed (yet):
\|
\|	* stop using _Py_IDENTIFIER() in the stdlib modules
\|	* (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI is using this (private) API
\|	* (maybe) intern the strings during runtime init
\|
\|	https://bugs.python.org/issue46541
*	bpo-43974: Move Py_BUILD_CORE_MODULE into module code (GH-29157)	Christian Heimes	2021-10-22	1	-0/+4
\|	setup.py no longer defines Py_BUILD_CORE_MODULE. Instead every module defines the macro before #include "Python.h" unless Py_BUILD_CORE_BUILTIN is already defined.
\|
\|	Py_BUILD_CORE_BUILTIN is defined for every module that is built by Modules/Setup.
\|
\|	The PR also simplifies Modules/Setup. Makefile and makesetup already define Py_BUILD_CORE_BUILTIN and include Modules/internal for us.
\|
\|	Signed-off-by: Christian Heimes <christian@python.org>
*	closes bpo-45190: Update Unicode data to version 14.0.0. (GH-28336)	Benjamin Peterson	2021-09-14	1	-3/+3
\|
*	bpo-44987: Speed up unicode normalization of ASCII strings (GH-28283)	Dong-hee Na	2021-09-11	1	-0/+4
\|
*	Remove irrelevant comment which was added in 2a70a3a (GH-27044)	Srinivas Reddy Thatiparthy (శ్రీనివాస్ రెడ్డి తాటిపర్తి)	2021-07-09	1	-1/+0
\|
*	bpo-43908: Make heap types converted during 3.10 alpha immutable (GH-26351)	Erlend Egeberg Aasland	2021-06-17	1	-1/+1
\|	* Make functools types immutable
\|	* Multibyte codec types are now immutable
\|	* pyexpat.xmlparser is now immutable
\|	* array.arrayiterator is now immutable
\|	* _thread types are now immutable
\|	* _csv types are now immutable
\|	* _queue.SimpleQueue is now immutable
\|	* mmap.mmap is now immutable
\|	* unicodedata.UCD is now immutable
\|	* sqlite3 types are now immutable
\|	* _lsprof.Profiler is now immutable
\|	* _overlapped.Overlapped is now immutable
\|	* _operator types are now immutable
\|	* _winapi.Overlapped is now immutable
\|	* _lzma types are now immutable
\|	* _bz2 types are now immutable
\|	* _dbm.dbm and _gdbm.gdbm are now immutable
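What "immutable" means for unicodedata.UCD can be observed from Python. The following is a minimal sketch, assuming Python 3.10 or later; the helper type_is_immutable() and the attribute name hypothetical_attribute are illustrative and not part of CPython.

```python
import unicodedata

# unicodedata.ucd_3_2_0 is an instance of the unicodedata.UCD type.
UCD = type(unicodedata.ucd_3_2_0)


def type_is_immutable(tp):
    """Return True if setting a new attribute on the type raises TypeError."""
    try:
        tp.hypothetical_attribute = 1  # illustrative attribute name
    except TypeError:
        return True
    # The assignment unexpectedly succeeded: undo it and report mutability.
    del tp.hypothetical_attribute
    return False


class Ordinary:
    """A plain Python class, which stays mutable."""


print(type_is_immutable(UCD))
print(type_is_immutable(Ordinary))
```

An ordinary Python class accepts the assignment, while an immutable extension type rejects it with TypeError.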
*	bpo-42972: Fully support GC for pyexpat, unicodedata, and dbm/gdbm heap ↵	Erlend Egeberg Aasland	2021-05-27	1	-3/+14
\|	types (GH-26376)
\|
\|	* bpo-42972: pyexpat
\|	* bpo-42972: unicodedata
\|	* bpo-42972: dbm/gdbm
*	Do not use Py_ssize_clean_t (GH-25940)	Inada Naoki	2021-05-08	1	-2/+2
\|
*	bpo-43916: Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to selected types (GH-25748)	Erlend Egeberg Aasland	2021-04-30	1	-1/+1
\|	Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the following types:
\|
\|	* _dbm.dbm
\|	* _gdbm.gdbm
\|	* _multibytecodec.MultibyteCodec
\|	* _sre.SRE_Scanner
\|	* _thread._localdummy
\|	* _thread.lock
\|	* _winapi.Overlapped
\|	* array.arrayiterator
\|	* functools.KeyWrapper
\|	* functools._lru_list_elem
\|	* pyexpat.xmlparser
\|	* re.Match
\|	* re.Pattern
\|	* unicodedata.UCD
\|	* zlib.Compress
\|	* zlib.Decompress
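The visible effect of Py_TPFLAGS_DISALLOW_INSTANTIATION on unicodedata.UCD is that calling the type fails. A short sketch, assuming a current CPython; can_instantiate() is a hypothetical helper written for this illustration.

```python
import unicodedata

# The UCD type is only reachable through an existing instance.
UCD = type(unicodedata.ucd_3_2_0)


def can_instantiate(tp):
    """Return True if calling the type with no arguments succeeds."""
    try:
        tp()
    except TypeError:
        return False
    return True


print(can_instantiate(UCD))   # the type refuses direct instantiation
print(can_instantiate(int))   # ordinary built-in types can be called
```

The only UCD instance exposed by the module remains unicodedata.ucd_3_2_0.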
*	bpo-41798: Allocate unicodedata CAPI on the heap (GH-24128)	Erlend Egeberg Aasland	2021-01-20	1	-8/+29
\|
*	bpo-42519: Replace PyObject_MALLOC() with PyObject_Malloc() (GH-23587)	Victor Stinner	2020-12-01	1	-1/+1
\|	No longer use deprecated aliases to functions:
\|
\|	* Replace PyObject_MALLOC() with PyObject_Malloc()
\|	* Replace PyObject_REALLOC() with PyObject_Realloc()
\|	* Replace PyObject_FREE() with PyObject_Free()
\|	* Replace PyObject_Del() with PyObject_Free()
\|	* Replace PyObject_DEL() with PyObject_Free()
*	bpo-42157: Rename unicodedata.ucnhash_CAPI (GH-22994)	Victor Stinner	2020-10-27	1	-2/+2
\| \| \| \| \| \| \|	Removed the unicodedata.ucnhash_CAPI attribute which was an internal PyCapsule object. The related private _PyUnicode_Name_CAPI structure was moved to the internal C API. Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
*	bpo-42157: Convert unicodedata.UCD to heap type (GH-22991)	Victor Stinner	2020-10-26	1	-76/+44
\|	Convert the unicodedata extension module to the multiphase initialization API (PEP 489) and convert the unicodedata.UCD static type to a heap type.
\|
\|	Co-Authored-By: Mohamed Koubaa <koubaa.m@gmail.com>
*	bpo-42157: unicodedata avoids references to UCD_Type (GH-22990)	Victor Stinner	2020-10-26	1	-105/+111
\|	* UCD_Check() uses PyModule_Check()
\|	* Simplify the internal _PyUnicode_Name_CAPI structure:
\|	  * Remove size and state members
\|	  * Remove state and self parameters of getcode() and getname() functions
\|	* Remove global_module_state
*	bpo-1635741: _PyUnicode_Name_CAPI moves to internal C API (GH-22713)	Victor Stinner	2020-10-26	1	-13/+15
\|	The private _PyUnicode_Name_CAPI structure of the PyCapsule API unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the structure gets a new state member which must be passed to the getcode() and getname() functions.
\|
\|	* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
\|	* unicodedata module is now built with Py_BUILD_CORE_MODULE.
\|	* unicodedata: move hashAPI variable into unicodedata_module_state.
*	bpo-1635741: Add a global module state to unicodedata (GH-22712)	Victor Stinner	2020-10-15	1	-54/+107
\| \| \| \| \| \|	Prepare unicodedata to add a state per module: start with a global "module" state, pass it to subfunctions which access &UCD_Type. This change also prepares the conversion of the UCD_Type static type to a heap type.
*	bpo-1635741, unicodedata: add ucd_type parameter to UCD_Check() macro (GH-22328)	Mohamed Koubaa	2020-09-23	1	-13/+16
\| \| \|	Co-authored-by: Victor Stinner <vstinner@python.org>
*	bpo-40268: Remove unused structmember.h includes (GH-19530)	Victor Stinner	2020-04-15	1	-1/+1
\| \| \| \| \| \|	If only offsetof() is needed: include stddef.h instead. When structmember.h is used, add a comment explaining that PyMemberDef is used.
*	bpo-39943: Add the const qualifier to pointers on non-mutable PyUnicode ↵	Serhiy Storchaka	2020-04-11	1	-3/+3
\| \| \| \|	data. (GH-19345)
*	bpo-39943: Remove unused self from find_nfc_index() (GH-18973)	Andy Lester	2020-03-17	1	-4/+4
\|
*	closes bpo-39926: Update Unicode to 13.0.0. (GH-18910)	Benjamin Peterson	2020-03-11	1	-4/+5
\|
*	bpo-39573: Clean up modules and headers to use Py_IS_TYPE() function (GH-18521)	Dong-hee Na	2020-02-17	1	-1/+1
\|
*	bpo-39573: Add Py_SET_TYPE() function (GH-18394)	Victor Stinner	2020-02-07	1	-1/+1
\| \| \|	Add Py_SET_TYPE() function to set the type of an object.
*	bpo-37752: Delete redundant Py_CHARMASK in normalizestring() (GH-15095)	Jordon Xu	2019-09-10	1	-2/+2
\|
*	bpo-38043: Use `bool` for boolean flags on is_normalized_quickcheck. (GH-15711)	Greg Price	2019-09-09	1	-11/+11
\|
*	closes bpo-37966: Fully implement the UAX #15 quick-check algorithm. (GH-15558)	Greg Price	2019-09-04	1	-24/+51
\|	The purpose of the `unicodedata.is_normalized` function is to answer the question `str == unicodedata.normalize(form, str)` more efficiently than writing just that, by using the "quick check" optimization described in the Unicode standard in UAX #15.
\|
\|	However, it turns out the code doesn't implement the full algorithm from the standard, and as a result we often miss the optimization and end up having to compute the whole normalized string after all.
\|
\|	Implement the standard's algorithm. This greatly speeds up `unicodedata.is_normalized` in many cases where our partial variant of quick-check had been returning MAYBE and the standard algorithm returns NO.
\|
\|	At a quick test on my desktop, the existing code takes about 4.4 ms/MB (so 4.4 ns per byte) when the partial quick-check returns MAYBE and it has to do the slow normalize-and-compare:
\|
\|	  $ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
\|	      -- 'unicodedata.is_normalized("NFD", s)'
\|	  50 loops, best of 5: 4.39 msec per loop
\|
\|	With this patch, it gets the answer instantly (58 ns) on the same 1 MB string:
\|
\|	  $ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
\|	      -- 'unicodedata.is_normalized("NFD", s)'
\|	  5000000 loops, best of 5: 58.2 nsec per loop
\|
\|	This restores a small optimization that the original version of this code had for the `unicodedata.normalize` use case.
\|
\|	With this, that case is actually faster than in master!
\|
\|	  $ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
\|	      -- 'unicodedata.normalize("NFD", s)'
\|	  500 loops, best of 5: 561 usec per loop
\|
\|	  $ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
\|	      -- 'unicodedata.normalize("NFD", s)'
\|	  500 loops, best of 5: 512 usec per loop
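The contract described in this message can be checked from Python. This is a small sketch, assuming Python 3.8 or later (where unicodedata.is_normalized exists); is_normalized_slow() is a hypothetical name for the naive normalize-and-compare version, and the sample strings are chosen only for illustration.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def is_normalized_slow(form, s):
    """Naive check: build the whole normalized string, then compare."""
    return s == unicodedata.normalize(form, s)


samples = [
    "plain ascii",
    "\uf900" * 1000,      # the string used in the NFD timing above, shortened
    "\u0338" * 1000,      # combining character used in the normalize timing
    "e\u0301",            # 'e' followed by COMBINING ACUTE ACCENT
    "\u00e9",             # precomposed LATIN SMALL LETTER E WITH ACUTE
]

for form in FORMS:
    for s in samples:
        # Both approaches must agree; is_normalized may answer without
        # building the normalized string when the quick check is decisive.
        assert unicodedata.is_normalized(form, s) == is_normalized_slow(form, s)

print("is_normalized agrees with normalize-and-compare on all samples")
```

The two functions always return the same answer; the difference the commit addresses is only how often the fast path can decide without normalizing.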
*	bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async ↵	Jeroen Demeyer	2019-05-31	1	-2/+2
\|	(GH-13464)
\|
\|	Automatically replace
\|	tp_print -> tp_vectorcall_offset
\|	tp_compare -> tp_as_async
\|	tp_reserved -> tp_as_async
*	bpo-36642: make unicodedata const (GH-12855)	Inada Naoki	2019-04-16	1	-1/+1
\|
*	closes bpo-32285: Add unicodedata.is_normalized. (GH-4806)	Max Bélanger	2018-11-04	1	-17/+98
\|
*	bpo-29456: Fix bugs in unicodedata.normalize: u1176, u11a7 and u11c3 (GH-1958)	Wonsup Yoon	2018-06-15	1	-3/+7
\|	The Hangul composition boundary checks were wrong for the second character (the range should be [0x1161, 0x1176), not [0x1161, 0x1176]) and for the third character (the range should be (0x11A7, 0x11C3), not [0x11A7, 0x11C3]).
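The boundaries in question follow from the Hangul syllable composition arithmetic in the Unicode Standard. The sketch below is a pure-Python illustration of that arithmetic, not the C code in unicodedata.c; compose_hangul() is a hypothetical helper, and the constants are the standard's SBase, LBase, VBase, TBase, LCount, VCount and TCount.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_hangul(l, v, t=None):
    """Compose jamo code points into a syllable code point, or return None."""
    if not (L_BASE <= l < L_BASE + L_COUNT):
        return None
    # Second character: half-open range [0x1161, 0x1176).
    if not (V_BASE <= v < V_BASE + V_COUNT):
        return None
    code = S_BASE + ((l - L_BASE) * V_COUNT + (v - V_BASE)) * T_COUNT
    if t is not None:
        # Third character: open range (0x11A7, 0x11C3).
        if not (T_BASE < t < T_BASE + T_COUNT):
            return None
        code += t - T_BASE
    return code


# U+1176, U+11A7 and U+11C3 lie just outside the valid ranges and must
# not take part in composition.
print(compose_hangul(0x1100, 0x1176))
print(compose_hangul(0x1100, 0x1161, 0x11A7))
print(compose_hangul(0x1100, 0x1161, 0x11C3))

# Inside the ranges, the arithmetic agrees with unicodedata.normalize.
print(chr(compose_hangul(0x1100, 0x1161)) == unicodedata.normalize("NFC", "\u1100\u1161"))
```

With inclusive upper bounds, the three code points named in the commit title would wrongly be folded into a precomposed syllable.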
*	update to Unicode 11.0.0 (closes bpo-33778) (GH-7439)	Benjamin Peterson	2018-06-07	1	-1/+1
\| \| \|	Also, standardize indentation of generated tables.
*	Fix miscellaneous typos (#4275)	luzpaz	2017-11-05	1	-1/+1
\|
*	bpo-30736: upgrade to Unicode 10.0 (#2344)	Benjamin Peterson	2017-06-23	1	-2/+3
\| \| \|	Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
*	Issue #28511: Use the "U" format instead of "O!" in PyArg_Parse*.	Serhiy Storchaka	2016-10-23	1	-5/+2
\|
*	Add an extra byte for null in case we ever get very long unicode names.	Christian Heimes	2016-09-23	1	-4/+4
\|\
\| *	Add an extra byte for null in case we ever get very long unicode names.	Christian Heimes	2016-09-23	1	-4/+4
\| \|
* \|	Unicode 9.0.0	Benjamin Peterson	2016-09-15	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	Not completely mechanical since support for East Asian Width changes—emoji codepoints became Wide—had to be added to unicodedata.
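The East Asian Width property mentioned here is exposed through unicodedata.east_asian_width(). A brief sketch, assuming a Python built against Unicode 9.0.0 or later; the sample characters are chosen for illustration.

```python
import unicodedata

samples = {
    "emoji (U+1F600)": "\U0001F600",
    "CJK ideograph (U+4E00)": "\u4e00",
    "ASCII letter": "A",
}

for label, ch in samples.items():
    # east_asian_width() returns the property value as a short string,
    # e.g. "W" for Wide or "Na" for Narrow.
    print(f"{label}: {unicodedata.east_asian_width(ch)}")
```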
* \|	Restrict name_length to NAME_MAXLEN in unicodedata_UCD_lookup()	Christian Heimes	2016-09-14	1	-1/+1
\|\ \ \| \|/
\| *	Restrict name_length to NAME_MAXLEN in unicodedata_UCD_lookup()	Christian Heimes	2016-09-14	1	-1/+1
\| \|
* \|	Issue #25923: Added the const qualifier to static constant arrays.	Serhiy Storchaka	2015-12-25	1	-2/+2
\|/
*	upgrade to Unicode 8.0.0	Benjamin Peterson	2015-06-27	1	-2/+3
\|
*	Issue #24000: Improved Argument Clinic's mapping of converters to legacy	Larry Hastings	2015-05-08	1	-2/+2
\| \| \| \|	"format units". Updated the documentation to match.
*	Issue #24001: Argument Clinic converters now use accept={type}	Larry Hastings	2015-05-04	1	-22/+22
\| \| \| \|	instead of types={'type'} to specify the types the converter accepts.
*	Issue #20181: Converted the unicodedata module to Argument Clinic.	Serhiy Storchaka	2015-04-17	1	-227/+196
\|