| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
(gh-101388)
(cherry picked from commit 9ef7e75434587fc8f167d73eee5dd9bdca62714b)
Co-authored-by: Dong-hee Na <donghee.na@python.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(GH-26766)
* Make functools types immutable
* Multibyte codec types are now immutable
* pyexpat.xmlparser is now immutable
* array.arrayiterator is now immutable
* _thread types are now immutable
* _csv types are now immutable
* _queue.SimpleQueue is now immutable
* mmap.mmap is now immutable
* unicodedata.UCD is now immutable
* sqlite3 types are now immutable
* _lsprof.Profiler is now immutable
* _overlapped.Overlapped is now immutable
* _operator types are now immutable
* winapi__overlapped.Overlapped is now immutable
* _lzma types are now immutable
* _bz2 types are now immutable
* _dbm.dbm and _gdbm.gdbm are now immutable
(cherry picked from commit 00710e6346fd2394aa020b2dfae170093effac98)
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
|
|
|
|
|
|
|
|
|
|
| |
types (GH-26376)
* bpo-42972: pyexpat
* bpo-42972: unicodedata
* bpo-42972: dbm/gdbm
(cherry picked from commit 59af59c2dfa52dcd5605185263f266a49ced934c)
Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apply Py_TPFLAGS_DISALLOW_INSTANTIATION to the following types:
* _dbm.dbm
* _gdbm.gdbm
* _multibytecodec.MultibyteCodec
* _sre..SRE_Scanner
* _thread._localdummy
* _thread.lock
* _winapi.Overlapped
* array.arrayiterator
* functools.KeyWrapper
* functools._lru_list_elem
* pyexpat.xmlparser
* re.Match
* re.Pattern
* unicodedata.UCD
* zlib.Compress
* zlib.Decompress
|
| |
|
|
|
|
|
|
|
|
|
| |
No longer use deprecated aliases to functions:
* Replace PyObject_MALLOC() with PyObject_Malloc()
* Replace PyObject_REALLOC() with PyObject_Realloc()
* Replace PyObject_FREE() with PyObject_Free()
* Replace PyObject_Del() with PyObject_Free()
* Replace PyObject_DEL() with PyObject_Free()
|
|
|
|
|
|
|
| |
Removed the unicodedata.ucnhash_CAPI attribute which was an internal
PyCapsule object. The related private _PyUnicode_Name_CAPI structure
was moved to the internal C API.
Rename unicodedata.ucnhash_CAPI as unicodedata._ucnhash_CAPI.
|
|
|
|
|
|
|
| |
Convert the unicodedata extension module to the multiphase
initialization API (PEP 489) and convert the unicodedata.UCD static
type to a heap type.
Co-Authored-By: Mohamed Koubaa <koubaa.m@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
* UCD_Check() uses PyModule_Check()
* Simplify the internal _PyUnicode_Name_CAPI structure:
* Remove size and state members
* Remove state and self parameters of getcode() and getname()
functions
* Remove global_module_state
|
|
|
|
|
|
|
|
|
|
| |
The private _PyUnicode_Name_CAPI structure of the PyCapsule API
unicodedata.ucnhash_CAPI moves to the internal C API. Moreover, the
structure gets a new state member which must be passed to the
getcode() and getname() functions.
* Move Include/ucnhash.h to Include/internal/pycore_ucnhash.h
* unicodedata module is now built with Py_BUILD_CORE_MODULE.
* unicodedata: move hashAPI variable into unicodedata_module_state.
|
|
|
|
|
|
| |
Prepare unicodedata to add a state per module: start with a global
"module" state, pass it to subfunctions which access &UCD_Type. This
change also prepares the conversion of the UCD_Type static type to a
heap type.
|
|
|
| |
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
| |
If only offsetof() is needed: include stddef.h instead.
When structmember.h is used, add a comment explaining that
PyMemberDef is used.
|
|
|
|
| |
data. (GH-19345)
|
| |
|
| |
|
| |
|
|
|
| |
Add Py_SET_TYPE() function to set the type of an object.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of the `unicodedata.is_normalized` function is to answer
the question `str == unicodedata.normalized(form, str)` more
efficiently than writing just that, by using the "quick check"
optimization described in the Unicode standard in UAX #15.
However, it turns out the code doesn't implement the full algorithm
from the standard, and as a result we often miss the optimization and
end up having to compute the whole normalized string after all.
Implement the standard's algorithm. This greatly speeds up
`unicodedata.is_normalized` in many cases where our partial variant
of quick-check had been returning MAYBE and the standard algorithm
returns NO.
At a quick test on my desktop, the existing code takes about 4.4 ms/MB
(so 4.4 ns per byte) when the partial quick-check returns MAYBE and it
has to do the slow normalize-and-compare:
$ build.base/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
50 loops, best of 5: 4.39 msec per loop
With this patch, it gets the answer instantly (58 ns) on the same 1 MB
string:
$ build.dev/python -m timeit -s 'import unicodedata; s = "\uf900"*500000' \
-- 'unicodedata.is_normalized("NFD", s)'
5000000 loops, best of 5: 58.2 nsec per loop
This restores a small optimization that the original version of this
code had for the `unicodedata.normalize` use case.
With this, that case is actually faster than in master!
$ build.base/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 561 usec per loop
$ build.dev/python -m timeit -s 'import unicodedata; s = "\u0338"*500000' \
-- 'unicodedata.normalize("NFD", s)'
500 loops, best of 5: 512 usec per loop
|
|
|
|
|
|
|
|
|
| |
(GH-13464)
Automatically replace
tp_print -> tp_vectorcall_offset
tp_compare -> tp_as_async
tp_reserved -> tp_as_async
|
| |
|
| |
|
|
|
|
|
| |
Hangul composition check boundaries are wrong for the second character
([0x1161, 0x1176) instead of [0x1161, 0x1176]) and third character ((0x11A7, 0x11C3)
instead of [0x11A7, 0x11C3]).
|
|
|
| |
Also, standardize indentation of generated tables.
|
| |
|
|
|
| |
Straightforward. While we're at it, though, strip trailing whitespace from generated tables.
|
| |
|
|\ |
|
| | |
|
| |
| |
| |
| |
| | |
Not completely mechanical since support for East Asian Width changes—emoji
codepoints became Wide—had to be added to unicodedata.
|
|\ \
| |/ |
|
| | |
|
|/ |
|
| |
|
|
|
|
| |
"format units". Updated the documentation to match.
|
|
|
|
| |
instead of types={'type'} to specify the types the converter accepts.
|
| |
|
| |
|
| |
|
|\ |
|
| | |
|
| |
| |
| |
| | |
overflows. Added few missed PyErr_NoMemory().
|
| | |
|
| |
| |
| |
| | |
parameters
|
| |
| |
| |
| |
| |
| |
| | |
The new syntax is highly human readable while still preventing false
positives. The syntax also extends Python syntax to denote "self" and
positional-only parameters, allowing inspect.Signature objects to be
totally accurate for all supported builtins in Python 3.4.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
annotate text signatures in docstrings, resulting in fewer false
positives. "self" parameters are also explicitly marked, allowing
inspect.Signature() to authoritatively detect (and skip) said parameters.
Issue #20326: Argument Clinic now generates separate checksums for the
input and output sections of the block, allowing external tools to verify
that the input has not changed (and thus the output is not out-of-date).
|
| | |
|
| |
| |
| |
| |
| |
| | |
PyMethodDescr_Type, _PyMethodWrapper_Type, and PyWrapperDescr_Type)
have been modified to provide introspection information for builtins.
Also: many additional Lib, test suite, and Argument Clinic fixes.
|