summaryrefslogtreecommitdiffstats
path: root/Objects/bytesobject.c
Commit message (Collapse)AuthorAgeFilesLines
* gh-105235: Prevent reading outside buffer during mmap.find() (#105252)Dennis Sweeney2023-07-131-2/+19
| | | | | | | * Add a special case for s[-m:] == p in _PyBytes_Find * Add tests for _PyBytes_Find * Make sure that start <= end in mmap.find
* gh-104922: remove PY_SSIZE_T_CLEAN (#106315)Inada Naoki2023-07-021-2/+0
|
* fix typos (#106247)Inada Naoki2023-06-301-2/+2
| | | Most typos are in comments, but two typos are in docstring.
* gh-92536: Remove PyUnicode_READY() calls (#105210)Victor Stinner2023-06-011-2/+0
| | | | Since Python 3.12, PyUnicode_READY() does nothing and always returns 0.
* gh-104018: remove unused format "z" handling in string formatfloat() (#104107)John Belmonte2023-05-071-3/+0
| | | This is a cleanup overlooked in PR #104033.
* gh-104018: disallow "z" format specifier in %-format of byte strings (GH-104033)John Belmonte2023-05-011-1/+0
| | | | | | | | | | | PEP-0682 specified that %-formatting would not support the "z" specifier, but it was unintentionally allowed for bytes. This PR makes use of the "z" flag an error for %-formatting in a bytestring. Issue: #104018 --------- Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* gh-94673: Ensure Builtin Static Types are Readied Properly (gh-103940)Eric Snow2023-04-271-19/+0
| | | There were cases where we do unnecessary work for builtin static types. This also simplifies some work necessary for a per-interpreter GIL.
* gh-102304: Move the Total Refcount to PyInterpreterState (gh-102545)Eric Snow2023-03-211-1/+1
| | | | | Moving it valuable with a per-interpreter GIL. However, it is also useful without one, since it allows us to identify refleaks within a single interpreter or where references are escaping an interpreter. This becomes more important as we move the obmalloc state to PyInterpreterState. https://github.com/python/cpython/issues/102304
* gh-102304: Consolidate Direct Usage of _Py_RefTotal (gh-102514)Eric Snow2023-03-081-5/+4
| | | | | This simplifies further changes to _Py_RefTotal (e.g. make it atomic or move it to PyInterpreterState). https://github.com/python/cpython/issues/102304
* gh-101765: Fix SystemError / segmentation fault in iter `__reduce__` when ↵Ionite2023-02-241-3/+8
| | | | internal access of `builtins.__dict__` exhausts the iterator (#101769)
* gh-101056: Fix memory leak in `formatfloat()` in `bytesobject.c` (#101057)Nikita Sobolev2023-01-161-1/+3
|
* bpo-15999: Accept arbitrary values for boolean parameters. (#15609)Serhiy Storchaka2022-12-031-2/+2
| | | builtins and extension module functions and methods that expect boolean values for parameters now accept any Python object rather than just a bool or int type. This is more consistent with how native Python code itself behaves.
* gh-99537: Use Py_SETREF() function in C code (#99657)Victor Stinner2022-11-221-3/+1
| | | | | | | | | | | | | | | Fix potential race condition in code patterns: * Replace "Py_DECREF(var); var = new;" with "Py_SETREF(var, new);" * Replace "Py_XDECREF(var); var = new;" with "Py_XSETREF(var, new);" * Replace "Py_CLEAR(var); var = new;" with "Py_XSETREF(var, new);" Other changes: * Replace "old = var; var = new; Py_DECREF(var)" with "Py_SETREF(var, new);" * Replace "old = var; var = new; Py_XDECREF(var)" with "Py_XSETREF(var, new);" * And remove the "old" variable.
* gh-99300: Use Py_NewRef() in Objects/ directory (#99332)Victor Stinner2022-11-101-34/+17
| | | | Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in C files of the Objects/ directory.
* GH-93207: Remove HAVE_STDARG_PROTOTYPES configure check for stdarg.h (#93215)Kumar Aditya2022-05-271-4/+0
|
* gh-89653: Use int type for Unicode kind (#92704)Victor Stinner2022-05-131-1/+1
| | | | Use the same type that PyUnicode_FromKindAndData() kind parameter type (public C API): int.
* Use static inline function Py_EnterRecursiveCall() (#91988)Victor Stinner2022-05-041-1/+1
| | | | | | | | | | | | | | | | Currently, calling Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() may use a function call or a static inline function call, depending if the internal pycore_ceval.h header file is included or not. Use a different name for the static inline function to ensure that the static inline function is always used in Python internals for best performance. Similar approach than PyThreadState_GET() (function call) and _PyThreadState_GET() (static inline function). * Rename _Py_EnterRecursiveCall() to _Py_EnterRecursiveCallTstate() * Rename _Py_LeaveRecursiveCall() to _Py_LeaveRecursiveCallTstate() * pycore_ceval.h: Rename Py_EnterRecursiveCall() to _Py_EnterRecursiveCall() and Py_LeaveRecursiveCall() and _Py_LeaveRecursiveCall()
* gh-81548: Deprecate octal escape sequences with value larger than 0o377 ↵Serhiy Storchaka2022-04-301-5/+24
| | | | (GH-91668)
* gh-91020: Add `PyBytes_Type.tp_alloc` for subclass (GH-91686)Inada Naoki2022-04-201-1/+20
|
* bpo-45995: add "z" format specifer to coerce negative 0 to zero (GH-30049)John Belmonte2022-04-111-2/+9
| | | | | | | | Add "z" format specifier to coerce negative 0 to zero. See https://github.com/python/cpython/issues/90153 (originally https://bugs.python.org/issue45995) for discussion. This covers `str.format()` and f-strings. Old-style string interpolation is not supported. Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
* bpo-47070: Add _PyBytes_Repeat() (GH-31999)Pieter Eendebak2022-03-281-17/+29
| | | Use it where appropriate: the repeat functions of `array.array`, `bytes`, `bytearray`, and `str`.
* bpo-47116: use _PyLong_FromUnsignedChar instead of PyLong_FromLong (GH-32110)Pieter Eendebak2022-03-261-2/+2
|
* bpo-47012: speed up iteration of bytes and bytearray (GH-31867)Kumar Aditya2022-03-231-1/+1
|
* bpo-46864: Deprecate PyBytesObject.ob_shash. (GH-31598)Inada Naoki2022-03-061-0/+18
|
* bpo-46848: Move _PyBytes_Find() to internal C API (GH-31642)Victor Stinner2022-03-021-0/+1
| | | | | | Move _PyBytes_Find() and _PyBytes_ReverseFind() functions to the internal C API. bytesobject.c now includes pycore_bytesobject.h.
* bpo-46848: Use stringlib/fastsearch in mmap (GH-31625)Dennis Sweeney2022-03-021-0/+18
| | | Speed up mmap.find(). Add _PyBytes_Find() and _PyBytes_ReverseFind().
* bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized ↵Eric Snow2022-02-081-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | global objects. (gh-30928) We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules. The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings). https://bugs.python.org/issue46541#msg411799 explains the rationale for this change. The core of the change is in: * (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros * Include/internal/pycore_runtime_init.h - added the static initializers for the global strings * Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState * Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config. The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *. The following are not changed (yet): * stop using _Py_IDENTIFIER() in the stdlib modules * (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API * (maybe) intern the strings during runtime init https://bugs.python.org/issue46541
* bpo-46670: Define all macros for stringlib (GH-31176)Victor Stinner2022-02-071-0/+1
| | | | | bytesobject.c, bytearrayobject.c and unicodeobject.c now define all macros used by stringlib, to avoid using undefined macros. Fix "gcc -Wundef" warnings.
* bpo-46417: Add missing types of _PyTypes_InitTypes() (GH-30749)Victor Stinner2022-01-211-1/+1
| | | | | | | | | Add types removed by mistake by the commit adding _PyTypes_FiniTypes(). Move also PyBool_Type at the end, since it depends on PyLong_Type. PyBytes_Type and PyUnicode_Type no longer depend explicitly on PyBaseObject_Type: it's the default of PyType_Ready().
* bpo-45953: Statically allocate and initialize global bytes objects. (gh-30096)Eric Snow2022-01-111-78/+14
| | | | | The empty bytes object (b'') and the 256 one-character bytes objects were allocated at runtime init. Now we statically allocate and initialize them. https://bugs.python.org/issue45953
* bpo-46008: Make runtime-global object/type lifecycle functions and state ↵Eric Snow2021-12-091-1/+21
| | | | | | | | | | | | consistent. (gh-29998) This change is strictly renames and moving code around. It helps in the following ways: * ensures type-related init functions focus strictly on one of the three aspects (state, objects, types) * passes in PyInterpreterState * to all those functions, simplifying work on moving types/objects/state to the interpreter * consistent naming conventions help make what's going on more clear * keeping API related to a type in the corresponding header file makes it more obvious where to look for it https://bugs.python.org/issue46008
* bpo-35134: Add Include/cpython/longobject.h (GH-29044)Victor Stinner2021-10-191-0/+1
| | | | | | | | | | Move Include/longobject.h non-limited API to a new Include/cpython/longobject.h header file. Move the following definitions to the internal C API: * _PyLong_DigitValue * _PyLong_FormatAdvancedWriter() * _PyLong_FormatWriter()
* bpo-45434: Remove pystrhex.h header file (GH-28923)Victor Stinner2021-10-131-1/+1
| | | | | | | | | | | | | | | Move Include/pystrhex.h to Include/internal/pycore_strhex.h. The header file only contains private functions. The following C extensions are now built with Py_BUILD_CORE_MODULE macro defined to get access to the internal C API: * _blake2 * _hashopenssl * _md5 * _sha1 * _sha3 * _ssl * binascii
* bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895)Victor Stinner2021-10-121-0/+1
| | | | | | | * Move _PyObject_CallNoArgs() to pycore_call.h (internal C API). * _ssl, _sqlite and _testcapi extensions now call the public PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs(). * _lsprof extension is now built with Py_BUILD_CORE_MODULE macro defined to get access to internal _PyObject_CallNoArgs().
* bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891)Victor Stinner2021-10-111-2/+2
| | | | | Fix typo in the private _PyObject_CallNoArg() function name: rename it to _PyObject_CallNoArgs() to be consistent with the public function PyObject_CallNoArgs().
* Fix bytes.__bytes__ to not truncate at a zero byte (GH-27902)Mark Dickinson2021-08-231-1/+1
|
* bpo-24234: Implement bytes.__bytes__ (GH-27901)Dong-hee Na2021-08-231-0/+20
|
* bpo-42128: Structural Pattern Matching (PEP 634) (GH-22917)Brandt Bucher2021-02-261-1/+2
| | | | | Co-authored-by: Guido van Rossum <guido@python.org> Co-authored-by: Talin <viridia@gmail.com> Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
* bpo-43268: Pass interp rather than tstate to internal functions (GH-24580)Victor Stinner2021-02-191-4/+4
| | | | | | | | | | | | | | | Pass the current interpreter (interp) rather than the current Python thread state (tstate) to internal functions which only use the interpreter. Modified functions: * _PyXXX_Fini() and _PyXXX_ClearFreeList() functions * _PyEval_SignalAsyncExc(), make_pending_calls() * _PySys_GetObject(), sys_set_object(), sys_set_object_id(), sys_set_object_str() * should_audit(), set_flags_from_config(), make_flags() * _PyAtExit_Call() * init_stdio_encoding() * etc.
* bpo-42431: Fix outdated bytes comments (GH-23458)Serhiy Storchaka2020-12-031-21/+9
| | | | Also move definitions of internal macros F_LJUST etc to private header.
* bpo-42519: Replace PyObject_MALLOC() with PyObject_Malloc() (GH-23587)Victor Stinner2020-12-011-4/+4
| | | | | | | | | No longer use deprecated aliases to functions: * Replace PyObject_MALLOC() with PyObject_Malloc() * Replace PyObject_REALLOC() with PyObject_Realloc() * Replace PyObject_FREE() with PyObject_Free() * Replace PyObject_Del() with PyObject_Free() * Replace PyObject_DEL() with PyObject_Free()
* bpo-42435: Speed up comparison of bytes and bytearray object (GH--23461)Serhiy Storchaka2020-11-221-21/+4
| | | | | * Speed up comparison of bytes objects with non-bytes objects when option -b is specified. * Speed up comparison of bytarray objects with non-buffer object.
* bpo-41974: Remove complex.__float__, complex.__floordiv__, etc (GH-22593)Serhiy Storchaka2020-10-091-1/+1
| | | | | | Remove complex special methods __int__, __float__, __floordiv__, __mod__, __divmod__, __rfloordiv__, __rmod__ and __rdivmod__ which always raised a TypeError.
* A (very) slight speed improvement for iterating over bytes (#21705)Guido van Rossum2020-08-031-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | My mentee @xvxvxvxvxv noticed that iterating over array.array is slightly faster than iterating over bytes. Looking at the source I observed that arrayiter_next() calls `getitem(ao, it->index++)` wheras striter_next() uses the idiom (paraphrased) item = PyLong_FromLong(seq->ob_sval[it->it_index]); if (item != NULL) ++it->it_next; return item; I'm not 100% sure but I think that the second version has fewer opportunity for the CPU to overlap the `index++` operation with the rest of the code (which in both cases involves a call). So here I am optimistically incrementing the index -- if the PyLong_FromLong() call fails, this will leave the iterator pointing at the next byte, but honestly I doubt that anyone would seriously consider resuming use of the iterator after that kind of failure (it would have to be a MemoryError). And the author of arrayiter_next() made the same consideration (or never ever gave it a thought :-). With this, a loop like for _ in b: pass is now slightly *faster* than the same thing over an equivalent array, rather than slightly *slower* (in both cases a few percent).
* bpo-41334: Convert constructors of str, bytes and bytearray to Argument ↵Serhiy Storchaka2020-07-201-46/+40
| | | | Clinic (GH-21535)
* bpo-37999: Simplify the conversion code for %c, %d, %x, etc. (GH-20437)Serhiy Storchaka2020-06-291-22/+9
| | | | Since PyLong_AsLong() no longer use __int__, explicit call of PyNumber_Index() before it is no longer needed.
* bpo-40521: Optimize PyBytes_FromStringAndSize(str, 0) (GH-21142)Victor Stinner2020-06-251-27/+64
| | | | | | | | | | | | | | | | Always create the empty bytes string singleton. Optimize PyBytes_FromStringAndSize(str, 0): it no longer has to check if the empty string singleton was created or not, it is always available. Add functions: * _PyBytes_Init() * bytes_get_empty(), bytes_new_empty() * bytes_create_empty_string_singleton() * unicode_create_empty_string_singleton() _Py_unicode_state: rename empty structure member to empty_string.
* bpo-40521: Make bytes singletons per interpreter (GH-21074)Victor Stinner2020-06-231-27/+55
| | | | | | Each interpreter now has its own empty bytes string and single byte character singletons. Replace STRINGLIB_EMPTY macro with STRINGLIB_GET_EMPTY() macro.
* bpo-40989: PyObject_INIT() becomes an alias to PyObject_Init() (GH-20901)Victor Stinner2020-06-151-6/+9
| | | | | | | | | | | | | | The PyObject_INIT() and PyObject_INIT_VAR() macros become aliases to, respectively, PyObject_Init() and PyObject_InitVar() functions. Rename _PyObject_INIT() and _PyObject_INIT_VAR() static inline functions to, respectively, _PyObject_Init() and _PyObject_InitVar(), and move them to pycore_object.h. Remove their return value: their return type becomes void. The _datetime module is now built with the Py_BUILD_CORE_MODULE macro defined. Remove an outdated comment on _Py_tracemalloc_config.
* bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781)Victor Stinner2020-06-101-14/+16
| | | | | | | The PEP 353, written in 2005, introduced PY_FORMAT_SIZE_T. Python no longer supports macOS 10.4 and Visual Studio 2010, but requires more recent macOS and Visual Studio versions. In 2020 with Python 3.10, it is now safe to use directly "%zu" to format size_t and "%zi" to format Py_ssize_t.