| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
(GH-2302) (#2463)
Based on patch by Victor Stinner.
Add private C API function _PyUnicode_AsUnicode() which is similar to
PyUnicode_AsUnicode(), but checks for null characters.
(cherry picked from commit f7eae0adfcd4c50034281b2c69f461b43b68db84)
|
|
|
|
|
|
|
|
|
|
| |
(GH-2285) (GH-2443) (#2448)
And use it instead of PyUnicode_AsWideCharString() if appropriate.
_PyUnicode_AsWideCharString(unicode) is like PyUnicode_AsWideCharString(unicode, NULL), but
raises a ValueError if the wchar_t* string contains null characters.
(cherry picked from commit e613e6add5f07ff6aad5802924596b631b707d2a)
(cherry picked from commit 0edffa3073b551ffeca34952529e7b292f1bd350)
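The effect of this kind of check can be observed from Python code. The following is a minimal sketch, not the C implementation: it assumes a Python 3 interpreter in which OS-level path functions reject embedded null characters with ValueError instead of truncating the string. The helper name rejects_embedded_null is hypothetical.

```python
import os


def rejects_embedded_null(path):
    """Return True if os.stat() refuses `path` with ValueError.

    A missing file raises OSError instead, which is reported as False
    because the path itself was accepted by the conversion layer.
    """
    try:
        os.stat(path)
    except ValueError:
        return True
    except OSError:
        return False
    return False


# A path containing "\0" is rejected before it ever reaches the OS.
saw_error = rejects_embedded_null("spam\0eggs.txt")
```

Without such a check, a C function receiving the converted string would stop at the first null character and operate on "spam" only.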
|
|
|
|
|
|
|
|
|
|
| |
Make a non-Py_DEBUG, asserts-enabled build of CPython possible. This means
making sure helper functions are defined when NDEBUG is not defined, not
just when Py_DEBUG is defined.
Also fix a division-by-zero in obmalloc.c that went unnoticed because in
Py_DEBUG mode, elsize is never zero.
(cherry picked from commit a00c3fd12d421e41b769debd7df717d17b0deed5 and 06bb4873d6a9ac303701d08a851d6cd9a51e02a3)
|
|
|
|
| |
Added the documentation for PyUnicode_Translate().
(cherry picked from commit c85a26628ceb9624c96c3064e8b99033c026d8a3)
|
| |
|
| |
|
|
|
|
|
|
| |
The latter function is more readable, faster and doesn't raise exceptions.
Based on patch by Xiang Zhang.
|
|
|
|
|
|
| |
_PyUnicode_EqualToASCIIString.
The latter function is more readable, faster and doesn't raise exceptions.
|
|
|
|
| |
Original patch by Xiang Zhang.
|
|
|
|
|
| |
Also update the classmethod and staticmethod doc strings and comments to
match the RST documentation.
|
| |
|
| |
|
| |
|
|
|
|
| |
This affects documentation, code comments, and debugging messages.
|
| |
|
|\ |
|
| | |
|
|\ \
| |/
| |
| |
| |
| | |
on Windows instead of silently truncating them.
Removed no longer used _PyUnicode_HasNULChars().
|
|\ \
| |/ |
|
| | |
|
|/ |
|
|
|
|
| |
Patch by Karan Goel.
|
| |
|
|
|
|
| |
Schwab.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
_PyUnicode_CompareWithId() is faster than PyUnicode_CompareWithASCIIString()
when both strings are equal and interned.
Also add the _PyId_builtins identifier for the common string "builtins".
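The reason interning helps can be sketched at the Python level: two equal interned strings are the same object, so an identity test settles the comparison without looking at any characters. This is only an illustration of the idea, not the C code; the function name equal_with_identity_shortcut is hypothetical.

```python
import sys


def equal_with_identity_shortcut(a, b):
    """Compare two strings, taking the pointer-equality fast path first."""
    # Equal interned strings are one object, so this test is enough for them.
    if a is b:
        return True
    # Fall back to a character-by-character comparison.
    return a == b


# Build the string at run time, then intern it.
name = sys.intern("".join(["built", "ins"]))
same_object = name is sys.intern("builtins")
```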
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Also add a min_char attribute to the _PyUnicodeWriter structure (currently unused)
* _PyUnicodeWriter_Init() no longer takes any argument other than the writer itself:
min_length and overallocate must be set explicitly
* In error handlers, only enable overallocation if the replacement string
is longer than 1 character
* CJK decoders don't use overallocation anymore
* Set min_length, instead of preallocating memory using
_PyUnicodeWriter_Prepare(), in many decoders
* _PyUnicode_DecodeUnicodeInternal() checks for integer overflow
|
|
|
|
|
|
| |
the legacy Py_UNICODE API.
Also add a new _PyUnicodeWriter_WriteChar() function.
|
|
|
|
|
|
|
|
|
| |
Write a function to enable more optimizations:
* If the substring is the whole string and overallocation is disabled, just
keep a reference to the string, don't copy characters
* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
possible
|
|
|
|
|
|
|
| |
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announce an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
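The ASCII/surrogateescape combination can be tried directly from Python. The sketch below uses an arbitrary sample byte string (an assumption for illustration) to show that a non-ASCII byte is mapped to a lone surrogate on decoding and restored unchanged on encoding, which is what keeps the result consistent with os.fsencode() and os.fsdecode().

```python
# A command-line argument as raw bytes; 0xE9 is "é" in ISO-8859-1.
raw = b"caf\xe9"

# Bytes >= 0x80 cannot be decoded as ASCII; surrogateescape maps each one
# to a lone surrogate in the range U+DC80..U+DCFF instead of failing.
decoded = raw.decode("ascii", "surrogateescape")

# Encoding with the same error handler restores the original bytes.
roundtrip = decoded.encode("ascii", "surrogateescape")
```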
|
|
|
|
| |
Patch written by Serhiy Storchaka.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Simplify the code: replace 4 steps with a single step using the
_PyUnicodeWriter API. PyUnicode_Format() has the same design. This avoids
storing intermediate results, which required allocating an array of pointers on
the heap.
* Use the _PyUnicodeWriter API for speed (and for its convenient interface):
overallocate the buffer to reduce the number of realloc() calls
* Implement "width" and "precision" in Python, don't rely on sprintf(). This
avoids the need for a temporary buffer allocated on the heap: only a small
buffer allocated on the stack is used.
* Add _PyUnicodeWriter_WriteCstr() function
* Split PyUnicode_FromFormatV() into two functions: add
unicode_fromformat_arg().
* Inline parse_format_flags(): the format of an argument is now parsed only
once, so a subfunction is no longer needed.
* Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
search for the next "%" and copy the substring in one chunk, instead of copying
character by character.
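The chunk-copy idea in the last item can be sketched in a few lines of Python. This is a simplified illustration, not PyUnicode_FromFormatV() itself: the function name format_chunks is hypothetical, and only the %s, %d and %% specifiers are handled, without width or precision.

```python
def format_chunks(fmt, *args):
    """Tiny formatter that copies literal text between "%" markers in one piece."""
    out = []
    arg_iter = iter(args)
    i = 0
    n = len(fmt)
    while i < n:
        # Find the next "%" instead of inspecting every character.
        j = fmt.find("%", i)
        if j == -1:
            out.append(fmt[i:])      # no more arguments: copy the tail at once
            break
        if j > i:
            out.append(fmt[i:j])     # copy the literal run in one chunk
        if j + 1 >= n:
            raise ValueError("incomplete format")
        spec = fmt[j + 1]
        if spec == "%":
            out.append("%")
        elif spec == "s":
            out.append(str(next(arg_iter)))
        elif spec == "d":
            out.append(str(int(next(arg_iter))))
        else:
            raise ValueError("unsupported format character %r" % spec)
        i = j + 2
    return "".join(out)


result = format_chunks("x=%d, name=%s, 100%%", 3, "abc")
```

Each format argument is parsed exactly once, in the same pass that emits the output, which mirrors the single-step design described in the list.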
|
|\ |
|
| | |
|
|/
|
|
| |
It was already implemented in PyUnicode_RichCompare()
|
|
|
|
| |
Patch by Serhiy Storchaka.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
and str.format(args)
* Formatting of string, int, float and complex values uses the _PyUnicodeWriter
API. This avoids a temporary buffer in most cases.
* Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
keep a reference to the string if the output is only composed of one string
* Disable overallocation when formatting the last argument of str%args and
str.format(args)
* Overallocation allocates at least 100 characters: add min_length attribute
to the _PyUnicodeWriter structure
* Add new private functions: _PyUnicode_FastCopyCharacters(),
_PyUnicode_FastFill() and _PyUnicode_FromASCII()
The speedup is around 20% on average.
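The "keep a reference" optimization mentioned for _PyUnicodeWriter_WriteStr() can be modeled in Python. The class below is a hypothetical sketch (SketchWriter and its methods are not part of any CPython API): when the whole output consists of one string, that string object is returned as-is and nothing is copied.

```python
class SketchWriter:
    """Toy model of a writer that avoids copying a lone input string."""

    def __init__(self):
        self._single = None   # the only string written so far, if any
        self._parts = []      # accumulated pieces once there are several

    def write_str(self, s):
        if self._single is None and not self._parts:
            # First write: just remember the reference.
            self._single = s
            return
        if self._single is not None:
            # Second write: fall back to real accumulation.
            self._parts.append(self._single)
            self._single = None
        self._parts.append(s)

    def finish(self):
        if self._single is not None:
            return self._single          # same object, no copy
        return "".join(self._parts)


text = "".join(["hello", " ", "world"])
writer = SketchWriter()
writer.write_str(text)
single_result = writer.finish()

writer2 = SketchWriter()
writer2.write_str("a")
writer2.write_str("b")
joined_result = writer2.finish()
```

In this model `single_result` is the very object that was written, while `joined_result` is a newly built string.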
|
| |
|
|
|
|
|
| |
Add checks in PyUnicode_WriteChar() and convert PyUnicode_New() assertion to a
test raising a Python exception.
|
|
|
|
|
|
|
|
|
|
|
| |
* Decode thousands separator and decimal point using PyUnicode_DecodeLocale()
(from the locale encoding), instead of decoding them implicitly from latin1
* Remove _PyUnicode_InsertThousandsGroupingLocale(), it was not used
* Change _PyUnicode_InsertThousandsGrouping() API to return the maximum
character if unicode is NULL
* Replace MIN/MAX macros with Py_MIN/Py_MAX
* stringlib/undef.h undefines STRINGLIB_IS_UNICODE
* stringlib/localeutil.h only supports Unicode
|
| |
|
|\
| |
| |
| |
| |
| | |
in the file name.
Patch by Hynek Schlawack.
|
| |
| |
| |
| |
| |
| | |
in the file name.
Patch by Hynek Schlawack.
|
| |
| |
| |
| |
| | |
I had to move the static identifier code from unicodeobject.h to object.h in
order for this to work.
|
| | |
|