path: root/Objects
Commit message  [Author, Age, Files, Lines]
...
* | | Modify _PyBytes_DecodeEscapeRecode() to use _PyBytesAPI  [Victor Stinner, 2015-10-14, 1 file, -58/+73]
| | | * Don't overallocate by 400% when recode is needed: only overallocate on demand using _PyBytesWriter.
| | | * Use _PyLong_DigitValue to convert a hexadecimal digit to int
| | | * Create _PyBytes_DecodeEscapeRecode() subfunction
* | | Fix compiler warnings (uninitialized variables), false alarms in fact  [Victor Stinner, 2015-10-14, 1 file, -4/+2]
| | |
* | | _PyBytesWriter_Alloc(): only use 10 bytes of the small buffer in debug mode to  [Victor Stinner, 2015-10-14, 1 file, -1/+13]
| | | enhance code to detect buffer under- and overflow.
* | | Issue #25401: Remove now unused hex_digit_to_int() function  [Victor Stinner, 2015-10-14, 1 file, -16/+0]
| | |
* | | Optimize bytes.fromhex() and bytearray.fromhex()  [Victor Stinner, 2015-10-14, 2 files, -94/+77]
| | | Issue #25401: Optimize bytes.fromhex() and bytearray.fromhex(): they are now between 2x and 3.5x faster. Changes:
| | | * Use a fast path working on a char* string for ASCII strings
| | | * Use a slow path for non-ASCII strings
| | | * Replace the slow hex_digit_to_int() function with an O(1) lookup in the precomputed _PyLong_DigitValue table
| | | * Use the _PyBytesWriter API to handle the buffer
| | | * Add unit tests to check the error position in error messages
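The table-lookup idea in this commit can be sketched in Python. This is only an illustration, not the C implementation: `DIGIT_VALUE` is a stand-in for `_PyLong_DigitValue`, `digit()` and `fromhex_sketch()` are hypothetical names, and the sketch assumes that only spaces separate byte pairs.

```python
# Stand-in for CPython's _PyLong_DigitValue table: indexed by character
# code, it yields the digit value, or 37 for "not a digit".
DIGIT_VALUE = [37] * 256
for value, ch in enumerate("0123456789abcdefghijklmnopqrstuvwxyz"):
    DIGIT_VALUE[ord(ch)] = value
    DIGIT_VALUE[ord(ch.upper())] = value


def digit(text, i):
    """Return the hex digit value at text[i], or raise with the position."""
    code = ord(text[i]) if i < len(text) else 256
    # Characters outside the table can never be hex digits.
    value = DIGIT_VALUE[code] if code < 256 else 37
    if value >= 16:
        raise ValueError("non-hexadecimal number at position %d" % i)
    return value


def fromhex_sketch(text):
    """Decode a hex string using one table lookup per digit."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == " ":  # spaces between bytes are skipped
            i += 1
            continue
        out.append((digit(text, i) << 4) | digit(text, i + 1))
        i += 2
    return bytes(out)
```

For valid input the sketch agrees with the built-in: `fromhex_sketch("48 65 6c 6C 6f")` and `bytes.fromhex("48 65 6c 6C 6f")` both give `b"Hello"`.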
* | | Optimize bytearray % args  [Victor Stinner, 2015-10-14, 2 files, -35/+28]
| | | Issue #25399: Don't create temporary bytes objects: modify _PyBytes_Format() to work directly on bytearray objects.
| | | * Rename _PyBytes_Format() to _PyBytes_FormatEx() in case something outside CPython uses it
| | | * _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so bytearray_format() doesn't need to create a temporary input bytes object
| | | * Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to _PyBytesWriter, to create a bytearray buffer instead of a bytes buffer
| | | Most formatting operations are now between 2.5 and 5 times faster.
* | | Add use_bytearray attribute to _PyBytesWriter  [Victor Stinner, 2015-10-14, 1 file, -28/+65]
| | | Issue #25399: Add a new use_bytearray attribute to _PyBytesWriter to use a bytearray buffer, instead of using a bytes object.
* | | Fix long_format_binary()  [Victor Stinner, 2015-10-14, 1 file, -1/+1]
| | | Issue #25399: Fix long_format_binary(), allocate bytes for the bytes writer.
* | | Rewrite PyBytes_FromFormatV() using _PyBytesWriter API  [Victor Stinner, 2015-10-13, 1 file, -171/+165]
| | | * Add many more unit tests for PyBytes_FromFormatV()
| | | * Remove the first loop to compute the length of the output string
| | | * Use _PyBytesWriter to handle the bytes buffer, use overallocation
| | | * Clean up the code to make it simpler and easier to review
* | | Issue #25353: Optimize unicode escape and raw unicode escape encoders to use  [Victor Stinner, 2015-10-12, 1 file, -44/+69]
| | | the new _PyBytesWriter API.
* | | Fix compilation error in _PyBytesWriter_WriteBytes() on Windows  [Victor Stinner, 2015-10-12, 1 file, -1/+3]
| | |
* | | Writer APIs: use empty string singletons  [Victor Stinner, 2015-10-12, 2 files, -18/+32]
| | | Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the empty bytes/Unicode string if the string is empty.
* | | Relax _PyBytesWriter API  [Victor Stinner, 2015-10-12, 1 file, -8/+7]
| | | Don't require _PyBytesWriter pointer to be a "char *". Same change for _PyBytesWriter_WriteBytes() parameter.
| | | For example, binascii uses "unsigned char*".
* | | Issue #24164: Objects that need calling ``__new__`` with keyword arguments,  [Serhiy Storchaka, 2015-10-10, 1 file, -13/+3]
| | | can now be pickled using pickle protocols older than protocol version 4.
* | | Issue #25349: Add fast path for b'%c' % int  [Victor Stinner, 2015-10-09, 1 file, -10/+15]
| | | Also optimize the %% formatter.
* | | Issue #25349: Optimize bytes % int  [Victor Stinner, 2015-10-09, 2 files, -27/+129]
| | | Optimize bytes.__mod__(args) for integer formats: %d (%i, %u), %o, %x and %X.
| | | _PyBytesWriter is now used to format the integer directly into the writer buffer, instead of using a temporary bytes object.
| | | Formatting is between 30% and 50% faster on a microbenchmark.
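For reference, the integer formats named in this commit behave as follows at the Python level. This is a small usage illustration (it needs Python 3.5 or later, where `bytes % args` is available) and does not show the C fast path itself; the `formatted` dictionary is just a container for the example.

```python
# Each value is produced by bytes.__mod__ with an integer argument.
formatted = {
    "decimal": b"%d items" % 42,  # %i and %u behave like %d
    "octal": b"%o" % 8,
    "hex_lower": b"%x" % 255,
    "hex_upper": b"%X" % 255,
    "char": b"%c" % 65,           # %c accepts an integer in range(256)
    "percent": b"100%%" % (),     # %% produces a literal percent sign
}
```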
* | | Optimize error handlers of ASCII and Latin1 encoders when the replacement  [Victor Stinner, 2015-10-09, 2 files, -43/+47]
| | | string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual characters.
| | | Cleanup unicode_encode_ucs1():
| | | * Rename repunicode to rep
| | | * Clear rep object on error
| | | * Factorize code between bytes and unicode path
* | | Add _PyBytesWriter_WriteBytes() to factorize the code  [Victor Stinner, 2015-10-09, 3 files, -16/+28]
| | |
* | | _PyBytesWriter: simplify code to avoid "prealloc" parameters  [Victor Stinner, 2015-10-09, 3 files, -47/+47]
| | | Subtract preallocated bytes from min_size before calling _PyBytesWriter_Prepare().
* | | _PyBytesWriter: rename size attribute to min_size  [Victor Stinner, 2015-10-09, 1 file, -7/+7]
| | |
* | | Issue #25349: Optimize bytes % args using the new private _PyBytesWriter API  [Victor Stinner, 2015-10-09, 1 file, -59/+128]
| | | * Thanks to the _PyBytesWriter API, outputs smaller than 512 bytes are allocated on the stack and so avoid calling _PyBytes_Resize(). Because of that, change the default buffer size to fmtcnt instead of fmtcnt+100.
| | | * Rely on the _PyBytesWriter algorithm to overallocate the buffer instead of using custom code. For example, _PyBytesWriter uses a different overallocation factor (25% or 50%) depending on the platform to get the best performance.
| | | * Disable overallocation for the last write.
| | | * Replace C loops to fill characters with memset()
| | | * Also add many comments to _PyBytes_Format()
| | | * Remove unused FORMATBUFLEN constant
| | | * Avoid the creation of a temporary bytes object when formatting a floating point number (when no custom formatting option is used)
| | | * Also fix reference leaks on error handling
| | | * Use Py_MEMCPY() to copy bytes between two formatters (%)
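The effect of overallocation described in this commit can be modelled with a toy Python class. This is a sketch under stated assumptions, not the real `_PyBytesWriter`: the class name `WriterSketch`, its attributes, and the `resizes` counter are all hypothetical, and the 25% growth factor is just one of the two values the commit message mentions.

```python
class WriterSketch:
    """Toy model of a growing output buffer (not the real _PyBytesWriter)."""

    def __init__(self, overallocate=True, factor=4):
        self.buf = bytearray()    # allocated storage
        self.pos = 0              # number of bytes actually written
        self.overallocate = overallocate
        self.factor = factor      # 4 means "allocate 25% extra"
        self.resizes = 0          # how many times the buffer had to grow

    def prepare(self, size):
        """Make sure `size` more bytes fit, growing the buffer if needed."""
        needed = self.pos + size
        if needed <= len(self.buf):
            return
        if self.overallocate:
            needed += needed // self.factor
        self.buf.extend(bytes(needed - len(self.buf)))
        self.resizes += 1

    def write(self, data):
        self.prepare(len(data))
        self.buf[self.pos:self.pos + len(data)] = data
        self.pos += len(data)

    def finish(self):
        """Return only the bytes that were written, dropping spare space."""
        return bytes(self.buf[:self.pos])
```

Writing many small chunks with `overallocate=True` triggers far fewer resizes than with `overallocate=False`. Setting `overallocate = False` just before the final write mirrors the "disable overallocation for the last write" point, since no spare space is useful after that.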
* | | Issue #25318: cleanup code _PyBytesWriter  [Victor Stinner, 2015-10-09, 1 file, -17/+17]
| | | Rename "stack buffer" to "small buffer".
| | | Also add an assertion in _PyBytesWriter_GetPos().
* | | Issue #25318: Fix backslashreplace()  [Victor Stinner, 2015-10-09, 1 file, -1/+1]
| | | Fix code to estimate the needed space.
* | | Issue #25318: Avoid sprintf() in backslashreplace()  [Victor Stinner, 2015-10-09, 1 file, -7/+18]
| | | Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors().
| | | Also add unit tests for non-BMP characters.
* | | Issue #25318: Fix compilation error  [Victor Stinner, 2015-10-09, 1 file, -1/+1]
| | | Replace "#if Py_DEBUG" with "#ifdef Py_DEBUG".
* | | Issue #25318: Move _PyBytesWriter to bytesobject.c  [Victor Stinner, 2015-10-08, 2 files, -210/+193]
| | | Also declare the private API in bytesobject.h.
* | | Optimize backslashreplace error handler  [Victor Stinner, 2015-10-08, 2 files, -51/+160]
| | | Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in the UTF-8 encoder. Also optimize the backslashreplace error handler for ASCII and Latin1 encoders.
| | | Use the new _PyBytesWriter API to optimize these error handlers for the encoders. This avoids creating an exception and calling the slow implementation of the error handler.
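The error handlers touched by this commit are visible from Python through `str.encode()`. The following is a usage illustration of their observable behaviour only (the variable names are arbitrary), not of the optimized C code paths.

```python
text = "caf\xe9 \u20ac"       # "café €": two characters that are not ASCII
lone_surrogate = "a\udc80b"   # a lone surrogate cannot be encoded to UTF-8

# ASCII encoder: every unencodable character is replaced.
ascii_backslash = text.encode("ascii", "backslashreplace")
ascii_xml = text.encode("ascii", "xmlcharrefreplace")

# Latin1 encoder: U+00E9 is encodable, only U+20AC is replaced.
latin1_backslash = text.encode("latin-1", "backslashreplace")

# UTF-8 encoder: only the lone surrogate needs the error handler.
utf8_backslash = lone_surrogate.encode("utf-8", "backslashreplace")
utf8_xml = lone_surrogate.encode("utf-8", "xmlcharrefreplace")
```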
* | | Issue #25318: Add _PyBytesWriter API  [Victor Stinner, 2015-10-08, 2 files, -132/+268]
| | | Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation.
| | | Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers.
| | | unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.
* | | Merge typo fixes from 3.5  [Martin Panter, 2015-10-07, 3 files, -3/+3]
|\ \ \
| |/ /
| * | More typos in 3.5 documentation and comments  [Martin Panter, 2015-10-07, 2 files, -2/+2]
| | |
| * | Merge typo fixes from 3.4 into 3.5  [Martin Panter, 2015-10-07, 2 files, -2/+2]
| |\ \
| | |/
| | * Various minor typos in documentation and comments  [Martin Panter, 2015-10-07, 2 files, -2/+2]
| | |
* | | merge 3.5 (closes #24806)  [Benjamin Peterson, 2015-10-07, 1 file, -6/+6]
|\ \ \
| |/ /
| * | merge 3.4 (#24806)  [Benjamin Peterson, 2015-10-07, 1 file, -6/+6]
| |\ \
| | |/
| | * prevent unacceptable bases from becoming bases through multiple inheritance (#24806)  [Benjamin Peterson, 2015-10-07, 1 file, -6/+6]
* | | Issue #25301: Fix compatibility with ISO C90  [Victor Stinner, 2015-10-05, 1 file, -1/+5]
| | |
* | | Issue #25301: The UTF-8 decoder is now up to 15 times as fast for error  [Victor Stinner, 2015-10-05, 1 file, -9/+39]
| | | handlers: ``ignore``, ``replace`` and ``surrogateescape``.
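As a reminder of what the three decoder error handlers named here do, this usage illustration shows their observable behaviour on one invalid byte. It does not show the optimized decoder itself, and the variable names are arbitrary.

```python
data = b"ab\xffcd"   # 0xFF can never appear in valid UTF-8

ignored = data.decode("utf-8", "ignore")            # drops the bad byte
replaced = data.decode("utf-8", "replace")          # inserts U+FFFD
escaped = data.decode("utf-8", "surrogateescape")   # maps 0xFF to U+DCFF

# surrogateescape is reversible: encoding restores the original bytes.
round_trip = escaped.encode("utf-8", "surrogateescape")
```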
* | | Fix _PyUnicodeWriter_PrepareKind()  [Victor Stinner, 2015-10-02, 1 file, -7/+18]
| | | Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that _PyUnicodeWriter_PrepareKind() correctly handles a read-only buffer: copy the buffer.
* | | Issue #24848: Fixed bugs in UTF-7 decoding of misformed data:  [Serhiy Storchaka, 2015-10-02, 1 file, -9/+12]
|\ \ \
| |/ /
| | | 1. Non-ASCII bytes were accepted after shift sequence.
| | | 2. A low surrogate could be emitted in case of error in high surrogate.
| | | 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
| * | Issue #24848: Fixed bugs in UTF-7 decoding of misformed data:  [Serhiy Storchaka, 2015-10-02, 1 file, -9/+12]
| |\ \
| | |/
| | | 1. Non-ASCII bytes were accepted after shift sequence.
| | | 2. A low surrogate could be emitted in case of error in high surrogate.
| | | 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
| | * Issue #24848: Fixed bugs in UTF-7 decoding of misformed data:  [Serhiy Storchaka, 2015-10-02, 1 file, -9/+12]
| | | 1. Non-ASCII bytes were accepted after shift sequence.
| | | 2. A low surrogate could be emitted in case of error in high surrogate.
* | | Issue #24483: C implementation of functools.lru_cache() now calculates key's  [Serhiy Storchaka, 2015-10-02, 1 file, -0/+37]
|\ \ \
| |/ /
| | | hash only once.
| * | Issue #24483: C implementation of functools.lru_cache() now calculates key's  [Serhiy Storchaka, 2015-10-02, 1 file, -0/+37]
| | | hash only once.
* | | Make _PyUnicode_TranslateCharmap() symbol private  [Victor Stinner, 2015-10-01, 1 file, -1/+1]
| | | unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().
* | | Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error  [Victor Stinner, 2015-10-01, 2 files, -53/+101]
| | | handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``.
| | | Patch co-written with Serhiy Storchaka.
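The four encoder error handlers listed in this commit differ in what they emit for a character that UTF-8 cannot represent, such as a lone surrogate. This is a usage illustration of the observable behaviour, with arbitrary variable names; it says nothing about how the speedup was achieved.

```python
text = "x\udcffy"   # U+DCFF is a lone low surrogate

ignored = text.encode("utf-8", "ignore")            # surrogate dropped
replaced = text.encode("utf-8", "replace")          # surrogate becomes "?"
escaped = text.encode("utf-8", "surrogateescape")   # U+DCFF becomes byte 0xFF
passed = text.encode("utf-8", "surrogatepass")      # encoded as three bytes
```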
* | | (Merge 3.5) Issue #25182: Fix compilation on Windows  [Victor Stinner, 2015-09-30, 1 file, -3/+6]
|\ \ \
| |/ /
| * | (Merge 3.4) Issue #25182: Fix compilation on Windows  [Victor Stinner, 2015-09-30, 1 file, -3/+6]
| |\ \
| | |/
| | * Issue #25182: Fix compilation on Windows  [Victor Stinner, 2015-09-30, 1 file, -3/+6]
| | | Also restore the errno value before calling PyErr_SetFromErrno().
* | | Issue #25182: The stdprinter (used as sys.stderr before the io module is  [Serhiy Storchaka, 2015-09-30, 1 file, -4/+21]
|\ \ \
| |/ /
| | | imported at startup) now uses the backslashreplace error handler.
| * | Issue #25182: The stdprinter (used as sys.stderr before the io module is  [Serhiy Storchaka, 2015-09-30, 1 file, -4/+21]
| |\ \
| | |/
| | | imported at startup) now uses the backslashreplace error handler.