path: root/.github/CODEOWNERS
author: Ruben Vorderman <r.h.p.vorderman@lumc.nl> 2022-10-17 02:10:58 (GMT)
committer: GitHub <noreply@github.com> 2022-10-17 02:10:58 (GMT)
commit: eae7dad40255bad42e4abce53ff8143dcbc66af5
tree: 7cea56066a6db7c451712f8375034c2d8b8914f4 /.github/CODEOWNERS
parent: bb38b39b339191c5fc001c8fbfbc3037c13bc7bb
gh-95534: Improve gzip reading speed by 10% (#97664)
Change summary:

+ There is now a `gzip.READ_BUFFER_SIZE` constant that is 128 KB. Other programs that read in 128 KB chunks include pigz and cat, so this seems to be best practice among good programs. It is also faster than reading in 8 KB chunks.
+ A `zlib._ZlibDecompressor` was added. This is the `_bz2.BZ2Decompressor` ported to zlib. Since the `zlib.Decompress` object is better for in-memory decompression, the `_ZlibDecompressor` is hidden. It only makes sense for file decompression, and that is already implemented in the gzip library, so there is no need to bother users with it.
+ The `_ZlibDecompressor` uses the older CPython `arrange_output_buffer` functions, as those are faster and more appropriate for this use case.
+ `GzipFile.read` has been optimized. There is no longer an `unconsumed_tail` member to write back to padded file. This is instead handled by the `_ZlibDecompressor` itself, which has an internal buffer. `_add_read_data` has been inlined, as it was just two calls.

EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization to the `python -m gzip` application. It previously read chunks of `io.DEFAULT_BUFFER_SIZE`, but has now been updated to use `READ_BUFFER_SIZE` chunks.
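As a rough illustration of the chunked reading described above, the sketch below copies a gzip stream to an output file object in `READ_BUFFER_SIZE`-sized reads. It is not the code from this commit: `copy_decompressed` and `CHUNK_SIZE` are hypothetical names, and the `getattr` fallback assumes that interpreters without this change simply lack the constant.

```python
import gzip
import io

# Use the constant added by this change when available; otherwise fall back
# to 128 KiB (assumption: older interpreters do not define the constant).
CHUNK_SIZE = getattr(gzip, "READ_BUFFER_SIZE", 128 * 1024)


def copy_decompressed(src, dst, chunk_size=CHUNK_SIZE):
    """Decompress the gzip stream `src` into `dst`, one chunk at a time.

    Hypothetical helper; returns the number of decompressed bytes written.
    """
    total = 0
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of stream
                break
            dst.write(chunk)
            total += len(chunk)
    return total


# Round-trip some sample data through an in-memory gzip stream.
payload = b"example line\n" * 50_000
compressed = io.BytesIO(gzip.compress(payload))
restored = io.BytesIO()
written = copy_decompressed(compressed, restored)
```

The same loop works on real files opened in binary mode. Any speedup from the larger chunk size depends on the interpreter version and the data, so it is worth measuring rather than assuming.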
Diffstat (limited to '.github/CODEOWNERS')
0 files changed, 0 insertions, 0 deletions