path: root/Lib/gzip.py
Commit message  Author  Age  Files  Lines
* gh-95534: Improve gzip reading speed by 10% (#97664)  Ruben Vorderman  2022-10-17  1  -12/+12
|   Change summary:
|   + There is now a `gzip.READ_BUFFER_SIZE` constant that is 128KB. Other programs that read in 128KB chunks: pigz and cat. So this seems best practice among good programs. Also it is faster than 8 kb chunks.
|   + a zlib._ZlibDecompressor was added. This is the _bz2.BZ2Decompressor ported to zlib. Since the zlib.Decompress object is better for in-memory decompression, the _ZlibDecompressor is hidden. It only makes sense in file decompression, and that is already implemented now in the gzip library. No need to bother the users with this.
|   + The ZlibDecompressor uses the older Cpython arrange_output_buffer functions, as those are faster and more appropriate for the use case.
|   + GzipFile.read has been optimized. There is no longer a `unconsumed_tail` member to write back to padded file. This is instead handled by the ZlibDecompressor itself, which has an internal buffer. `_add_read_data` has been inlined, as it was just two calls.
|   EDIT: While I am adding improvements anyway, I figured I could add another one-liner optimization now to the python -m gzip application. That read chunks in io.DEFAULT_BUFFER_SIZE previously, but has been updated now to use READ_BUFFER_SIZE chunks.
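The chunked reading described in the entry above can be sketched roughly as follows. This is only an illustration, not the code from the commit: `copy_decompressed` and `CHUNK_SIZE` are hypothetical names, and the 128 KiB fallback is an assumption for interpreters that do not expose `gzip.READ_BUFFER_SIZE`.

```python
import gzip
import io

# gzip.READ_BUFFER_SIZE is the constant named in the commit message. It may be
# missing on older interpreters, so fall back to 128 KiB (an assumption).
CHUNK_SIZE = getattr(gzip, "READ_BUFFER_SIZE", 128 * 1024)


def copy_decompressed(src, dst, chunk_size=CHUNK_SIZE):
    """Decompress the gzip stream `src` into `dst`, one chunk at a time."""
    total = 0
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:  # an empty read signals end of stream
                break
            dst.write(chunk)
            total += len(chunk)
    return total


payload = b"example line\n" * 50_000
out = io.BytesIO()
copied = copy_decompressed(io.BytesIO(gzip.compress(payload)), out)
```

Reading in fixed-size chunks keeps memory use bounded no matter how large the decompressed stream is.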
* gh-94196: Remove gzip.GzipFile.filename attribute (#94197)  Victor Stinner  2022-06-24  1  -8/+0
|   gzip: Remove the filename attribute of gzip.GzipFile, deprecated since Python 2.6, use the name attribute instead. In write mode, the filename attribute added '.gz' file extension if it was not present.
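A small sketch of the replacement suggested in the entry above: reading `name` instead of the removed `filename` attribute. The file name `example.txt.gz` is made up for the illustration.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt.gz")  # hypothetical file name
    with gzip.GzipFile(path, mode="wb") as gz:
        gz.write(b"hello")
        # `name` is the attribute the commit message points to.
        reported_name = gz.name
```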
* gh-90839: Forward gzip.compress() compresslevel to zlib (gh-31215)  Ilya Leoshkevich  2022-04-12  1  -1/+2
|
* bpo-45507: EOFErrors should be thrown for truncated gzip members (GH-29029)  Ruben Vorderman  2021-11-19  1  -0/+3
|
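To illustrate the behaviour named in the bpo-45507 entry, the sketch below cuts bytes off a valid gzip member and checks for `EOFError`. The helper `is_truncated` is a hypothetical name, not part of the module.

```python
import gzip


def is_truncated(blob):
    """Return True if decompressing `blob` ends before the member is complete."""
    try:
        gzip.decompress(blob)
    except EOFError:
        return True
    return False


complete = gzip.compress(b"payload " * 1000)
cut_in_trailer = complete[:-4]            # part of the 8-byte trailer missing
cut_in_body = complete[: len(complete) // 2]  # compressed data cut short
```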
* bpo-45475: Revert `__iter__` optimization for GzipFile, BZ2File, and LZMAFile. (GH-29016)  Inada Naoki  2021-10-19  1  -4/+0
|   This reverts commit d2a8e69c2c605fbaa3656a5f99aa8d295f74c80e.
* bpo-43613: Faster implementation of gzip.compress and gzip.decompress (GH-27941)  Ruben Vorderman  2021-09-02  1  -53/+108
|   Co-authored-by: Łukasz Langa <lukasz@langa.pl>
* bpo-44439: BZ2File.write() / LZMAFile.write() handle buffer protocol correctly (GH-26764)  Ma Lin  2021-06-22  1  -1/+1
|   No longer use len() to get the length of the input data. For some buffer protocol objects, the length obtained by using len() is wrong.
* Fix typo in comment (GH-26162)  Ashwin Ramaswami  2021-05-16  1  -1/+1
|
* bpo-43787: Add __iter__ to GzipFile, BZ2File, and LZMAFile (GH-25353)  Inada Naoki  2021-04-13  1  -0/+4
|
* bpo-43510: Implement PEP 597 opt-in EncodingWarning. (GH-19481)  Inada Naoki  2021-03-29  1  -0/+1
|   See [PEP 597](https://www.python.org/dev/peps/pep-0597/).
|   * Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
|   * Add EncodingWarning
|   * Add io.text_encoding()
|   * open(), TextIOWrapper() emits EncodingWarning when encoding is omitted and warn_default_encoding is enabled.
|   * _pyio.TextIOWrapper() uses UTF-8 as fallback default encoding used when failed to import locale module. (used during building Python)
|   * bz2, configparser, gzip, lzma, pathlib, tempfile modules use io.text_encoding().
|   * What's new entry
* bpo-43317: Use io.DEFAULT_BUFFER_SIZE instead of 1024 in gzip CLI (#24645)  Ruben Vorderman  2021-02-26  1  -1/+1
|   This improves the performance slightly.
* bpo-43316: gzip: Fix sys.exit() usage. (GH-24652)  Inada Naoki  2021-02-26  1  -1/+1
|
* bpo-43316: gzip: CLI uses non-zero return code on error. (GH-24647)  Ruben Vorderman  2021-02-25  1  -2/+1
|   Exit code is now 1 instead of 0. A message is printed to stderr instead of stdout. This is the proper behaviour for a tool that can be used in scripts.
* bpo-39389: gzip: fix compression level metadata (GH-18077)  William Chargin  2020-01-21  1  -3/+9
|   As described in RFC 1952, section 2.3.1, the XFL (eXtra FLags) byte of a gzip member header should indicate whether the DEFLATE algorithm was tuned for speed or compression ratio. Prior to this patch, archives emitted by the `gzip` module always indicated maximum compression.
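The XFL byte mentioned in the entry above can be inspected directly. In the fixed RFC 1952 header layout (ID1, ID2, CM, FLG, four MTIME bytes, XFL, OS) it sits at offset 8; RFC 1952 defines the value 2 for maximum compression and 4 for the fastest algorithm. The helper name `xfl_byte` is hypothetical, and the sketch assumes an interpreter that includes this fix.

```python
import gzip

XFL_OFFSET = 8  # ID1, ID2, CM, FLG, MTIME (4 bytes), then XFL


def xfl_byte(compresslevel):
    """Return the XFL header byte of a member written at `compresslevel`."""
    member = gzip.compress(b"data", compresslevel=compresslevel)
    return member[XFL_OFFSET]


xfl_best = xfl_byte(9)  # maximum compression
xfl_fast = xfl_byte(1)  # fastest compression
```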
* bpo-28286: Deprecate opening GzipFile for writing implicitly. (GH-16417)  Serhiy Storchaka  2019-11-16  1  -0/+8
|   Always specify the mode argument for writing.
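Following the advice in the entry above, a writer can pass `mode` explicitly instead of letting `GzipFile` infer it from the underlying file object. A minimal sketch using an in-memory buffer:

```python
import gzip
import io

buf = io.BytesIO()
# Pass mode="wb" explicitly rather than relying on implicit detection.
with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(b"explicit mode")

roundtrip = gzip.decompress(buf.getvalue())
```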
* bpo-6584: Add a BadGzipFile exception to the gzip module. (GH-13022)  Zackery Spytz  2019-05-13  1  -6/+11
|   Co-Authored-By: Filip Gruszczyński <gruszczy@gmail.com>
|   Co-Authored-By: Michele Orrù <maker@tumbolandia.net>
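A short sketch of how the `BadGzipFile` exception from the entry above can be used to reject input that is not gzip data. The helper `looks_like_gzip` is a hypothetical name; truncated but otherwise valid input would raise `EOFError` instead and is not handled here.

```python
import gzip


def looks_like_gzip(blob):
    """Return False if `blob` is rejected as not being gzip data."""
    try:
        gzip.decompress(blob)
    except gzip.BadGzipFile:
        return False
    return True


valid = gzip.compress(b"x")
invalid = b"plain text, not gzip"
```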
* fix typo in gzip.py (GH-12928)  Maximilian Nöthe  2019-04-24  1  -1/+1
|
* bpo-34898: Add mtime parameter to gzip.compress(). (GH-9704)  guoci  2018-11-07  1  -2/+2
|   Without setting mtime, time.time() will be used as the timestamp which will end up in the compressed data and each invocation of the compress() function will vary over time.
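The effect of the `mtime` parameter from the entry above can be sketched as follows: pinning the timestamp makes repeated calls produce identical bytes. The value `0` is just one possible fixed timestamp.

```python
import gzip

data = b"reproducible payload"

# A fixed mtime keeps the header timestamp constant across invocations.
first = gzip.compress(data, mtime=0)
second = gzip.compress(data, mtime=0)

# Bytes 4..7 of the header hold MTIME as a little-endian 32-bit integer.
header_mtime = int.from_bytes(first[4:8], "little")
```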
* bpo-34969: Add --fast, --best on the gzip CLI (GH-9833)  Stéphane Wirtel  2018-11-03  1  -5/+22
|
* bpo-23596: Use argparse for the command line of gzip (GH-9781)  Stéphane Wirtel  2018-10-09  1  -13/+12
|   Co-authored-by: Antony Lee <anntzer.lee@gmail.com>
* Replace KB unit with KiB (#4293)  Victor Stinner  2017-11-08  1  -1/+1
|   kB (*kilo* byte) unit means 1000 bytes, whereas KiB ("kibibyte") means 1024 bytes. KB was misused: replace kB or KB with KiB when appropriate.
|   Same change for MB and GB which become MiB and GiB.
|   Change the output of Tools/iobench/iobench.py.
|   Round also the size of the documentation from 5.5 MB to 5 MiB.
* Issue #28227: gzip now supports pathlib  Berker Peksag  2016-10-02  1  -1/+3
|   Patch by Ethan Furman.
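With the pathlib support from the entry above, `gzip.open()` accepts a `pathlib.Path` directly. A minimal sketch; the file name `notes.txt.gz` is made up for the illustration.

```python
import gzip
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt.gz"  # hypothetical file name
    with gzip.open(target, "wb") as gz:
        gz.write(b"written via pathlib")
    with gzip.open(target, "rb") as gz:
        restored = gz.read()
```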
* Use sequence repetition instead of bytes constructor with integer argument.  Serhiy Storchaka  2016-09-11  1  -2/+2
|
* Fix spelling (inital), grammar (may translates) in documentation, comments  Martin Panter  2016-04-19  1  -1/+1
|
* Issue #22341: Drop Python 2 workaround and document CRC initial value  Martin Panter  2015-12-11  1  -4/+4
|   Also align the parameter naming in binascii to be consistent with zlib.
* Issue #23529: Limit the size of decompressed data when reading from  Antoine Pitrou  2015-04-10  1  -295/+188
|   GzipFile, BZ2File or LZMAFile. This defeats denial of service attacks using compressed bombs (i.e. compressed payloads which decompress to a huge size).
|   Patch by Martin Panter and Nikolaus Rath.
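On the caller's side, the protection described in the entry above pairs naturally with bounded reads. The sketch below is an assumption about how an application might enforce its own cap, not the code from the patch; `exceeds_limit` and `MAX_OUTPUT` are hypothetical names.

```python
import gzip
import io

MAX_OUTPUT = 1024 * 1024  # hypothetical application-level cap: 1 MiB


def exceeds_limit(blob, limit=MAX_OUTPUT, chunk_size=64 * 1024):
    """Return True as soon as the decompressed size of `blob` passes `limit`."""
    total = 0
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)  # never asks for more than chunk_size
            if not chunk:
                return False
            total += len(chunk)
            if total > limit:
                return True


small = gzip.compress(b"ok")
bomb_like = gzip.compress(b"\0" * (4 * 1024 * 1024))  # tiny input, 4 MiB output
```

Because each `read()` call is given a size, the check stops early instead of materialising the whole payload in memory.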
* Issue #23865: close() methods in multiple modules now are idempotent and more  Serhiy Storchaka  2015-04-10  1  -12/+14
|\    robust at shutdown. If needs to release multiple resources, they are released even if errors are occured.
| * Issue #23865: close() methods in multiple modules now are idempotent and more  Serhiy Storchaka  2015-04-10  1  -12/+14
| |   robust at shutdown. If needs to release multiple resources, they are released even if errors are occured.
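The idempotent `close()` from the entry above can be demonstrated with an in-memory buffer: a second call should change nothing. This is a sketch of the observable behaviour only.

```python
import gzip
import io

buf = io.BytesIO()
gz = gzip.GzipFile(fileobj=buf, mode="wb")
gz.write(b"closed twice")

gz.close()
size_after_first_close = len(buf.getvalue())

gz.close()  # second call is expected to be a no-op
size_after_second_close = len(buf.getvalue())
```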
| * Issue #21560: An attempt to write a data of wrong type no longer cause  Serhiy Storchaka  2015-03-23  1  -2/+2
| |   GzipFile corruption. Original patch by Wolfgang Maier.
* | Issue #23688: Added support of arbitrary bytes-like objects and avoided  Serhiy Storchaka  2015-03-23  1  -8/+11
|/    unnecessary copying of memoryview in gzip.GzipFile.write(). Original patch by Wolfgang Maier.
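As an illustration of the bytes-like support in the Issue #23688 entry, the sketch below passes several buffer types to `GzipFile.write()`. The return value is compared against the byte size of the array rather than its element count.

```python
import array
import gzip
import io

buf = io.BytesIO()
numbers = array.array("H", [1, 2, 3])  # three unsigned shorts

with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
    gz.write(b"bytes,")
    gz.write(bytearray(b"bytearray,"))
    gz.write(memoryview(b"memoryview,"))
    # write() reports the number of bytes consumed, not the item count.
    written = gz.write(numbers)

restored = gzip.decompress(buf.getvalue())
```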
* Issue #20875: Merge from 3.3  Ned Deily  2014-03-09  1  -1/+1
|\
| * Issue #20875: Prevent possible gzip "'read' is not defined" NameError.  Ned Deily  2014-03-09  1  -1/+1
| |   Patch by Claudiu Popa.
* | Issue #19222: Add support for the 'x' mode to the gzip module.  Nadeem Vawda  2013-10-18  1  -7/+7
| |   Original patch by Tim Heaney.
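The 'x' (exclusive creation) mode from the Issue #19222 entry refuses to overwrite an existing file. A minimal sketch; the file name `once.gz` is made up for the illustration.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "once.gz")  # hypothetical file name

    # First creation succeeds because the file does not exist yet.
    with gzip.open(path, "xb") as gz:
        gz.write(b"first writer wins")

    # A second exclusive open of the same path is refused.
    try:
        with gzip.open(path, "xb") as gz:
            gz.write(b"should not get here")
    except FileExistsError:
        second_attempt_refused = True
    else:
        second_attempt_refused = False

    with gzip.open(path, "rb") as gz:
        content = gz.read()
```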
* | Issue #18743: Fix references to non-existant "StringIO" module  Serhiy Storchaka  2013-08-29  1  -1/+1
|\ \
| |/
| |   in docstrings and comments.
| * Issue #18743: Fix references to non-existant "StringIO" module  Serhiy Storchaka  2013-08-29  1  -1/+1
| |   in docstrings and comments.
| * Back out patch for #1159051, which caused backwards compatibility problems.  Georg Brandl  2013-05-12  1  -37/+44
| |
* | Close #17666: Fix reading gzip files with an extra field.  Serhiy Storchaka  2013-04-08  1  -1/+2
|\ \
| |/
| * Close #17666: Fix reading gzip files with an extra field.  Serhiy Storchaka  2013-04-08  1  -1/+2
| |
* | Issue #1159051: GzipFile now raises EOFError when reading a corrupted file  Serhiy Storchaka  2013-01-22  1  -44/+37
|\ \
| |/
| |   with truncated header or footer. Added tests for reading truncated gzip, bzip2, and lzma files.
| * Issue #1159051: GzipFile now raises EOFError when reading a corrupted file  Serhiy Storchaka  2013-01-22  1  -44/+37
| |\
| | |   with truncated header or footer. Added tests for reading truncated gzip, bzip2, and lzma files.
| | * Issue #1159051: GzipFile now raises EOFError when reading a corrupted file  Serhiy Storchaka  2013-01-22  1  -38/+34
| | |   with truncated header or footer. Added tests for reading truncated gzip and bzip2 files.
| | * #15546: Fix GzipFile.peek()'s handling of pathological input data.  Serhiy Storchaka  2013-01-22  1  -2/+4
| | |   This is a backport of changeset 8c07ff7f882f.
* | | Replace IOError with OSError (#16715)  Andrew Svetlov  2012-12-25  1  -10/+10
|/ /
* | Issue #15677: Document that zlib and gzip accept a compression level of 0 to mean 'no compression'.  Nadeem Vawda  2012-11-11  1  -3/+4
|\ \
| |/
| |   Patch by Brian Brazil.
| * Issue #15677: Document that zlib and gzip accept a compression level of 0 to mean 'no compression'.  Nadeem Vawda  2012-11-11  1  -3/+4
| |   Patch by Brian Brazil.
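The meaning of compression level 0 documented in the Issue #15677 entry can be checked with a short sketch: the data is stored rather than compressed, so the output is slightly larger than the input because of the gzip header, block framing and trailer.

```python
import gzip

data = b"abc" * 1000

stored = gzip.compress(data, compresslevel=0)  # level 0: no compression
packed = gzip.compress(data, compresslevel=9)  # level 9: best compression

restored = gzip.decompress(stored)
```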
* | Issue #15800: fix the closing of input / output files when gzip is used as a script.  Antoine Pitrou  2012-08-29  1  -2/+2
|\ \
| |/
| * Issue #15800: fix the closing of input / output files when gzip is used as a script.  Antoine Pitrou  2012-08-29  1  -2/+2
| |
* | #15546: Also fix GzipFile.peek().  Nadeem Vawda  2012-08-05  1  -2/+4
| |
* | #15546: Fix {GzipFile,LZMAFile}.read1()'s handling of pathological input data.  Nadeem Vawda  2012-08-05  1  -1/+4
| |
* | Update GzipFile docstring to mention gzip.open()'s new text-mode support.  Nadeem Vawda  2012-06-30  1  -1/+1
| |
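The text-mode support of `gzip.open()` mentioned in the last entry above can be sketched like this. The file name `log.txt.gz` is made up for the illustration, and passing `encoding="utf-8"` explicitly is a choice made here, not something the commit requires.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "log.txt.gz")  # hypothetical file name

    # "wt" / "rt" wrap the binary gzip stream in a text layer.
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        fh.write("first line\nsecond line\n")

    with gzip.open(path, "rt", encoding="utf-8") as fh:
        lines = fh.read().splitlines()
```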