author | Serhiy Storchaka <storchaka@gmail.com> | 2022-03-22 09:52:55 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-03-22 09:52:55 (GMT) |
commit | a25a985535ccbb7df8caddc0017550ff4eae5855 (patch) | |
tree | 44d9b90b89baba06da89b3f044b69aa40edd03c7 /Doc | |
parent | c6cd3cc93c40363ce704d34a70e6fb73ea1d97a3 (diff) | |
bpo-28080: Add support for the fallback encoding in ZIP files (GH-32007)
* Add the metadata_encoding parameter in the zipfile.ZipFile constructor.
* Add the --metadata-encoding option in the zipfile CLI.
Co-authored-by: Stephen J. Turnbull <stephen@xemacs.org>
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/zipfile.rst | 41 | ||||
-rw-r--r-- | Doc/whatsnew/3.11.rst | 6 |
2 files changed, 46 insertions, 1 deletions
diff --git a/Doc/library/zipfile.rst b/Doc/library/zipfile.rst
index 9d0d894..bfcc883 100644
--- a/Doc/library/zipfile.rst
+++ b/Doc/library/zipfile.rst
@@ -139,7 +139,8 @@ ZipFile Objects
 
 
 .. class:: ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
-                   compresslevel=None, *, strict_timestamps=True)
+                   compresslevel=None, *, strict_timestamps=True,
+                   metadata_encoding=None)
 
    Open a ZIP file, where *file* can be a path to a file (a string), a
    file-like object or a :term:`path-like object`.
@@ -183,6 +184,10 @@ ZipFile Objects
    Similar behavior occurs with files newer than 2107-12-31,
    the timestamp is also set to the limit.
 
+   When mode is ``'r'``, *metadata_encoding* may be set to the name of a codec,
+   which will be used to decode metadata such as the names of members and ZIP
+   comments.
+
    If the file is created with mode ``'w'``, ``'x'`` or ``'a'`` and then
    :meth:`closed <close>` without adding any files to the archive, the appropriate
    ZIP structures for an empty archive will be written to the file.
@@ -194,6 +199,19 @@ ZipFile Objects
       with ZipFile('spam.zip', 'w') as myzip:
           myzip.write('eggs.txt')
 
+   .. note::
+
+      *metadata_encoding* is an instance-wide setting for the ZipFile.
+      It is not currently possible to set this on a per-member basis.
+
+      This attribute is a workaround for legacy implementations which produce
+      archives with names in the current locale encoding or code page (mostly
+      on Windows). According to the .ZIP standard, the encoding of metadata
+      may be specified to be either IBM code page (default) or UTF-8 by a flag
+      in the archive header.
+      That flag takes precedence over *metadata_encoding*, which is
+      a Python-specific extension.
+
    .. versionadded:: 3.2
       Added the ability to use :class:`ZipFile` as a context manager.
 
@@ -220,6 +238,10 @@ ZipFile Objects
    .. versionadded:: 3.8
       The *strict_timestamps* keyword-only argument
 
+   .. versionchanged:: 3.11
+      Added support for specifying member name encoding for reading
+      metadata in the zipfile's directory and file headers.
+
 
 .. method:: ZipFile.close()
 
@@ -397,6 +419,15 @@ ZipFile Objects
 
    .. note::
 
+      The ZIP file standard historically did not specify a metadata encoding,
+      but strongly recommended CP437 (the original IBM PC encoding) for
+      interoperability. Recent versions allow use of UTF-8 (only). In this
+      module, UTF-8 will automatically be used to write the member names if
+      they contain any non-ASCII characters. It is not possible to write
+      member names in any encoding other than ASCII or UTF-8.
+
+   .. note::
+
       Archive names should be relative to the archive root, that is, they should not
       start with a path separator.
 
@@ -868,6 +899,14 @@ Command-line options
 
    Test whether the zipfile is valid or not.
 
+.. cmdoption:: --metadata-encoding <encoding>
+
+   Specify encoding of member names for :option:`-l`, :option:`-e` and
+   :option:`-t`.
+
+   .. versionadded:: 3.11
+
+
 Decompression pitfalls
 ----------------------
 
diff --git a/Doc/whatsnew/3.11.rst b/Doc/whatsnew/3.11.rst
index 96db3a9..938a573 100644
--- a/Doc/whatsnew/3.11.rst
+++ b/Doc/whatsnew/3.11.rst
@@ -432,6 +432,12 @@ venv
   Third party code that also creates new virtual environments should do the same.
   (Contributed by Miro Hrončok in :issue:`45413`.)
 
+zipfile
+-------
+
+* Added support for specifying member name encoding for reading
+  metadata in the zipfile's directory and file headers.
+  (Contributed by Stephen J. Turnbull and Serhiy Storchaka in :issue:`28080`.)
 
 fcntl
 -----