author: Serhiy Storchaka <storchaka@gmail.com> 2022-03-22 09:52:55 (GMT)
committer: GitHub <noreply@github.com> 2022-03-22 09:52:55 (GMT)
commit: a25a985535ccbb7df8caddc0017550ff4eae5855
tree: 44d9b90b89baba06da89b3f044b69aa40edd03c7 /Doc
parent: c6cd3cc93c40363ce704d34a70e6fb73ea1d97a3
bpo-28080: Add support for the fallback encoding in ZIP files (GH-32007)
* Add the metadata_encoding parameter in the zipfile.ZipFile constructor.
* Add the --metadata-encoding option in the zipfile CLI.

Co-authored-by: Stephen J. Turnbull <stephen@xemacs.org>
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/zipfile.rst 41
-rw-r--r-- Doc/whatsnew/3.11.rst 6
2 files changed, 46 insertions, 1 deletions
diff --git a/Doc/library/zipfile.rst b/Doc/library/zipfile.rst
index 9d0d894..bfcc883 100644
--- a/Doc/library/zipfile.rst
+++ b/Doc/library/zipfile.rst
@@ -139,7 +139,8 @@ ZipFile Objects
.. class:: ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True, \
- compresslevel=None, *, strict_timestamps=True)
+ compresslevel=None, *, strict_timestamps=True,
+ metadata_encoding=None)
Open a ZIP file, where *file* can be a path to a file (a string), a
file-like object or a :term:`path-like object`.
@@ -183,6 +184,10 @@ ZipFile Objects
Similar behavior occurs with files newer than 2107-12-31,
the timestamp is also set to the limit.
+ When mode is ``'r'``, *metadata_encoding* may be set to the name of a codec,
+ which will be used to decode metadata such as the names of members and ZIP
+ comments.
+
If the file is created with mode ``'w'``, ``'x'`` or ``'a'`` and then
:meth:`closed <close>` without adding any files to the archive, the appropriate
ZIP structures for an empty archive will be written to the file.
@@ -194,6 +199,19 @@ ZipFile Objects
with ZipFile('spam.zip', 'w') as myzip:
myzip.write('eggs.txt')
+ .. note::
+
+ *metadata_encoding* is an instance-wide setting for the ZipFile.
+ It is not currently possible to set this on a per-member basis.
+
+ This attribute is a workaround for legacy implementations which produce
+ archives with names in the current locale encoding or code page (mostly
+ on Windows). According to the .ZIP standard, the encoding of metadata
+ may be specified to be either IBM code page (default) or UTF-8 by a flag
+ in the archive header.
+ That flag takes precedence over *metadata_encoding*, which is
+ a Python-specific extension.
+
.. versionadded:: 3.2
Added the ability to use :class:`ZipFile` as a context manager.
@@ -220,6 +238,10 @@ ZipFile Objects
.. versionadded:: 3.8
The *strict_timestamps* keyword-only argument
+ .. versionchanged:: 3.11
+ Added support for specifying member name encoding for reading
+ metadata in the zipfile's directory and file headers.
+
.. method:: ZipFile.close()
@@ -397,6 +419,15 @@ ZipFile Objects
.. note::
+ The ZIP file standard historically did not specify a metadata encoding,
+ but strongly recommended CP437 (the original IBM PC encoding) for
+ interoperability. Recent versions allow use of UTF-8 (only). In this
+ module, UTF-8 will automatically be used to write the member names if
+ they contain any non-ASCII characters. It is not possible to write
+ member names in any encoding other than ASCII or UTF-8.
+
+ .. note::
+
Archive names should be relative to the archive root, that is, they should not
start with a path separator.
@@ -868,6 +899,14 @@ Command-line options
Test whether the zipfile is valid or not.
+.. cmdoption:: --metadata-encoding <encoding>
+
+ Specify encoding of member names for :option:`-l`, :option:`-e` and
+ :option:`-t`.
+
+ .. versionadded:: 3.11
+
+
Decompression pitfalls
----------------------
diff --git a/Doc/whatsnew/3.11.rst b/Doc/whatsnew/3.11.rst
index 96db3a9..938a573 100644
--- a/Doc/whatsnew/3.11.rst
+++ b/Doc/whatsnew/3.11.rst
@@ -432,6 +432,12 @@ venv
Third party code that also creates new virtual environments should do the same.
(Contributed by Miro Hrončok in :issue:`45413`.)
+zipfile
+-------
+
+* Added support for specifying member name encoding for reading
+ metadata in the zipfile's directory and file headers.
+ (Contributed by Stephen J. Turnbull and Serhiy Storchaka in :issue:`28080`.)
fcntl
-----