diff options
Diffstat (limited to 'Doc/library/tarfile.rst')
-rw-r--r-- | Doc/library/tarfile.rst | 99 |
1 files changed, 76 insertions, 23 deletions
diff --git a/Doc/library/tarfile.rst b/Doc/library/tarfile.rst index d5a511e..9b7071b 100644 --- a/Doc/library/tarfile.rst +++ b/Doc/library/tarfile.rst @@ -8,6 +8,9 @@ .. moduleauthor:: Lars Gustäbel <lars@gustaebel.de> .. sectionauthor:: Lars Gustäbel <lars@gustaebel.de> +**Source code:** :source:`Lib/tarfile.py` + +-------------- The :mod:`tarfile` module makes it possible to read and write tar archives, including those using gzip or bz2 compression. @@ -20,7 +23,8 @@ Some facts and figures: * read/write support for the POSIX.1-1988 (ustar) format. * read/write support for the GNU tar format including *longname* and *longlink* - extensions, read-only support for the *sparse* extension. + extensions, read-only support for all variants of the *sparse* extension + including restoration of sparse files. * read/write support for the POSIX.1-2001 (pax) format. @@ -185,8 +189,8 @@ The following variables are available on module level: .. data:: ENCODING - The default character encoding i.e. the value from either - :func:`sys.getfilesystemencoding` or :func:`sys.getdefaultencoding`. + The default character encoding: ``'utf-8'`` on Windows, + :func:`sys.getfilesystemencoding` otherwise. .. seealso:: @@ -209,8 +213,16 @@ a header block followed by data blocks. It is possible to store a file in a tar archive several times. Each archive member is represented by a :class:`TarInfo` object, see :ref:`tarinfo-objects` for details. +A :class:`TarFile` object can be used as a context manager in a :keyword:`with` +statement. It will automatically be closed when the block is completed. Please +note that in the event of an exception an archive opened for writing will not +be finalized; only the internally used file object will be closed. See the +:ref:`tar-examples` section for a use case. -.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors=None, pax_headers=None, debug=0, errorlevel=0) +.. versionadded:: 3.2 + Added support for the context manager protocol. + +.. class:: TarFile(name=None, mode='r', fileobj=None, format=DEFAULT_FORMAT, tarinfo=TarInfo, dereference=False, ignore_zeros=False, encoding=ENCODING, errors='surrogateescape', pax_headers=None, debug=0, errorlevel=0) All following arguments are optional and can be accessed as instance attributes as well. @@ -259,6 +271,9 @@ object, see :ref:`tarinfo-objects` for details. to be handled. The default settings will work for most users. See section :ref:`tar-unicode` for in-depth information. + .. versionchanged:: 3.2 + Use ``'surrogateescape'`` as the default for the *errors* argument. + The *pax_headers* argument is an optional dictionary of strings which will be added as a pax global header if *format* is :const:`PAX_FORMAT`. @@ -324,12 +339,13 @@ object, see :ref:`tarinfo-objects` for details. dots ``".."``. -.. method:: TarFile.extract(member, path="") +.. method:: TarFile.extract(member, path="", set_attrs=True) Extract a member from the archive to the current working directory, using its full name. Its file information is extracted as accurately as possible. *member* may be a filename or a :class:`TarInfo` object. You can specify a different - directory using *path*. + directory using *path*. File attributes (owner, mtime, mode) are set unless + *set_attrs* is False. .. note:: @@ -340,6 +356,8 @@ object, see :ref:`tarinfo-objects` for details. See the warning for :meth:`extractall`. + .. versionchanged:: 3.2 + Added the *set_attrs* parameter. .. method:: TarFile.extractfile(member) @@ -355,15 +373,27 @@ object, see :ref:`tarinfo-objects` for details. and :meth:`close`, and also supports iteration over its lines. -.. method:: TarFile.add(name, arcname=None, recursive=True, exclude=None) +.. method:: TarFile.add(name, arcname=None, recursive=True, exclude=None, *, filter=None) - Add the file *name* to the archive. *name* may be any type of file (directory, - fifo, symbolic link, etc.). If given, *arcname* specifies an alternative name - for the file in the archive. Directories are added recursively by default. This - can be avoided by setting *recursive* to :const:`False`. If *exclude* is given, - it must be a function that takes one filename argument and returns a boolean - value. Depending on this value the respective file is either excluded - (:const:`True`) or added (:const:`False`). + Add the file *name* to the archive. *name* may be any type of file + (directory, fifo, symbolic link, etc.). If given, *arcname* specifies an + alternative name for the file in the archive. Directories are added + recursively by default. This can be avoided by setting *recursive* to + :const:`False`. If *exclude* is given, it must be a function that takes one + filename argument and returns a boolean value. Depending on this value the + respective file is either excluded (:const:`True`) or added + (:const:`False`). If *filter* is specified it must be a keyword argument. It + should be a function that takes a :class:`TarInfo` object argument and + returns the changed :class:`TarInfo` object. If it instead returns + :const:`None` the :class:`TarInfo` object will be excluded from the + archive. See :ref:`tar-examples` for an example. + + .. versionchanged:: 3.2 + Added the *filter* parameter. + + .. deprecated:: 3.2 + The *exclude* parameter is deprecated, please use the *filter* parameter + instead. .. method:: TarFile.addfile(tarinfo, fileobj=None) @@ -430,11 +460,14 @@ It does *not* contain the file's data itself. a :class:`TarInfo` object. -.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='strict') +.. method:: TarInfo.tobuf(format=DEFAULT_FORMAT, encoding=ENCODING, errors='surrogateescape') Create a string buffer from a :class:`TarInfo` object. For information on the arguments see the constructor of the :class:`TarFile` class. + .. versionchanged:: 3.2 + Use ``'surrogateescape'`` as the default for the *errors* argument. + A ``TarInfo`` object has the following public data attributes: @@ -582,6 +615,13 @@ How to create an uncompressed tar archive from a list of filenames:: tar.add(name) tar.close() +The same example using the :keyword:`with` statement:: + + import tarfile + with tarfile.open("sample.tar", "w") as tar: + for name in ["foo", "bar", "quux"]: + tar.add(name) + How to read a gzip compressed tar archive and display some member information:: import tarfile @@ -596,6 +636,18 @@ How to read a gzip compressed tar archive and display some member information:: print("something else.") tar.close() +How to create an archive and reset the user information using the *filter* +parameter in :meth:`TarFile.add`:: + + import tarfile + def reset(tarinfo): + tarinfo.uid = tarinfo.gid = 0 + tarinfo.uname = tarinfo.gname = "root" + return tarinfo + tar = tarfile.open("sample.tar.gz", "w:gz") + tar.add("foo", filter=reset) + tar.close() + .. _tar-formats: @@ -663,11 +715,12 @@ metadata must be either decoded or encoded. If *encoding* is not set appropriately, this conversion may fail. The *errors* argument defines how characters are treated that cannot be -converted. Possible values are listed in section :ref:`codec-base-classes`. In -read mode the default scheme is ``'replace'``. This avoids unexpected -:exc:`UnicodeError` exceptions and guarantees that an archive can always be -read. In write mode the default value for *errors* is ``'strict'``. This -ensures that name information is not altered unnoticed. - -In case of writing :const:`PAX_FORMAT` archives, *encoding* is ignored because -non-ASCII metadata is stored using *UTF-8*. +converted. Possible values are listed in section :ref:`codec-base-classes`. +The default scheme is ``'surrogateescape'`` which Python also uses for its +file system calls, see :ref:`os-filenames`. + +In case of :const:`PAX_FORMAT` archives, *encoding* is generally not needed +because all the metadata is stored using *UTF-8*. *encoding* is only used in +the rare cases when binary pax headers are decoded or when strings with +surrogate characters are stored. + |