diff options
Diffstat (limited to 'Utilities/cmlibarchive/libarchive/libarchive-formats.5')
-rw-r--r-- | Utilities/cmlibarchive/libarchive/libarchive-formats.5 | 361 |
1 files changed, 361 insertions, 0 deletions
diff --git a/Utilities/cmlibarchive/libarchive/libarchive-formats.5 b/Utilities/cmlibarchive/libarchive/libarchive-formats.5 new file mode 100644 index 0000000..bd6bfe7 --- /dev/null +++ b/Utilities/cmlibarchive/libarchive/libarchive-formats.5 @@ -0,0 +1,361 @@ +.\" Copyright (c) 2003-2009 Tim Kientzle +.\" All rights reserved. +.\" +.\" Redistribution and use in source and binary forms, with or without +.\" modification, are permitted provided that the following conditions +.\" are met: +.\" 1. Redistributions of source code must retain the above copyright +.\" notice, this list of conditions and the following disclaimer. +.\" 2. Redistributions in binary form must reproduce the above copyright +.\" notice, this list of conditions and the following disclaimer in the +.\" documentation and/or other materials provided with the distribution. +.\" +.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE +.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS +.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF +.\" SUCH DAMAGE. +.\" +.\" $FreeBSD: head/lib/libarchive/libarchive-formats.5 201077 2009-12-28 01:50:23Z kientzle $ +.\" +.Dd December 27, 2009 +.Dt libarchive-formats 5 +.Os +.Sh NAME +.Nm libarchive-formats +.Nd archive formats supported by the libarchive library +.Sh DESCRIPTION +The +.Xr libarchive 3 +library reads and writes a variety of streaming archive formats. +Generally speaking, all of these archive formats consist of a series of +.Dq entries . +Each entry stores a single file system object, such as a file, directory, +or symbolic link. +.Pp +The following provides a brief description of each format supported +by libarchive, with some information about recognized extensions or +limitations of the current library support. +Note that just because a format is supported by libarchive does not +imply that a program that uses libarchive will support that format. +Applications that use libarchive specify which formats they wish +to support, though many programs do use libarchive convenience +functions to enable all supported formats. +.Ss Tar Formats +The +.Xr libarchive 3 +library can read most tar archives. +However, it only writes POSIX-standard +.Dq ustar +and +.Dq pax interchange +formats. +.Pp +All tar formats store each entry in one or more 512-byte records. +The first record is used for file metadata, including filename, +timestamp, and mode information, and the file data is stored in +subsequent records. +Later variants have extended this by either appropriating undefined +areas of the header record, extending the header to multiple records, +or by storing special entries that modify the interpretation of +subsequent entries. +.Pp +.Bl -tag -width indent +.It Cm gnutar +The +.Xr libarchive 3 +library can read GNU-format tar archives. +It currently supports the most popular GNU extensions, including +modern long filename and linkname support, as well as atime and ctime data. +The libarchive library does not support multi-volume +archives, nor the old GNU long filename format. +It can read GNU sparse file entries, including the new POSIX-based +formats, but cannot write GNU sparse file entries. +.It Cm pax +The +.Xr libarchive 3 +library can read and write POSIX-compliant pax interchange format +archives. +Pax interchange format archives are an extension of the older ustar +format that adds a separate entry with additional attributes stored +as key/value pairs immediately before each regular entry. +The presence of these additional entries is the only difference between +pax interchange format and the older ustar format. +The extended attributes are of unlimited length and are stored +as UTF-8 Unicode strings. +Keywords defined in the standard are in all lowercase; vendors are allowed +to define custom keys by preceding them with the vendor name in all uppercase. +When writing pax archives, libarchive uses many of the SCHILY keys +defined by Joerg Schilling's +.Dq star +archiver and a few LIBARCHIVE keys. +The libarchive library can read most of the SCHILY keys +and most of the GNU keys introduced by GNU tar. +It silently ignores any keywords that it does not understand. +.It Cm restricted pax +The libarchive library can also write pax archives in which it +attempts to suppress the extended attributes entry whenever +possible. +The result will be identical to a ustar archive unless the +extended attributes entry is required to store a long file +name, long linkname, extended ACL, file flags, or if any of the standard +ustar data (user name, group name, UID, GID, etc) cannot be fully +represented in the ustar header. +In all cases, the result can be dearchived by any program that +can read POSIX-compliant pax interchange format archives. +Programs that correctly read ustar format (see below) will also be +able to read this format; any extended attributes will be extracted as +separate files stored in +.Pa PaxHeader +directories. +.It Cm ustar +The libarchive library can both read and write this format. +This format has the following limitations: +.Bl -bullet -compact +.It +Device major and minor numbers are limited to 21 bits. +Nodes with larger numbers will not be added to the archive. +.It +Path names in the archive are limited to 255 bytes. +(Shorter if there is no / character in exactly the right place.) +.It +Symbolic links and hard links are stored in the archive with +the name of the referenced file. +This name is limited to 100 bytes. +.It +Extended attributes, file flags, and other extended +security information cannot be stored. +.It +Archive entries are limited to 8 gigabytes in size. +.El +Note that the pax interchange format has none of these restrictions. +.El +.Pp +The libarchive library also reads a variety of commonly-used extensions to +the basic tar format. +These extensions are recognized automatically whenever they appear. +.Bl -tag -width indent +.It Numeric extensions. +The POSIX standards require fixed-length numeric fields to be written with +some character position reserved for terminators. +Libarchive allows these fields to be written without terminator characters. +This extends the allowable range; in particular, ustar archives with this +extension can support entries up to 64 gigabytes in size. +Libarchive also recognizes base-256 values in most numeric fields. +This essentially removes all limitations on file size, modification time, +and device numbers. +.It Solaris extensions +Libarchive recognizes ACL and extended attribute records written +by Solaris tar. +Currently, libarchive only has support for old-style ACLs; the +newer NFSv4 ACLs are recognized but discarded. +.El +.Pp +The first tar program appeared in Seventh Edition Unix in 1979. +The first official standard for the tar file format was the +.Dq ustar +(Unix Standard Tar) format defined by POSIX in 1988. +POSIX.1-2001 extended the ustar format to create the +.Dq pax interchange +format. +.Ss Cpio Formats +The libarchive library can read a number of common cpio variants and can write +.Dq odc +and +.Dq newc +format archives. +A cpio archive stores each entry as a fixed-size header followed +by a variable-length filename and variable-length data. +Unlike the tar format, the cpio format does only minimal padding +of the header or file data. +There are several cpio variants, which differ primarily in +how they store the initial header: some store the values as +octal or hexadecimal numbers in ASCII, others as binary values of +varying byte order and length. +.Bl -tag -width indent +.It Cm binary +The libarchive library transparently reads both big-endian and little-endian +variants of the original binary cpio format. +This format used 32-bit binary values for file size and mtime, +and 16-bit binary values for the other fields. +.It Cm odc +The libarchive library can both read and write this +POSIX-standard format, which is officially known as the +.Dq cpio interchange format +or the +.Dq octet-oriented cpio archive format +and sometimes unofficially referred to as the +.Dq old character format . +This format stores the header contents as octal values in ASCII. +It is standard, portable, and immune from byte-order confusion. +File sizes and mtime are limited to 33 bits (8GB file size), +other fields are limited to 18 bits. +.It Cm SVR4 +The libarchive library can read both CRC and non-CRC variants of +this format. +The SVR4 format uses eight-digit hexadecimal values for +all header fields. +This limits file size to 4GB, and also limits the mtime and +other fields to 32 bits. +The SVR4 format can optionally include a CRC of the file +contents, although libarchive does not currently verify this CRC. +.El +.Pp +Cpio first appeared in PWB/UNIX 1.0, which was released within +AT&T in 1977. +PWB/UNIX 1.0 formed the basis of System III Unix, released outside +of AT&T in 1981. +This makes cpio older than tar, although cpio was not included +in Version 7 AT&T Unix. +As a result, the tar command became much better known in universities +and research groups that used Version 7. +The combination of the +.Nm find +and +.Nm cpio +utilities provided very precise control over file selection. +Unfortunately, the format has many limitations that make it unsuitable +for widespread use. +Only the POSIX format permits files over 4GB, and its 18-bit +limit for most other fields makes it unsuitable for modern systems. +In addition, cpio formats only store numeric UID/GID values (not +usernames and group names), which can make it very difficult to correctly +transfer archives across systems with dissimilar user numbering. +.Ss Shar Formats +A +.Dq shell archive +is a shell script that, when executed on a POSIX-compliant +system, will recreate a collection of file system objects. +The libarchive library can write two different kinds of shar archives: +.Bl -tag -width indent +.It Cm shar +The traditional shar format uses a limited set of POSIX +commands, including +.Xr echo 1 , +.Xr mkdir 1 , +and +.Xr sed 1 . +It is suitable for portably archiving small collections of plain text files. +However, it is not generally well-suited for large archives +(many implementations of +.Xr sh 1 +have limits on the size of a script) nor should it be used with non-text files. +.It Cm shardump +This format is similar to shar but encodes files using +.Xr uuencode 1 +so that the result will be a plain text file regardless of the file contents. +It also includes additional shell commands that attempt to reproduce as +many file attributes as possible, including owner, mode, and flags. +The additional commands used to restore file attributes make +shardump archives less portable than plain shar archives. +.El +.Ss ISO9660 format +Libarchive can read and extract from files containing ISO9660-compliant +CDROM images. +In many cases, this can remove the need to burn a physical CDROM +just in order to read the files contained in an ISO9660 image. +It also avoids security and complexity issues that come with +virtual mounts and loopback devices. +Libarchive supports the most common Rockridge extensions and has partial +support for Joliet extensions. +If both extensions are present, the Joliet extensions will be +used and the Rockridge extensions will be ignored. +In particular, this can create problems with hardlinks and symlinks, +which are supported by Rockridge but not by Joliet. +.Ss Zip format +Libarchive can read and write zip format archives that have +uncompressed entries and entries compressed with the +.Dq deflate +algorithm. +Older zip compression algorithms are not supported. +It can extract jar archives, archives that use Zip64 extensions and many +self-extracting zip archives. +Libarchive reads Zip archives as they are being streamed, +which allows it to read archives of arbitrary size. +It currently does not use the central directory; this +limits libarchive's ability to support some self-extracting +archives and ones that have been modified in certain ways. +.Ss Archive (library) file format +The Unix archive format (commonly created by the +.Xr ar 1 +archiver) is a general-purpose format which is +used almost exclusively for object files to be +read by the link editor +.Xr ld 1 . +The ar format has never been standardised. +There are two common variants: +the GNU format derived from SVR4, +and the BSD format, which first appeared in 4.4BSD. +The two differ primarily in their handling of filenames +longer than 15 characters: +the GNU/SVR4 variant writes a filename table at the beginning of the archive; +the BSD format stores each long filename in an extension +area adjacent to the entry. +Libarchive can read both extensions, +including archives that may include both types of long filenames. +Programs using libarchive can write GNU/SVR4 format +if they provide a filename table to be written into +the archive before any of the entries. +Any entries whose names are not in the filename table +will be written using BSD-style long filenames. +This can cause problems for programs such as +GNU ld that do not support the BSD-style long filenames. +.Ss mtree +Libarchive can read and write files in +.Xr mtree 5 +format. +This format is not a true archive format, but rather a textual description +of a file hierarchy in which each line specifies the name of a file and +provides specific metadata about that file. +Libarchive can read all of the keywords supported by both +the NetBSD and FreeBSD versions of +.Xr mtree 1 , +although many of the keywords cannot currently be stored in an +.Tn archive_entry +object. +When writing, libarchive supports use of the +.Xr archive_write_set_options 3 +interface to specify which keywords should be included in the +output. +If libarchive was compiled with access to suitable +cryptographic libraries (such as the OpenSSL libraries), +it can compute hash entries such as +.Cm sha512 +or +.Cm md5 +from file data being written to the mtree writer. +.Pp +When reading an mtree file, libarchive will locate the corresponding +files on disk using the +.Cm contents +keyword if present or the regular filename. +If it can locate and open the file on disk, it will use that +to fill in any metadata that is missing from the mtree file +and will read the file contents and return those to the program +using libarchive. +If it cannot locate and open the file on disk, libarchive +will return an error for any attempt to read the entry +body. +.Ss RAR +libarchive has limited support to read files in RAR format. Currently, +libarchive can read single RAR files in RARv3 format which have been either +created uncompressed, or compressed using any of the compression methods +supported by the RARv3 format. libarchive can also extract RAR files which have +been created as self-extracting RAR files. +.Sh SEE ALSO +.Xr ar 1 , +.Xr cpio 1 , +.Xr mkisofs 1 , +.Xr shar 1 , +.Xr tar 1 , +.Xr zip 1 , +.Xr zlib 3 , +.Xr cpio 5 , +.Xr mtree 5 , +.Xr tar 5 |