summaryrefslogtreecommitdiffstats
path: root/Lib/test/test_tarfile.py
diff options
context:
space:
mode:
authorGuido van Rossum <guido@python.org>2007-06-06 23:52:48 (GMT)
committerGuido van Rossum <guido@python.org>2007-06-06 23:52:48 (GMT)
commite7ba4956272a7105ea90dd505f70e5947aa27161 (patch)
treeb28c14ab345faf72d32ae96639f8e1d2629e1761 /Lib/test/test_tarfile.py
parent0e41148c4bdb3b1af157a9bf55df4bc27474f1e8 (diff)
downloadcpython-e7ba4956272a7105ea90dd505f70e5947aa27161.zip
cpython-e7ba4956272a7105ea90dd505f70e5947aa27161.tar.gz
cpython-e7ba4956272a7105ea90dd505f70e5947aa27161.tar.bz2
Merged revisions 55631-55794 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/p3yk ................ r55636 | neal.norwitz | 2007-05-29 00:06:39 -0700 (Tue, 29 May 2007) | 149 lines Merged revisions 55506-55635 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r55507 | georg.brandl | 2007-05-22 07:28:17 -0700 (Tue, 22 May 2007) | 2 lines Remove the "panel" module doc file which has been ignored since 1994. ........ r55522 | mark.hammond | 2007-05-22 19:04:28 -0700 (Tue, 22 May 2007) | 4 lines Remove definition of PY_UNICODE_TYPE from pyconfig.h, allowing the definition in unicodeobject.h to be used, giving us the desired wchar_t in place of 'unsigned short'. As discussed on python-dev. ........ r55525 | neal.norwitz | 2007-05-22 23:35:32 -0700 (Tue, 22 May 2007) | 6 lines Add -3 option to the interpreter to warn about features that are deprecated and will be changed/removed in Python 3.0. This patch is mostly from Anthony. I tweaked some format and added a little doc. ........ r55527 | neal.norwitz | 2007-05-22 23:57:35 -0700 (Tue, 22 May 2007) | 1 line Whitespace cleanup ........ r55528 | neal.norwitz | 2007-05-22 23:58:36 -0700 (Tue, 22 May 2007) | 1 line Add a bunch more deprecation warnings for builtins that are going away in 3.0 ........ r55549 | georg.brandl | 2007-05-24 09:49:29 -0700 (Thu, 24 May 2007) | 2 lines shlex.split() now has an optional "posix" parameter. ........ r55550 | georg.brandl | 2007-05-24 10:33:33 -0700 (Thu, 24 May 2007) | 2 lines Fix parameter passing. ........ r55555 | facundo.batista | 2007-05-24 10:50:54 -0700 (Thu, 24 May 2007) | 6 lines Added an optional timeout parameter to urllib.ftpwrapper, with tests (for this and a basic one, because there weren't any). Changed also NEWS, but didn't find documentation for this function, assumed it wasn't public... ........ r55563 | facundo.batista | 2007-05-24 13:01:59 -0700 (Thu, 24 May 2007) | 4 lines Removed the .recv() in the test, is not necessary, and was causing problems that didn't have anything to do with was actually being tested... ........ r55564 | facundo.batista | 2007-05-24 13:51:19 -0700 (Thu, 24 May 2007) | 5 lines Let's see if reading exactly what is written allow this live test to pass (now I know why there were so few tests in ftp, http, etc, :( ). ........ r55567 | facundo.batista | 2007-05-24 20:10:28 -0700 (Thu, 24 May 2007) | 4 lines Trying to make the tests work in Windows and Solaris, everywhere else just works ........ r55568 | facundo.batista | 2007-05-24 20:47:19 -0700 (Thu, 24 May 2007) | 4 lines Fixing stupid error, and introducing a sleep, to see if the other thread is awakened and finish sending data. ........ r55569 | facundo.batista | 2007-05-24 21:20:22 -0700 (Thu, 24 May 2007) | 4 lines Commenting out the tests until find out who can test them in one of the problematic enviroments. ........ r55570 | neal.norwitz | 2007-05-24 22:13:40 -0700 (Thu, 24 May 2007) | 2 lines Get test passing again by commenting out the reference to the test class. ........ r55575 | vinay.sajip | 2007-05-25 00:05:59 -0700 (Fri, 25 May 2007) | 1 line Updated docstring for SysLogHandler (#1720726). ........ r55576 | vinay.sajip | 2007-05-25 00:06:55 -0700 (Fri, 25 May 2007) | 1 line Updated documentation for SysLogHandler (#1720726). ........ r55592 | brett.cannon | 2007-05-25 13:17:15 -0700 (Fri, 25 May 2007) | 3 lines Remove direct call's to file's constructor and replace them with calls to open() as ths is considered best practice. ........ r55601 | kristjan.jonsson | 2007-05-26 12:19:50 -0700 (Sat, 26 May 2007) | 1 line Remove the rgbimgmodule from PCBuild8 ........ r55602 | kristjan.jonsson | 2007-05-26 12:31:39 -0700 (Sat, 26 May 2007) | 1 line Include <windows.h> after python.h, so that WINNT is properly set before windows.h is included. Fixes warnings in PC builds. ........ r55603 | walter.doerwald | 2007-05-26 14:04:13 -0700 (Sat, 26 May 2007) | 2 lines Fix typo. ........ r55604 | peter.astrand | 2007-05-26 15:18:20 -0700 (Sat, 26 May 2007) | 1 line Applied patch 1669481, slightly modified: Support close_fds on Win32 ........ r55606 | neal.norwitz | 2007-05-26 21:08:54 -0700 (Sat, 26 May 2007) | 2 lines Add the new function object attribute names from py3k. ........ r55617 | lars.gustaebel | 2007-05-27 12:49:30 -0700 (Sun, 27 May 2007) | 20 lines Added errors argument to TarFile class that allows the user to specify an error handling scheme for character conversion. Additional scheme "utf-8" in read mode. Unicode input filenames are now supported by design. The values of the pax_headers dictionary are now limited to unicode objects. Fixed: The prefix field is no longer used in PAX_FORMAT (in conformance with POSIX). Fixed: In read mode use a possible pax header size field. Fixed: Strip trailing slashes from pax header name values. Fixed: Give values in user-specified pax_headers precedence when writing. Added unicode tests. Added pax/regtype4 member to testtar.tar all possible number fields in a pax header. Added two chapters to the documentation about the different formats tarfile.py supports and how unicode issues are handled. ........ r55618 | raymond.hettinger | 2007-05-27 22:23:22 -0700 (Sun, 27 May 2007) | 1 line Explain when groupby() issues a new group. ........ r55634 | martin.v.loewis | 2007-05-28 21:01:29 -0700 (Mon, 28 May 2007) | 2 lines Test pre-commit hook for a link to a .py file. ........ r55635 | martin.v.loewis | 2007-05-28 21:02:03 -0700 (Mon, 28 May 2007) | 2 lines Revert 55634. ........ ................ r55639 | neal.norwitz | 2007-05-29 00:58:11 -0700 (Tue, 29 May 2007) | 1 line Remove sys.exc_{type,exc_value,exc_traceback} ................ r55641 | neal.norwitz | 2007-05-29 01:03:50 -0700 (Tue, 29 May 2007) | 1 line Missed one sys.exc_type. I wonder why exc_{value,traceback} were already gone ................ r55642 | neal.norwitz | 2007-05-29 01:08:33 -0700 (Tue, 29 May 2007) | 1 line Missed more doc for sys.exc_* attrs. ................ r55643 | neal.norwitz | 2007-05-29 01:18:19 -0700 (Tue, 29 May 2007) | 1 line Remove sys.exc_clear() ................ r55665 | guido.van.rossum | 2007-05-29 19:45:43 -0700 (Tue, 29 May 2007) | 4 lines Make None, True, False keywords. We can now also delete all the other places that explicitly forbid assignment to None, but I'm not going to bother right now. ................ r55666 | guido.van.rossum | 2007-05-29 20:01:51 -0700 (Tue, 29 May 2007) | 3 lines Found another place that needs check for forbidden names. Fixed test_syntax.py accordingly (it helped me find that one). ................ r55668 | guido.van.rossum | 2007-05-29 20:41:48 -0700 (Tue, 29 May 2007) | 2 lines Mark None, True, False as keywords. ................ r55673 | neal.norwitz | 2007-05-29 23:28:25 -0700 (Tue, 29 May 2007) | 3 lines Get the dis module working on modules again after changing dicts to not return lists and also new-style classes. Add a test. ................ r55674 | neal.norwitz | 2007-05-29 23:35:45 -0700 (Tue, 29 May 2007) | 1 line Umm, it helps to add the module that the test uses ................ r55675 | neal.norwitz | 2007-05-29 23:53:05 -0700 (Tue, 29 May 2007) | 4 lines Try to fix up all the other places that were assigning to True/False. There's at least one more problem in test.test_xmlrpc. I have other changes in that file and that should be fixed soon (I hope). ................ r55679 | neal.norwitz | 2007-05-30 00:31:55 -0700 (Wed, 30 May 2007) | 1 line Fix up another place that was assigning to True/False. ................ r55688 | brett.cannon | 2007-05-30 14:19:47 -0700 (Wed, 30 May 2007) | 2 lines Ditch MimeWriter. ................ r55692 | brett.cannon | 2007-05-30 14:52:00 -0700 (Wed, 30 May 2007) | 2 lines Remove the mimify module. ................ r55707 | guido.van.rossum | 2007-05-31 05:08:45 -0700 (Thu, 31 May 2007) | 2 lines Backport the addition of show_code() to dis.py -- it's too handy. ................ r55708 | guido.van.rossum | 2007-05-31 06:22:57 -0700 (Thu, 31 May 2007) | 7 lines Fix a fairly long-standing bug in the check for assignment to None (and other keywords, these days). In 2.5, you could write foo(None=1) without getting a SyntaxError (although foo()'s definition would have to use **kwds to avoid getting a runtime error complaining about an unknown keyword of course). This ought to be backported to 2.5.2 or at least 2.6. ................ r55724 | brett.cannon | 2007-05-31 19:32:41 -0700 (Thu, 31 May 2007) | 2 lines Remove the cfmfile. ................ r55727 | neal.norwitz | 2007-05-31 22:19:44 -0700 (Thu, 31 May 2007) | 1 line Remove reload() builtin. ................ r55729 | neal.norwitz | 2007-05-31 22:51:30 -0700 (Thu, 31 May 2007) | 59 lines Merged revisions 55636-55728 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r55637 | georg.brandl | 2007-05-29 00:16:47 -0700 (Tue, 29 May 2007) | 2 lines Fix rst markup. ........ r55638 | neal.norwitz | 2007-05-29 00:51:39 -0700 (Tue, 29 May 2007) | 1 line Fix typo in doc ........ r55671 | neal.norwitz | 2007-05-29 21:53:41 -0700 (Tue, 29 May 2007) | 1 line Fix indentation (whitespace only). ........ r55676 | thomas.heller | 2007-05-29 23:58:30 -0700 (Tue, 29 May 2007) | 1 line Fix compiler warnings. ........ r55677 | thomas.heller | 2007-05-30 00:01:25 -0700 (Wed, 30 May 2007) | 2 lines Correct the name of a field in the WIN32_FIND_DATAA and WIN32_FIND_DATAW structures. Closes bug #1726026. ........ r55686 | brett.cannon | 2007-05-30 13:46:26 -0700 (Wed, 30 May 2007) | 2 lines Have MimeWriter raise a DeprecationWarning as per PEP 4 and its documentation. ........ r55690 | brett.cannon | 2007-05-30 14:48:58 -0700 (Wed, 30 May 2007) | 3 lines Have mimify raise a DeprecationWarning. The docs and PEP 4 have listed the module as deprecated for a while. ........ r55696 | brett.cannon | 2007-05-30 15:24:28 -0700 (Wed, 30 May 2007) | 2 lines Have md5 raise a DeprecationWarning as per PEP 4. ........ r55705 | neal.norwitz | 2007-05-30 21:14:22 -0700 (Wed, 30 May 2007) | 1 line Add some spaces in the example code. ........ r55716 | brett.cannon | 2007-05-31 12:20:00 -0700 (Thu, 31 May 2007) | 2 lines Have the sha module raise a DeprecationWarning as specified in PEP 4. ........ r55719 | brett.cannon | 2007-05-31 12:40:42 -0700 (Thu, 31 May 2007) | 2 lines Cause buildtools to raise a DeprecationWarning. ........ r55721 | brett.cannon | 2007-05-31 13:01:11 -0700 (Thu, 31 May 2007) | 2 lines Have cfmfile raise a DeprecationWarning as per PEP 4. ........ r55726 | neal.norwitz | 2007-05-31 21:56:47 -0700 (Thu, 31 May 2007) | 1 line Mail if there is an installation failure. ........ ................ r55730 | neal.norwitz | 2007-05-31 23:22:07 -0700 (Thu, 31 May 2007) | 2 lines Remove the code that was missed in rev 55303. ................ r55738 | neal.norwitz | 2007-06-01 19:10:43 -0700 (Fri, 01 Jun 2007) | 1 line Fix doc breakage ................ r55741 | neal.norwitz | 2007-06-02 00:41:58 -0700 (Sat, 02 Jun 2007) | 1 line Remove timing module (plus some remnants of other modules). ................ r55742 | neal.norwitz | 2007-06-02 00:51:44 -0700 (Sat, 02 Jun 2007) | 1 line Remove posixfile module (plus some remnants of other modules). ................ r55744 | neal.norwitz | 2007-06-02 10:18:56 -0700 (Sat, 02 Jun 2007) | 1 line Fix doc breakage. ................ r55745 | neal.norwitz | 2007-06-02 11:32:16 -0700 (Sat, 02 Jun 2007) | 1 line Make a whatsnew 3.0 template. ................ r55754 | neal.norwitz | 2007-06-03 23:24:18 -0700 (Sun, 03 Jun 2007) | 1 line SF #1730441, os._execvpe raises UnboundLocal due to new try/except semantics ................ r55755 | neal.norwitz | 2007-06-03 23:26:00 -0700 (Sun, 03 Jun 2007) | 1 line Get rid of extra whitespace ................ r55794 | guido.van.rossum | 2007-06-06 15:29:22 -0700 (Wed, 06 Jun 2007) | 3 lines Make this compile in GCC 2.96, which does not allow interspersing declarations and code. ................
Diffstat (limited to 'Lib/test/test_tarfile.py')
-rw-r--r--Lib/test/test_tarfile.py245
1 files changed, 180 insertions, 65 deletions
diff --git a/Lib/test/test_tarfile.py b/Lib/test/test_tarfile.py
index 312050b..636a45e 100644
--- a/Lib/test/test_tarfile.py
+++ b/Lib/test/test_tarfile.py
@@ -1,4 +1,4 @@
-# encoding: iso8859-1
+# -*- coding: iso-8859-15 -*-
import sys
import os
@@ -372,9 +372,9 @@ class LongnameTest(ReadTest):
def test_read_longname(self):
# Test reading of longname (bug #1471427).
- name = self.subdir + "/" + "123/" * 125 + "longname"
+ longname = self.subdir + "/" + "123/" * 125 + "longname"
try:
- tarinfo = self.tar.getmember(name)
+ tarinfo = self.tar.getmember(longname)
except KeyError:
self.fail("longname not found")
self.assert_(tarinfo.type != tarfile.DIRTYPE, "read longname as dirtype")
@@ -393,13 +393,24 @@ class LongnameTest(ReadTest):
tarinfo = self.tar.getmember(longname)
offset = tarinfo.offset
self.tar.fileobj.seek(offset)
- fobj = StringIO.StringIO(self.tar.fileobj.read(1536))
+ fobj = StringIO.StringIO(self.tar.fileobj.read(3 * 512))
self.assertRaises(tarfile.ReadError, tarfile.open, name="foo.tar", fileobj=fobj)
+ def test_header_offset(self):
+ # Test if the start offset of the TarInfo object includes
+ # the preceding extended header.
+ longname = self.subdir + "/" + "123/" * 125 + "longname"
+ offset = self.tar.getmember(longname).offset
+ fobj = open(tarname)
+ fobj.seek(offset)
+ tarinfo = tarfile.TarInfo.frombuf(fobj.read(512))
+ self.assertEqual(tarinfo.type, self.longnametype)
+
class GNUReadTest(LongnameTest):
subdir = "gnu"
+ longnametype = tarfile.GNUTYPE_LONGNAME
def test_sparse_file(self):
tarinfo1 = self.tar.getmember("ustar/sparse")
@@ -410,26 +421,40 @@ class GNUReadTest(LongnameTest):
"sparse file extraction failed")
-class PaxReadTest(ReadTest):
+class PaxReadTest(LongnameTest):
subdir = "pax"
+ longnametype = tarfile.XHDTYPE
- def test_pax_globheaders(self):
+ def test_pax_global_headers(self):
tar = tarfile.open(tarname, encoding="iso8859-1")
+
tarinfo = tar.getmember("pax/regtype1")
self.assertEqual(tarinfo.uname, "foo")
self.assertEqual(tarinfo.gname, "bar")
- self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), "ÄÖÜäöüß")
+ self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), u"ÄÖÜäöüß")
tarinfo = tar.getmember("pax/regtype2")
self.assertEqual(tarinfo.uname, "")
self.assertEqual(tarinfo.gname, "bar")
- self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), "ÄÖÜäöüß")
+ self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), u"ÄÖÜäöüß")
tarinfo = tar.getmember("pax/regtype3")
self.assertEqual(tarinfo.uname, "tarfile")
self.assertEqual(tarinfo.gname, "tarfile")
- self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), "ÄÖÜäöüß")
+ self.assertEqual(tarinfo.pax_headers.get("VENDOR.umlauts"), u"ÄÖÜäöüß")
+
+ def test_pax_number_fields(self):
+ # All following number fields are read from the pax header.
+ tar = tarfile.open(tarname, encoding="iso8859-1")
+ tarinfo = tar.getmember("pax/regtype4")
+ self.assertEqual(tarinfo.size, 7011)
+ self.assertEqual(tarinfo.uid, 123)
+ self.assertEqual(tarinfo.gid, 123)
+ self.assertEqual(tarinfo.mtime, 1041808783.0)
+ self.assertEqual(type(tarinfo.mtime), float)
+ self.assertEqual(float(tarinfo.pax_headers["atime"]), 1041808783.0)
+ self.assertEqual(float(tarinfo.pax_headers["ctime"]), 1041808783.0)
class WriteTest(unittest.TestCase):
@@ -700,68 +725,160 @@ class PaxWriteTest(GNUWriteTest):
n = tar.getmembers()[0].name
self.assert_(name == n, "PAX longname creation failed")
- def test_iso8859_15_filename(self):
- self._test_unicode_filename("iso8859-15")
+ def test_pax_global_header(self):
+ pax_headers = {
+ u"foo": u"bar",
+ u"uid": u"0",
+ u"mtime": u"1.23",
+ u"test": u"äöü",
+ u"äöü": u"test"}
+
+ tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT, \
+ pax_headers=pax_headers)
+ tar.addfile(tarfile.TarInfo("test"))
+ tar.close()
+
+ # Test if the global header was written correctly.
+ tar = tarfile.open(tmpname, encoding="iso8859-1")
+ self.assertEqual(tar.pax_headers, pax_headers)
+ self.assertEqual(tar.getmembers()[0].pax_headers, pax_headers)
+
+ # Test if all the fields are unicode.
+ for key, val in tar.pax_headers.items():
+ self.assert_(type(key) is unicode)
+ self.assert_(type(val) is unicode)
+ if key in tarfile.PAX_NUMBER_FIELDS:
+ try:
+ tarfile.PAX_NUMBER_FIELDS[key](val)
+ except (TypeError, ValueError):
+ self.fail("unable to convert pax header field")
+
+ def test_pax_extended_header(self):
+ # The fields from the pax header have priority over the
+ # TarInfo.
+ pax_headers = {u"path": u"foo", u"uid": u"123"}
+
+ tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT, encoding="iso8859-1")
+ t = tarfile.TarInfo()
+ t.name = u"äöü" # non-ASCII
+ t.uid = 8**8 # too large
+ t.pax_headers = pax_headers
+ tar.addfile(t)
+ tar.close()
+
+ tar = tarfile.open(tmpname, encoding="iso8859-1")
+ t = tar.getmembers()[0]
+ self.assertEqual(t.pax_headers, pax_headers)
+ self.assertEqual(t.name, "foo")
+ self.assertEqual(t.uid, 123)
+
+
+class UstarUnicodeTest(unittest.TestCase):
+ # All *UnicodeTests FIXME
+
+ format = tarfile.USTAR_FORMAT
+
+ def test_iso8859_1_filename(self):
+ self._test_unicode_filename("iso8859-1")
+
+ def test_utf7_filename(self):
+ self._test_unicode_filename("utf7")
def test_utf8_filename(self):
self._test_unicode_filename("utf8")
- def test_utf16_filename(self):
- self._test_unicode_filename("utf16")
-
def _test_unicode_filename(self, encoding):
- tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT)
- name = "\u20ac".encode(encoding) # Euro sign
- tar.encoding = encoding
+ tar = tarfile.open(tmpname, "w", format=self.format, encoding=encoding, errors="strict")
+ name = "äöü"
tar.addfile(tarfile.TarInfo(name))
tar.close()
tar = tarfile.open(tmpname, encoding=encoding)
- self.assertEqual(tar.getmembers()[0].name, name)
+ self.assert_(type(tar.getnames()[0]) is not unicode)
+ self.assertEqual(tar.getmembers()[0].name, name.encode(encoding))
tar.close()
def test_unicode_filename_error(self):
- # The euro sign filename cannot be translated to iso8859-1 encoding.
- tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT, encoding="utf8")
- name = "\u20ac".encode("utf8") # Euro sign
- tar.addfile(tarfile.TarInfo(name))
+ tar = tarfile.open(tmpname, "w", format=self.format, encoding="ascii", errors="strict")
+ tarinfo = tarfile.TarInfo()
+
+ tarinfo.name = "äöü"
+ if self.format == tarfile.PAX_FORMAT:
+ self.assertRaises(UnicodeError, tar.addfile, tarinfo)
+ else:
+ tar.addfile(tarinfo)
+
+ tarinfo.name = u"äöü"
+ self.assertRaises(UnicodeError, tar.addfile, tarinfo)
+
+ tarinfo.name = "foo"
+ tarinfo.uname = u"äöü"
+ self.assertRaises(UnicodeError, tar.addfile, tarinfo)
+
+ def test_unicode_argument(self):
+ tar = tarfile.open(tarname, "r", encoding="iso8859-1", errors="strict")
+ for t in tar:
+ self.assert_(type(t.name) is str)
+ self.assert_(type(t.linkname) is str)
+ self.assert_(type(t.uname) is str)
+ self.assert_(type(t.gname) is str)
tar.close()
- self.assertRaises(UnicodeError, tarfile.open, tmpname, encoding="iso8859-1")
+ def test_uname_unicode(self):
+ for name in (u"äöü", "äöü"):
+ t = tarfile.TarInfo("foo")
+ t.uname = name
+ t.gname = name
- def test_pax_headers(self):
- self._test_pax_headers({"foo": "bar", "uid": 0, "mtime": 1.23})
+ fobj = StringIO.StringIO()
+ tar = tarfile.open("foo.tar", mode="w", fileobj=fobj, format=self.format, encoding="iso8859-1")
+ tar.addfile(t)
+ tar.close()
+ fobj.seek(0)
- self._test_pax_headers({"euro": "\u20ac".encode("utf8")})
+ tar = tarfile.open("foo.tar", fileobj=fobj, encoding="iso8859-1")
+ t = tar.getmember("foo")
+ self.assertEqual(t.uname, "äöü")
+ self.assertEqual(t.gname, "äöü")
- self._test_pax_headers({"euro": "\u20ac"},
- {"euro": "\u20ac".encode("utf8")})
+class GNUUnicodeTest(UstarUnicodeTest):
- self._test_pax_headers({"\u20ac": "euro"},
- {"\u20ac".encode("utf8"): "euro"})
+ format = tarfile.GNU_FORMAT
- def _test_pax_headers(self, pax_headers, cmp_headers=None):
- if cmp_headers is None:
- cmp_headers = pax_headers
- tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT, \
- pax_headers=pax_headers, encoding="utf8")
- tar.addfile(tarfile.TarInfo("test"))
- tar.close()
+class PaxUnicodeTest(UstarUnicodeTest):
- tar = tarfile.open(tmpname, encoding="utf8")
- self.assertEqual(tar.pax_headers, cmp_headers)
+ format = tarfile.PAX_FORMAT
- def test_truncated_header(self):
- tar = tarfile.open(tmpname, "w", format=tarfile.PAX_FORMAT)
- tarinfo = tarfile.TarInfo("123/" * 126 + "longname")
- tar.addfile(tarinfo)
+ def _create_unicode_name(self, name):
+ tar = tarfile.open(tmpname, "w", format=self.format)
+ t = tarfile.TarInfo()
+ t.pax_headers["path"] = name
+ tar.addfile(t)
tar.close()
- # Simulate a premature EOF.
- open(tmpname, "rb+").truncate(1536)
- tar = tarfile.open(tmpname)
- self.assertEqual(tar.getmembers(), [])
+ def test_error_handlers(self):
+ # Test if the unicode error handlers work correctly for characters
+ # that cannot be expressed in a given encoding.
+ self._create_unicode_name(u"äöü")
+
+ for handler, name in (("utf-8", u"äöü".encode("utf8")),
+ ("replace", "???"), ("ignore", "")):
+ tar = tarfile.open(tmpname, format=self.format, encoding="ascii",
+ errors=handler)
+ self.assertEqual(tar.getnames()[0], name)
+
+ self.assertRaises(UnicodeError, tarfile.open, tmpname,
+ encoding="ascii", errors="strict")
+
+ def test_error_handler_utf8(self):
+ # Create a pathname that has one component representable using
+ # iso8859-1 and the other only in iso8859-15.
+ self._create_unicode_name(u"äöü/¤")
+
+ tar = tarfile.open(tmpname, format=self.format, encoding="iso8859-1",
+ errors="utf-8")
+ self.assertEqual(tar.getnames()[0], "äöü/" + u"¤".encode("utf8"))
class AppendTest(unittest.TestCase):
@@ -836,63 +953,58 @@ class LimitsTest(unittest.TestCase):
def test_ustar_limits(self):
# 100 char name
tarinfo = tarfile.TarInfo("0123456789" * 10)
- tarinfo.create_ustar_header()
+ tarinfo.tobuf(tarfile.USTAR_FORMAT)
# 101 char name that cannot be stored
tarinfo = tarfile.TarInfo("0123456789" * 10 + "0")
- self.assertRaises(ValueError, tarinfo.create_ustar_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.USTAR_FORMAT)
# 256 char name with a slash at pos 156
tarinfo = tarfile.TarInfo("123/" * 62 + "longname")
- tarinfo.create_ustar_header()
+ tarinfo.tobuf(tarfile.USTAR_FORMAT)
# 256 char name that cannot be stored
tarinfo = tarfile.TarInfo("1234567/" * 31 + "longname")
- self.assertRaises(ValueError, tarinfo.create_ustar_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.USTAR_FORMAT)
# 512 char name
tarinfo = tarfile.TarInfo("123/" * 126 + "longname")
- self.assertRaises(ValueError, tarinfo.create_ustar_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.USTAR_FORMAT)
# 512 char linkname
tarinfo = tarfile.TarInfo("longlink")
tarinfo.linkname = "123/" * 126 + "longname"
- self.assertRaises(ValueError, tarinfo.create_ustar_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.USTAR_FORMAT)
# uid > 8 digits
tarinfo = tarfile.TarInfo("name")
tarinfo.uid = 010000000
- self.assertRaises(ValueError, tarinfo.create_ustar_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.USTAR_FORMAT)
def test_gnu_limits(self):
tarinfo = tarfile.TarInfo("123/" * 126 + "longname")
- tarinfo.create_gnu_header()
+ tarinfo.tobuf(tarfile.GNU_FORMAT)
tarinfo = tarfile.TarInfo("longlink")
tarinfo.linkname = "123/" * 126 + "longname"
- tarinfo.create_gnu_header()
+ tarinfo.tobuf(tarfile.GNU_FORMAT)
# uid >= 256 ** 7
tarinfo = tarfile.TarInfo("name")
tarinfo.uid = 04000000000000000000
- self.assertRaises(ValueError, tarinfo.create_gnu_header)
+ self.assertRaises(ValueError, tarinfo.tobuf, tarfile.GNU_FORMAT)
def test_pax_limits(self):
- # A 256 char name that can be stored without an extended header.
- tarinfo = tarfile.TarInfo("123/" * 62 + "longname")
- self.assert_(len(tarinfo.create_pax_header("utf8")) == 512,
- "create_pax_header attached superfluous extended header")
-
tarinfo = tarfile.TarInfo("123/" * 126 + "longname")
- tarinfo.create_pax_header("utf8")
+ tarinfo.tobuf(tarfile.PAX_FORMAT)
tarinfo = tarfile.TarInfo("longlink")
tarinfo.linkname = "123/" * 126 + "longname"
- tarinfo.create_pax_header("utf8")
+ tarinfo.tobuf(tarfile.PAX_FORMAT)
tarinfo = tarfile.TarInfo("name")
tarinfo.uid = 04000000000000000000
- tarinfo.create_pax_header("utf8")
+ tarinfo.tobuf(tarfile.PAX_FORMAT)
class GzipMiscReadTest(MiscReadTest):
@@ -940,6 +1052,9 @@ def test_main():
StreamWriteTest,
GNUWriteTest,
PaxWriteTest,
+ UstarUnicodeTest,
+ GNUUnicodeTest,
+ PaxUnicodeTest,
AppendTest,
LimitsTest,
]