Issue #11224: Improved sparse file read support (r85916) introduced a
regression in _FileInFile which is used in file-like objects returned
by TarFile.extractfile(). The inefficient design of the
_FileInFile.read() method causes various dramatic side-effects and
errors:

  - The data segment of a file member is read completely into memory
    every(!) time a small block is accessed. This is not only slow
    but may cause unexpected MemoryErrors with very large files.
  - Reading members from compressed tar archives is even slower
    because of the excessive backwards seeking which is done when the
    same data segment is read over and over again.
  - As a backwards seek on a TarFile opened in stream mode is not
    possible, using extractfile() fails with a StreamError.
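
For context, the access pattern the message describes (reading a member in small blocks through extractfile() on an archive opened in stream mode) might look like the sketch below. The helper names build_archive and read_member_in_blocks, the member name, and the block size are hypothetical and chosen only for illustration; only the tarfile and io calls come from the standard library.

```python
import io
import tarfile


def build_archive(payload: bytes, name: str = "member.bin") -> bytes:
    """Create an uncompressed tar archive in memory holding one member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_member_in_blocks(archive: bytes, block_size: int = 512) -> bytes:
    """Read every regular member block by block from a stream-mode TarFile."""
    chunks = []
    # Mode "r|" opens the archive as a stream, so seeking backwards
    # in the underlying file object is not possible.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        for member in tar:
            f = tar.extractfile(member)
            if f is None:  # not a regular file or link
                continue
            while True:
                block = f.read(block_size)
                if not block:
                    break
                chunks.append(block)
    return b"".join(chunks)
```

On an interpreter that includes this fix, the loop should return the member's payload unchanged; per the message above, the regressed code would have failed here with a StreamError.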
author: Lars Gustäbel <lars@gustaebel.de> 2011-02-23 11:42:22 (GMT)
committer: Lars Gustäbel <lars@gustaebel.de> 2011-02-23 11:42:22 (GMT)
commit: dd071045e776e1c3e8cf6750a2fd1d0958bf19b3 (patch)
tree: 3afb00727522ffb897602ec1ae5d2a9ccfd3dce4 /Lib/tarfile.py
parent: 3eeee833915b96a15c60eafc317bb6822af2084c (diff)
download: cpython-dd071045e776e1c3e8cf6750a2fd1d0958bf19b3.zip
cpython-dd071045e776e1c3e8cf6750a2fd1d0958bf19b3.tar.gz
cpython-dd071045e776e1c3e8cf6750a2fd1d0958bf19b3.tar.bz2
1 files changed, 2 insertions, 3 deletions
diff --git a/Lib/tarfile.py b/Lib/tarfile.py
index e3747e9..0f9d1da 100644
--- a/Lib/tarfile.py
+++ b/Lib/tarfile.py
@@ -760,9 +760,8 @@ class _FileInFile(object):
                         self.map_index = 0
             length = min(size, stop - self.position)
             if data:
-                self.fileobj.seek(offset)
-                block = self.fileobj.read(stop - start)
-                buf += block[self.position - start:self.position + length]
+                self.fileobj.seek(offset + (self.position - start))
+                buf += self.fileobj.read(length)
             else:
                 buf += NUL * length
             size -= length
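
The effect of the two changed lines can be illustrated outside of tarfile with a simplified model. This is only a sketch: CountingFile, read_before, read_after and compare are hypothetical names, the model covers a single data segment whose start is 0, and it ignores the sparse map handling of the real _FileInFile.read().

```python
import io


class CountingFile:
    """Hypothetical wrapper that counts the bytes returned by read()."""

    def __init__(self, data: bytes):
        self._f = io.BytesIO(data)
        self.bytes_read = 0

    def seek(self, pos: int) -> int:
        return self._f.seek(pos)

    def read(self, n: int) -> bytes:
        data = self._f.read(n)
        self.bytes_read += len(data)
        return data


def read_before(fileobj, offset, start, stop, position, length):
    # Mirrors the removed lines: load the whole segment, then slice it.
    fileobj.seek(offset)
    block = fileobj.read(stop - start)
    return block[position - start:position + length]


def read_after(fileobj, offset, start, stop, position, length):
    # Mirrors the added lines: seek to the block and read only that block.
    fileobj.seek(offset + (position - start))
    return fileobj.read(length)


def compare(segment_size: int = 65536, block_size: int = 512):
    """Read one segment block by block with both strategies."""
    offset = 512  # assumed size of the header preceding the data
    segment = bytes(i % 251 for i in range(segment_size))
    raw = b"H" * offset + segment
    totals = []
    outputs = []
    for reader in (read_before, read_after):
        f = CountingFile(raw)
        out = b""
        position = 0
        while position < segment_size:
            length = min(block_size, segment_size - position)
            out += reader(f, offset, 0, segment_size, position, length)
            position += length
        totals.append(f.bytes_read)
        outputs.append(out)
    same = outputs[0] == outputs[1] == segment
    return totals[0], totals[1], same
```

Under these assumptions both strategies return identical data, but the first one pulls the full segment from the underlying file once per block, while the second reads each byte of the segment once and only ever seeks forward during a sequential read.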