Issue2176
Created on 2014-07-01.03:57:52 by zyasoft, last changed 2014-07-09.23:59:46 by zyasoft.
msg8855 (view) |
Author: Jim Baker (zyasoft) |
Date: 2014-07-01.03:57:51 |
|
It's trivial to reproduce this bug. Download the archive:
    wget https://pypi.python.org/packages/source/P/PrettyTable/prettytable-0.7.2.tar.bz2#md5=760dc900590ac3c46736167e09fa463a
Then run:
    import tarfile
    with tarfile.open(name='prettytable-0.7.2.tar.bz2', mode='r') as t:
        t.extractall()
This will reliably produce output files with chunks of data missing, as can readily be verified with diff -r against a known-good copy.
Note that using the bz2 module to decompress the archive first produces a valid tar, so there's likely some interaction between streaming decompression and the writing of a given block to the untarred directory.
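The comparison described here (decompress everything with bz2 first versus letting tarfile stream the decompression) can be sketched roughly as follows. This is only an illustration under some assumptions: it builds a small throwaway archive in memory instead of downloading the PrettyTable tarball, it reads members with extractfile() rather than extractall(), and the names read_members, make_sample_archive and sample.bin are made up for the example.

```python
import bz2
import io
import tarfile


def read_members(fileobj, mode):
    """Return {member name: file bytes} for every regular file in the archive."""
    contents = {}
    with tarfile.open(fileobj=fileobj, mode=mode) as t:
        for info in t.getmembers():
            if info.isfile():
                contents[info.name] = t.extractfile(info).read()
    return contents


def make_sample_archive(payload):
    """Build a one-file .tar.bz2 in memory (hypothetical test fixture)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w:bz2') as t:
        info = tarfile.TarInfo(name='sample.bin')
        info.size = len(payload)
        t.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


# Payload larger than one 16384-byte copy buffer, so several reads are needed.
payload = bytes(bytearray(range(256))) * 200
archive = make_sample_archive(payload)

# Path 1: decompress the whole archive up front, then untar the plain tar.
expected = read_members(io.BytesIO(bz2.decompress(archive)), 'r:')

# Path 2: let tarfile drive the bz2 decompression itself.
streamed = read_members(io.BytesIO(archive), 'r:bz2')

if expected != streamed:
    print('streamed extraction differs from the decompress-first extraction')
```

On an implementation without the bug the two dictionaries should be equal; a mismatch would point at the streaming path rather than at the archive itself.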
|
msg8862 (view) |
Author: Jim Baker (zyasoft) |
Date: 2014-07-01.22:46:14 |
|
bz2.BZ2File.read would not read all of the bytes requested. In particular, the tarfile module by default would request 16384 bytes at a time, but the read would return no more than 8192 bytes. The tarfile module would then seek forward as though all 16384 bytes had been read, thereby skipping chunks of files.
Fixed as of http://hg.python.org/jython/rev/91b39451dc89
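To illustrate the short-read behavior described in this message, here is a minimal sketch. It is not the change made in the linked revision; ShortReader and read_fully are hypothetical names, and the 8192-byte cap simply mimics the numbers quoted above. It shows why a caller that asks for 16384 bytes and trusts a single read() loses data, and how looping until the request is satisfied avoids that.

```python
import io


class ShortReader(object):
    """Hypothetical stream whose read() returns at most `cap` bytes per call."""

    def __init__(self, data, cap=8192):
        self._buf = io.BytesIO(data)
        self._cap = cap

    def read(self, size):
        # Deliberately return fewer bytes than requested when size > cap.
        return self._buf.read(min(size, self._cap))


def read_fully(stream, size):
    """Call read() repeatedly until `size` bytes arrive or the stream ends."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break  # end of stream: return whatever was collected
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


data = bytes(bytearray(range(256))) * 128  # 32768 bytes of sample data

# A single read() comes up short: only 8192 of the 16384 requested bytes.
single = ShortReader(data).read(16384)

# Looping collects the full 16384 bytes.
looped = read_fully(ShortReader(data), 16384)

print(len(single), len(looped))
```

A caller that assumed `single` held 16384 bytes and advanced its position accordingly would skip the second 8192 bytes of every block, which matches the missing chunks reported in this issue.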
|
|
Date | User | Action | Args |
2014-07-09 23:59:46 | zyasoft | set | status: pending -> closed |
2014-07-01 22:46:15 | zyasoft | set | status: open -> pending, resolution: fixed, messages: + msg8862 |
2014-07-01 03:59:15 | zyasoft | set | title: bz2 compressed tarfile can see a corrupted read -> Extracting from bz2 compressed tarfile may result in missing chunks of files |
2014-07-01 03:57:52 | zyasoft | create | |
|