Issue1445

classification
Title: Module bz2 is missing
Type: behaviour Severity: major
Components: Library Versions: 2.5.0
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, ArneBab, alex.gronholm, crankycoder, fwierzbicki, ita, jupake, pjdm, pjenvey, r_walter, spiritmech
Priority: low Keywords:

Created on 2009-08-21.12:50:07 by pjdm, last changed 2013-01-26.03:31:59 by alex.gronholm.

Files
File name Uploaded Description
bz2.tar.bz2 pjenvey, 2010-02-02.19:55:41
jythonx-bits.tar.bz2 pjenvey, 2010-02-02.19:56:14
bz2_v2.tar.bz2 pjenvey, 2010-03-09.05:07:35
Messages
msg5061 (view) Author: Peter Mayne (pjdm) Date: 2009-08-21.12:50:07
Jython does not include the bz2 module.
msg5170 (view) Author: Steve Lewis (spiritmech) Date: 2009-09-15.00:25:55
I'm looking into this one. I'll update this ticket if for some reason I
get in too deep, but I'm trying to follow the
PortingPythonModulesToJython wiki page.
msg5189 (view) Author: Steve Lewis (spiritmech) Date: 2009-09-24.00:53:21
Hey Peter,

I'm just curious what need you had for bz2. Is another library using it?
The reason I ask is because Philip Jenvey mentioned that the hard part
is handling streaming, which the Java library from Apache doesn't support.

I'm not sure if the current test cases test for streaming or not, I'm
going to check into it. If they don't, maybe your need for bz2 can help
me think of a use case beyond "stream random data into/out of bz2".

Although if I have to create those use cases I can do that, too.
msg5193 (view) Author: Peter Mayne (pjdm) Date: 2009-09-24.13:12:16
Mercurial uses bz2 for bundles and archives. See "hg help bundle" and
"hg help archive".
msg5482 (view) Author: ita (ita) Date: 2010-02-02.14:40:08
The module bz2 is still missing, and opening tar.bz2 files fails with:
"file could not be opened successfully"
msg5484 (view) Author: ita (ita) Date: 2010-02-02.19:38:46
It seems there are classes for bzip2 support in Ant. Would it be possible to import them into the Jython tree?

http://www.kohsuke.org/bzip2/
msg5485 (view) Author: Philip Jenvey (pjenvey) Date: 2010-02-02.19:53:14
Here's a jython-dev thread with a little more detail

http://markmail.org/message/fwpt4nlvgiskgfur

Basically we can implement bz2 with the Apache library but it does not support streaming of decompression (which Python's bz2.BZ2Decompressor does). It should support encoding via a stream however (bz2.BZ2Compressor).

This would be better than nothing, as I think most uses of the bz2 module are via bz2.BZ2File, which doesn't stream.

Another alternative: a former Jython contributor tried implementing a port of the libbz2 C code to Java for his Jythonx project. I don't know what state this code is in, though.
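
To illustrate the streaming interfaces discussed in this message, here is a minimal sketch against CPython's bz2 module. It shows the incremental behaviour of bz2.BZ2Compressor and bz2.BZ2Decompressor that a Jython port would need to reproduce. The helper names compress_stream and decompress_stream are made up for this example and are not part of any attached code.

```python
import bz2


def compress_stream(chunks):
    """Compress an iterable of byte chunks incrementally."""
    compressor = bz2.BZ2Compressor()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:  # the compressor may buffer and return nothing yet
            yield out
    yield compressor.flush()  # emit whatever is still buffered


def decompress_stream(chunks):
    """Decompress an iterable of compressed byte chunks incrementally."""
    decompressor = bz2.BZ2Decompressor()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out


original = [b"hello ", b"streaming ", b"bz2 " * 1000]
compressed = b"".join(compress_stream(original))

# Feed the compressed data back in small pieces, as a network or
# pipe reader would, to exercise the streaming decompressor.
pieces = [compressed[i:i + 16] for i in range(0, len(compressed), 16)]
restored = b"".join(decompress_stream(pieces))
```

The decompressor has to accept input cut at arbitrary byte boundaries, which is the part described above as hard to do with the Apache library.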
msg5486 (view) Author: Philip Jenvey (pjenvey) Date: 2010-02-02.19:55:41
Furthermore here's my attempt at writing a bz2 module with the Apache lib from a couple years ago. I think I updated it slightly to work with the Apache commons bz2 compressor, which is pretty much the same as http://www.kohsuke.org/bzip2/ under a different name

It'd be a good starting point
msg5487 (view) Author: Philip Jenvey (pjenvey) Date: 2010-02-02.19:56:14
Attaching the other jythonx bz2 implementation
msg5519 (view) Author: ita (ita) Date: 2010-02-11.18:23:19
Please have a look at the following patch, which enables jython to read tar.bz2 files:
http://freehackers.org/~tnagy/bz2.diff

* is it permitted to import the apache classes like this?
* is a PyFile wrapper ok, or is it preferable to use a class (in a python file or in a java file?)
* are there gotchas with using PyFile (memory leaks, file remaining open on win32, ...) ?

Note: this bug tracker does not allow attachments. It would be fine if the error message did not eat the message (not nice!)
msg5563 (view) Author: Philip Jenvey (pjenvey) Date: 2010-03-07.23:31:19
ita - 

I was hoping the bz2.tar.bz2 I posted could be used as a starting point. It actually has a lot of BZ2File implemented and at least some of the work for the stream objects (I think BZ2Compressor is done). Though it's in Java code, which may not be as easy to deal with. Have you looked at it?

There's a test_bz2 that we need to be passing at least most of before we ship a bz2 module
msg5564 (view) Author: Philip Jenvey (pjenvey) Date: 2010-03-09.05:07:35
Here's an updated version of the code that compiles against the current trunk. There's no BZ2Decompressor code included; fully supporting it will be the tricky bit
msg5842 (view) Author: Gili (cowwoc) Date: 2010-06-27.08:19:02
It might be worthwhile to simply port another implementation from scratch. This parallel implementation is supposed to be very fast: http://compression.ca/pbzip2/

It consists of 3000 lines of code.
msg6062 (view) Author: Arne Babenhauserheide (ArneBab) Date: 2010-09-11.23:50:09
Would pyflate be a possible base? http://www.paul.sladen.org/projects/pyflate/

How I got here: I’d love to have Mercurial support in Jython, because then we could ship hg infocalypse (and Mercurial) in freenet and use it as a decentral and anonymous wikiengine. 

All the code is in place (called fniki), except for integrating Mercurial in Java via Jython (the freenet maintainer doesn’t want non-java dependencies, so we have to get Mercurial running on Java).
msg6756 (view) Author: Roland Walter (r_walter) Date: 2012-01-12.21:23:21
There is now a new Java implementation of bzip2:
http://code.google.com/p/jbzip2/

The interface of this implementation looks better than that of Apache Ant, but it requires Java 6 and so is suitable only for Jython 2.6.
msg6757 (view) Author: Roland Walter (r_walter) Date: 2012-01-12.21:58:23
What is missing from the BZ2File class in bz2_v2 is the seek method, which is essential for supporting the tarfile module. My idea for a simple implementation, for reading mode and whence SEEK_SET: do nothing when the given position equals the current one; when the argument is greater than the current position, do as many dummy reads as the difference; when the argument is smaller, reopen the underlying Java stream and do as many dummy reads as the argument says.

This is similar for SEEK_CUR.

SEEK_END is in my opinion not supportable.
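
The seek strategy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: ForwardSeekReader and its open_stream argument are hypothetical names, and CPython's bz2.BZ2File over an in-memory buffer stands in for the forward-only Java stream.

```python
import bz2
import io
import os


class ForwardSeekReader(object):
    """Read-only wrapper that emulates seek() on a forward-only stream."""

    def __init__(self, open_stream):
        # open_stream is a callable returning a fresh stream positioned at 0.
        self._open_stream = open_stream
        self._stream = open_stream()
        self._pos = 0

    def read(self, size=-1):
        data = self._stream.read(size)
        self._pos += len(data)
        return data

    def tell(self):
        return self._pos

    def _skip(self, count):
        # "Dummy reads": consume and discard bytes to move forward.
        while count > 0:
            data = self.read(min(count, 8192))
            if not data:  # hit end of stream
                break
            count -= len(data)

    def seek(self, offset, whence=os.SEEK_SET):
        if whence == os.SEEK_CUR:
            offset += self._pos
        elif whence != os.SEEK_SET:
            raise IOError("SEEK_END is not supported")
        if offset < 0:
            raise IOError("negative seek position")
        if offset < self._pos:
            # Going backwards: reopen and start again from the beginning.
            self._stream.close()
            self._stream = self._open_stream()
            self._pos = 0
        self._skip(offset - self._pos)


payload = bz2.compress(b"0123456789" * 100)
reader = ForwardSeekReader(lambda: bz2.BZ2File(io.BytesIO(payload)))

reader.seek(25)               # forward seek via dummy reads
first = reader.read(5)
reader.seek(3)                # backward seek via reopen
second = reader.read(4)
reader.seek(2, os.SEEK_CUR)   # relative seek
third = reader.read(2)
```

Backward seeks cost a full re-read from the start of the stream, so this trades speed for simplicity.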

Another essential point for tarfile support: do not throw exceptions in the constructor when the arguments are correct. Defer that to the I/O operations such as read() and seek(), and raise IOError("invalid data stream") when they are called. This is needed for auto-detection of the compression algorithm when only "r" is given as the mode to tarfile.open().
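
A minimal sketch of the deferred-error behaviour described above, written against CPython's bz2 module. DeferredBZ2Reader and detect_compression are hypothetical names used only for illustration; the detection loop is a simplified stand-in for tarfile's logic, not a copy of it.

```python
import bz2
import io


class DeferredBZ2Reader(object):
    """Constructor never inspects the data; errors surface on read()."""

    def __init__(self, fileobj):
        self._fileobj = fileobj  # nothing is read or validated here
        self._data = None

    def read(self):
        if self._data is None:
            try:
                self._data = bz2.decompress(self._fileobj.read())
            except (IOError, ValueError):
                # Report bad input only once I/O is actually attempted.
                raise IOError("invalid data stream")
        return self._data


def detect_compression(raw_bytes):
    """Try each candidate reader and return the name of the first that works."""
    candidates = [("bz2", DeferredBZ2Reader)]
    for name, reader_cls in candidates:
        reader = reader_cls(io.BytesIO(raw_bytes))  # must not raise
        try:
            reader.read()
            return name
        except IOError:
            continue
    return "unknown"


kind_of_bz2 = detect_compression(bz2.compress(b"payload"))
kind_of_plain = detect_compression(b"this is not compressed")
```

If the constructor raised on bad input instead, a caller probing several formats in turn would need a separate error path for construction.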
msg7488 (view) Author: Julian Kennedy (jupake) Date: 2012-10-26.17:43:34
Hi guys. 

I have implemented this module. I have gotten all tests to pass. I will prepare a patch this weekend and upload it here.
msg7489 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2012-10-26.21:39:10
Nice - looking forward to it!
msg7523 (view) Author: Arne Babenhauserheide (ArneBab) Date: 2012-11-18.18:10:08
Were you able to get the patch working?
msg7587 (view) Author: Alex Grönholm (alex.gronholm) Date: 2013-01-26.03:31:58
Added Julian Kennedy's version that supports all the necessary functionality, including streaming (de)compression.
History
Date User Action Args
2013-01-26 03:31:59 alex.gronholm set status: open -> closed
resolution: fixed
messages: + msg7587
nosy: + alex.gronholm
2012-11-18 19:26:59 cowwoc set nosy: - cowwoc
2012-11-18 18:10:09 ArneBab set messages: + msg7523
2012-10-26 21:39:10 fwierzbicki set messages: + msg7489
2012-10-26 17:43:35 jupake set nosy: + jupake
messages: + msg7488
2012-10-17 21:20:07 fwierzbicki set nosy: + fwierzbicki
2012-09-21 18:33:19 Arfrever set nosy: + Arfrever
2012-01-12 21:58:24 r_walter set messages: + msg6757
2012-01-12 21:23:22 r_walter set nosy: + r_walter
messages: + msg6756
2010-09-11 23:50:10 ArneBab set nosy: + ArneBab
messages: + msg6062
2010-08-22 22:38:16 zyasoft set priority: low
2010-07-15 13:13:22 crankycoder set nosy: + crankycoder
2010-06-27 08:19:03 cowwoc set nosy: + cowwoc
messages: + msg5842
2010-03-09 05:07:36 pjenvey set files: + bz2_v2.tar.bz2
messages: + msg5564
2010-03-07 23:31:20 pjenvey set messages: + msg5563
2010-02-11 18:23:20 ita set messages: + msg5519
2010-02-02 19:56:15 pjenvey set files: + jythonx-bits.tar.bz2
messages: + msg5487
2010-02-02 19:55:41 pjenvey set files: + bz2.tar.bz2
messages: + msg5486
2010-02-02 19:53:14 pjenvey set messages: + msg5485
2010-02-02 19:38:46 ita set messages: + msg5484
2010-02-02 14:40:09 ita set nosy: + ita
messages: + msg5482
2009-09-26 02:42:08 pjenvey set nosy: + pjenvey
2009-09-24 13:12:16 pjdm set messages: + msg5193
2009-09-24 00:53:22 spiritmech set messages: + msg5189
2009-09-15 00:25:56 spiritmech set nosy: + spiritmech
messages: + msg5170
2009-08-21 12:50:07 pjdm create