Type: Severity: normal
Components: Library Versions: Jython 2.7
Status: open Resolution: remind
Dependencies: Superseder:
Assigned To: Nosy List: amak, boisgera, fwierzbicki, jeff.allen, kellrott, pjac, pjenvey
Priority: normal Keywords:

Created on 2009-08-25.00:17:59 by kellrott, last changed 2018-03-18.07:25:11 by jeff.allen.

msg5063 (view) Author: Kyle (kellrott) Date: 2009-08-25.00:17:58
In xml.parsers.expat, missing Entity Parsing flags...

In Python:
>>> from xml.parsers import expat
>>> parser = expat.ParserCreate()
>>> parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

In Jython:
>>> from xml.parsers import expat
>>> parser = expat.ParserCreate()
>>> parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'XMLParser' object has no attribute 'SetParamEntityParsing'

Failing to set this parameter also seems to have an effect on the
parsing of element declarations in linked DTDs.
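
While the method is missing, calling code can probe for it instead of failing with AttributeError. This is a minimal sketch, assuming the caller can tolerate running without parameter entity parsing; the helper name enable_param_entity_parsing is hypothetical and is not part of expat or Jython.

```python
from xml.parsers import expat


def enable_param_entity_parsing(parser):
    """Try to switch on parameter entity parsing.

    Returns True if the parser accepted the flag, False if the method
    or the constant is unavailable (as reported here for Jython).
    """
    setter = getattr(parser, 'SetParamEntityParsing', None)
    flag = getattr(expat, 'XML_PARAM_ENTITY_PARSING_ALWAYS', None)
    if setter is None or flag is None:
        return False
    # The method returns a true value when the flag was set successfully.
    return bool(setter(flag))


parser = expat.ParserCreate()
param_entities_enabled = enable_param_entity_parsing(parser)
```

A False result only avoids the exception; declarations in an external DTD would still go unparsed.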
msg5066 (view) Author: Philip Jenvey (pjenvey) Date: 2009-08-26.04:28:58
Could you take a look at this, Sébastien?
msg5072 (view) Author: Kyle (kellrott) Date: 2009-08-26.22:57:36
Real-world example:
SetParamEntityParsing is called in Bio.Entrez.Parser from Biopython. It is
tested in the Tests/ unit test. ( Code at
git:// )
It seems that if the call is commented out in CPython, then
parser.ExternalEntityRefHandler.ElementDeclHandler isn't called back
during parsing.
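
The CPython behaviour described above can be reproduced with a small script. This is a sketch under assumptions: the document, the DTD text and the name declared_elements are made up for illustration, and the DTD is supplied inline rather than fetched from the system identifier.

```python
from xml.parsers import expat

# Hypothetical document that refers to an external DTD subset.
DOC = b'<?xml version="1.0"?>\n<!DOCTYPE root SYSTEM "root.dtd">\n<root/>'
# Hypothetical contents of root.dtd, inlined instead of being downloaded.
DTD = b'<!ELEMENT root EMPTY>'


def declared_elements(enable_param_entities):
    """Return the element names declared in the external DTD subset."""
    seen = []
    parser = expat.ParserCreate()
    if enable_param_entities:
        parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def on_external_entity(context, base, system_id, public_id):
        # Parse the external subset with a sub-parser and record
        # every element declaration found in it.
        sub = parser.ExternalEntityParserCreate(context)
        sub.ElementDeclHandler = lambda name, model: seen.append(name)
        sub.Parse(DTD, True)
        return 1

    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(DOC, True)
    return seen


with_flag = declared_elements(True)
without_flag = declared_elements(False)
```

On CPython, with_flag should contain the declared element while without_flag stays empty, because the external subset is never handed to the handler.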
msg5319 (view) Author: Peter (pjac) Date: 2009-11-23.17:31:46
As noted in msg5072 by Kyle, this Jython bug is an issue for Biopython's
Bio.Entrez module. There is a bug open in Biopython to track the issue:
msg5321 (view) Author: (boisgera) Date: 2009-11-23.20:31:05
Sorry, I missed the first post a few months ago, just noticed Peter's
last comment today.

Indeed, the expat module included in Jython does not include the
'SetParamEntityParsing' parser method. That alone could probably be forgiven,
as AFAICT this method is implemented *but not documented* in CPython 2.5.

But -- wait -- this is getting worse :). I had a quick look at Biopython,
and more specifically at Entrez, to see why this feature was needed. I
discovered that DTD parsing is used *a lot* (for legitimate reasons) and,
unfortunately, the DTD parsing model of expat (say 'expat.model'),
although it is public API and all that, has *not* been ported to Jython
so far. I fear this is a much bigger issue than the original one.

So there is no quick fix. To begin with, the expat.model features should
all be implemented for Entrez to (even sort of) work ...
I had a look at this a while ago and stopped thinking about it
because it requires some serious work (I remember that expat DTD
parsing does more than the org.xml.sax that we rely on), it was not
required for the main use case (supporting ElementTree, which does not care
about DTDs), and I was not sure that it was worth it (RelaxNG rocks,
nobody's using DTDs anyway, right? ;)).

I'll have a look at expat.model and see what I can do now that I know
that Biopython needs these features.
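
For reference, this is roughly what expat.model provides on CPython: ElementDeclHandler receives a nested tuple (type, quantifier, name, children) whose integer codes are named in expat.model. This is a minimal sketch, assuming an inline DTD; the names DOC, describe and element_models are hypothetical.

```python
from xml.parsers import expat

# Hypothetical document with an internal DTD subset.
DOC = (b'<!DOCTYPE root [\n'
       b'<!ELEMENT root (a, b*)>\n'
       b'<!ELEMENT a EMPTY>\n'
       b'<!ELEMENT b (#PCDATA)>\n'
       b']>\n'
       b'<root><a/></root>')

# Map quantifier codes from expat.model to their DTD suffixes.
QUANT = {
    expat.model.XML_CQUANT_NONE: '',
    expat.model.XML_CQUANT_OPT: '?',
    expat.model.XML_CQUANT_REP: '*',
    expat.model.XML_CQUANT_PLUS: '+',
}


def describe(content):
    """Render a content model tuple back into DTD-like text."""
    kind, quant, name, children = content
    m = expat.model
    if kind == m.XML_CTYPE_EMPTY:
        return 'EMPTY'
    if kind == m.XML_CTYPE_ANY:
        return 'ANY'
    if kind == m.XML_CTYPE_NAME:
        return name + QUANT[quant]
    if kind == m.XML_CTYPE_MIXED:
        parts = ['#PCDATA'] + [child[2] for child in children]
        return '(' + '|'.join(parts) + ')' + QUANT[quant]
    # Remaining kinds are sequences (",") and choices ("|").
    sep = ',' if kind == m.XML_CTYPE_SEQ else '|'
    inner = sep.join(describe(child) for child in children)
    return '(' + inner + ')' + QUANT[quant]


def element_models(document):
    """Collect a readable content model for each declared element."""
    models = {}
    parser = expat.ParserCreate()

    def on_element_decl(name, content):
        models[name] = describe(content)

    parser.ElementDeclHandler = on_element_decl
    parser.Parse(document, True)
    return models


models = element_models(DOC)
```

A Jython port would need to produce the same tuple shape and constants for code such as Bio.Entrez to consume it unchanged.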


msg6852 (view) Author: Alan Kennedy (amak) Date: 2012-03-19.20:02:52
@boisgera: any updates?
msg7953 (view) Author: Peter (pjac) Date: 2013-03-22.14:02:02
Biopython migrated from Bugzilla to Redmine a while back, our issue on this is now here:
msg7954 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2013-03-22.16:03:31
Peter: thanks for the extra info!
msg11027 (view) Author: Peter (pjac) Date: 2017-01-19.17:15:04
Biopython tracking issue for this Jython limitation migrated to GitHub as
msg11834 (view) Author: Jeff Allen (jeff.allen) Date: 2018-03-18.07:25:10
Still present in 2.7.2a1
Date User Action Args
2018-03-18 07:25:11 jeff.allen set nosy: + jeff.allen
messages: + msg11834
versions: - Jython 2.5
2017-01-19 17:15:04 pjac set messages: + msg11027
2013-03-22 16:03:31 fwierzbicki set messages: + msg7954
2013-03-22 14:02:02 pjac set messages: + msg7953
2013-02-19 19:13:30 fwierzbicki set priority: normal
resolution: remind
versions: + Jython 2.5, Jython 2.7, - 2.5.0
2012-03-19 20:02:52 amak set messages: + msg6852
2010-04-17 12:03:56 amak set nosy: + amak
2009-11-23 20:31:06 boisgera set messages: + msg5321
2009-11-23 17:31:46 pjac set nosy: + pjac
messages: + msg5319
2009-08-26 22:57:37 kellrott set messages: + msg5072
2009-08-26 04:28:59 pjenvey set nosy: + pjenvey, boisgera
messages: + msg5066
2009-08-25 15:11:17 fwierzbicki set nosy: + fwierzbicki
2009-08-25 00:17:59 kellrott create