Issue1268

classification
Title: SAX parser wants to load external DTDs, causing an exception
Type: behaviour Severity: normal
Components: Library Versions: 2.5b1
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: fwierzbicki Nosy List: fdb, fwierzbicki, lukasz.heldt, ssteiner
Priority: normal Keywords:

Created on 2009-03-06.11:12:18 by fdb, last changed 2009-12-05.12:34:52 by fdb.

Files
File name Uploaded Description
do-not-load-external-dtds.patch.txt fdb, 2009-03-06.11:12:17 Patch to turn off loading of external DTDs
test_parseString.py fdb, 2009-03-06.11:14:59
do-not-load-external-dtds.patch.txt fdb, 2009-03-06.11:16:38
Messages
msg4179 Author: Frederik De Bleser (fdb) Date: 2009-03-06.11:12:16
The attached patch prevents SAX from loading external DTDs. These
caused Jython to fail with an exception when parsing XML documents that
contain a DOCTYPE declaration (such as XHTML or SVG).
msg4180 Author: Frederik De Bleser (fdb) Date: 2009-03-06.11:14:59
Attached is a simple test case that fails on the latest SVN version.
Applying the patch fixes the error. Without the patch, the following error is displayed:

Traceback (most recent call last):
  File "test_parseString.py", line 6, in <module>
    xml = parseString(data)
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/dom/minidom.py", line 1933, in parseString
    return _do_pulldom_parse(pulldom.parseString, (string,),
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/dom/minidom.py", line 1908, in _do_pulldom_parse
    toktype, rootNode = events.getEvent()
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/dom/pulldom.py", line 275, in _slurp
    self.parser.parse(self.stream)
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/sax/drivers2/drv_javasax.py", line 143, in parse
    self._parser.parse(JyInputSourceWrapper(source))
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/sax/drivers2/drv_javasax.py", line 92, in resolveEntity
    return JyInputSourceWrapper(self._resolver.resolveEntity(pubId, sysId))
  File "/Users/fdb/Java/jython-svn/dist/Lib/xml/sax/drivers2/drv_javasax.py", line 77, in __init__
    if source.getByteStream():
AttributeError: 'unicode' object has no attribute 'getByteStream'
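
For readers without the attachment, a test along the following lines exercises the same entry point. This is only a sketch: the contents of the attached test_parseString.py are not reproduced in this thread, so the sample document and the names SAMPLE and parse_sample are assumptions made for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document: any XML that references an external DTD will do.
SAMPLE = '<!DOCTYPE common SYSTEM "common.dtd"><common><item/></common>'


def parse_sample(data=SAMPLE):
    # parseString is the call at the top of the traceback above.
    return parseString(data)


doc = parse_sample()
root_name = doc.documentElement.tagName
```

On CPython this parses without touching common.dtd, which is the behaviour the reporters expect from Jython as well.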
msg4181 Author: Frederik De Bleser (fdb) Date: 2009-03-06.11:16:38
The original patch contained an oversight where the actual changed line
was commented out for testing purposes. The new patch fixes the problem.
msg5323 Author: Lukasz Heldt (lukasz.heldt) Date: 2009-11-24.13:06:13
I came across the same bug when trying to actually use the external
DTD functionality. After adding the following line to my XML file:

<!DOCTYPE common SYSTEM "common.dtd">

I got the error mentioned by Frederik. Luckily there is an easy fix for
this issue. The following lines in JyInputSourceWrapper need to be changed from:
        if isinstance(source, str):
            javasax.InputSource.__init__(self, source)
to:
        if isinstance(source, str) or isinstance(source, unicode):
            javasax.InputSource.__init__(self, source)
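
The idea behind this change can be sketched outside Jython. The class below is a hypothetical stand-in for JyInputSourceWrapper, built on CPython's xml.sax.xmlreader.InputSource rather than the Java class. The names SourceWrapper and STRING_TYPES, and the else branch, are assumptions and are not taken from drv_javasax.py.

```python
from xml.sax.xmlreader import InputSource

try:
    STRING_TYPES = (str, unicode)  # Python 2 / Jython: byte and unicode strings
except NameError:
    STRING_TYPES = (str,)  # Python 3: str is already unicode


class SourceWrapper(InputSource):
    """Hypothetical stand-in for JyInputSourceWrapper."""

    def __init__(self, source):
        if isinstance(source, STRING_TYPES):
            # Any string, byte or unicode, is treated as a system identifier.
            InputSource.__init__(self, source)
        else:
            # Otherwise assume an InputSource-like object and copy its fields.
            InputSource.__init__(self, source.getSystemId())
            self.setByteStream(source.getByteStream())


wrapped = SourceWrapper(u"common.dtd")
rewrapped = SourceWrapper(wrapped)
```

With only the str check, a unicode system identifier would fall through to the second branch and fail on getByteStream, which matches the AttributeError reported earlier in this issue.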
msg5347 Author: simon steiner (ssteiner) Date: 2009-12-04.16:03:56
After applying Lukasz Heldt's fix I get the following (I don't get this on CPython):

java.io.FileNotFoundException: java.io.FileNotFoundException: C:\sysdef_1_4_0.dtd (The system cannot find the file specified)
msg5348 Author: simon steiner (ssteiner) Date: 2009-12-04.16:27:21
I applied the patch do-not-load-external-dtds.patch.txt and it works.
msg5352 Author: Frederik De Bleser (fdb) Date: 2009-12-05.12:34:52
Turning off DTD validation seems the cleanest solution and is consistent
with the Python parsing API, which does not validate against external DTDs.
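
For comparison, here is a minimal sketch of how external entity loading can be switched off through the standard xml.sax feature flag on CPython. It is not the content of the attached patch, and whether Jython's drv_javasax driver honours this flag is an assumption that this thread does not confirm. TagCollector and parse_without_external_dtd are hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TagCollector(ContentHandler):
    """Records element names so the result of a parse can be inspected."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.tags = []

    def startElement(self, name, attrs):
        self.tags.append(name)


def parse_without_external_dtd(data):
    parser = xml.sax.make_parser()
    # Ask the parser not to resolve external entities. With CPython's
    # expat-based reader this also keeps it from opening the external DTD.
    parser.setFeature(feature_external_ges, False)
    handler = TagCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.tags


tags = parse_without_external_dtd(
    b'<!DOCTYPE common SYSTEM "common.dtd"><common><item/></common>'
)
```

The document parses even though common.dtd does not exist, because the parser never tries to open it.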
History
Date User Action Args
2009-12-05 12:34:52 fdb set messages: + msg5352
2009-12-04 16:27:21 ssteiner set messages: + msg5348
2009-12-04 16:03:57 ssteiner set messages: + msg5347
2009-11-24 13:06:13 lukasz.heldt set nosy: + lukasz.heldt, messages: + msg5323
2009-10-22 07:34:07 ssteiner set nosy: + ssteiner
2009-03-14 14:59:20 fwierzbicki set priority: normal, assignee: fwierzbicki
2009-03-06 18:31:37 fwierzbicki set nosy: + fwierzbicki
2009-03-06 11:16:38 fdb set files: + do-not-load-external-dtds.patch.txt, messages: + msg4181
2009-03-06 11:15:00 fdb set files: + test_parseString.py, messages: + msg4180
2009-03-06 11:12:18 fdb create