Issue1510

classification
Title: minidom is not parsing comment information correctly
Type: behaviour Severity: critical
Components: Library Versions: 2.5.1
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: amak Nosy List: amak, ssteiner, william.bernardet
Priority: Keywords:

Created on 2009-12-01.08:40:50 by william.bernardet, last changed 2010-04-16.17:40:22 by amak.

Messages
msg5335 Author: William (william.bernardet) Date: 2009-12-01.08:40:49
This is with activepython 2.5.2.2:
ActivePython 2.5.2.2 (ActiveState Software Inc.) based on
Python 2.5.2 (r252:60911, Mar 27 2008, 17:57:18) [MSC v.1310 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.dom.minidom import parseString
>>> d = parseString("<tag><!-- comment --></tag>")
>>> d.toxml()
u'<?xml version="1.0" ?><tag><!-- comment --></tag>'

Same script with jython:

Jython 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54)
[Java HotSpot(TM) Client VM (Sun Microsystems Inc.)] on java1.6.0_11
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.dom.minidom import parseString
>>> d = parseString("<tag><!-- comment --></tag>")
>>> d.toxml()
u'<?xml version="1.0" ?>\n<tag/>'

Jython's output differs from CPython's: the comment node has been
dropped from the DOM tree.
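
The difference can also be checked without comparing serialized strings. The following is a minimal sketch, not part of the original report: it walks the parsed tree and collects the text of every comment node. The helper name `comment_texts` is hypothetical; only standard `xml.dom.minidom` calls are used.

```python
from xml.dom.minidom import parseString


def comment_texts(xml_text):
    """Return the data of every comment node in the parsed document."""
    found = []

    def walk(node):
        # COMMENT_NODE is the standard DOM node-type constant for comments.
        if node.nodeType == node.COMMENT_NODE:
            found.append(node.data)
        for child in node.childNodes:
            walk(child)

    walk(parseString(xml_text))
    return found


print(comment_texts("<tag><!-- comment --></tag>"))
```

On an interpreter that keeps comments the list holds the single string `' comment '`; on an interpreter showing the behaviour reported here it would be empty.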
msg5378 Author: simon steiner (ssteiner) Date: 2009-12-14.14:36:29
--- drv_javasax.py	Mon Dec 14 14:31:59 2009
+++ newdrv_javasax.py	Mon Dec 14 14:29:22 2009
@@ -25,6 +25,7 @@
     from org.python.core import FilelikeInputStream
     from org.xml.sax.helpers import XMLReaderFactory
     from org.xml import sax as javasax
+    from org.xml.sax.ext import LexicalHandler
 except ImportError:
     raise _exceptions.SAXReaderNotAvailable("SAX is not on the classpath", None)
 
@@ -120,7 +121,7 @@
         return self.sysId
 
 # --- JavaSAXParser
-class JavaSAXParser(xmlreader.XMLReader, javasax.ContentHandler):
+class JavaSAXParser(xmlreader.XMLReader, javasax.ContentHandler, LexicalHandler):
     "SAX driver for the Java SAX parsers."
 
     def __init__(self, jdriver = None):
@@ -129,6 +130,7 @@
         self._parser.setFeature(feature_namespaces, 0)
         self._parser.setFeature(feature_namespace_prefixes, 0)
         self._parser.setContentHandler(self)
+        self._parser.setProperty("http://xml.org/sax/properties/lexical-handler", self)
         self._nsattrs = AttributesNSImpl()
         self._attrs = AttributesImpl()
         self.setEntityResolver(self.getEntityResolver())
@@ -205,6 +207,9 @@
 
     def processingInstruction(self, target, data):
         self._cont_handler.processingInstruction(target, data)
+
+    def comment(self, char, start, len):
+        self._cont_handler.comment(str(String(char, start, len)))
 
 class AttributesImpl:
     def __init__(self, attrs = None):
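
The patch relies on the SAX rule that comments are lexical events: a parser reports them only to a handler registered through the `lexical-handler` property, never through the content handler alone. The sketch below illustrates the same registration with CPython's `xml.sax` package. It assumes the default expat driver, not the Jython driver patched above, and the names `CommentCollector` and `collect_comments` are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class CommentCollector(ContentHandler):
    """Content handler that also accepts lexical events (illustrative only)."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.comments = []

    # Lexical-handler callbacks. The expat driver binds all of them when the
    # property is set, so each one has to exist even if it does nothing.
    def comment(self, data):
        self.comments.append(data)

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


def collect_comments(xml_bytes):
    """Parse xml_bytes and return the text of each comment encountered."""
    handler = CommentCollector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    # Without this call the parser never delivers comment events.
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.comments


print(collect_comments(b"<tag><!-- comment --></tag>"))
```

Removing the `setProperty` call leaves the returned list empty, which corresponds to the situation in the unpatched driver.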
msg5722 Author: Alan Kennedy (amak) Date: 2010-04-16.17:40:22
Fix checked in at revision 7027.

Thanks to William for the report and Simon Steiner for the patch.

(Sorry Simon, patch incorrectly attributed to William in the checkin comment)
History
Date                 User               Action  Args
2010-04-16 17:40:22  amak               set     status: open -> closed
                                                assignee: amak
                                                resolution: fixed
                                                messages: + msg5722
                                                nosy: + amak
2009-12-14 14:36:30  ssteiner           set     messages: + msg5378
2009-12-02 07:43:22  william.bernardet  set     severity: normal -> critical
2009-12-01 11:29:37  ssteiner           set     nosy: + ssteiner
2009-12-01 08:40:50  william.bernardet  create