Issue1768

classification
Title: sax.parse doesn't handle attributes with name 'id' correctly
Type: Severity: normal
Components: Versions:
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: amak Nosy List: amak, pekka.klarck, samuel337
Priority: Keywords:

Created on 2011-07-11.22:45:33 by pekka.klarck, last changed 2011-10-30.13:15:19 by amak.

Files
File name Uploaded Description
saxbug.py pekka.klarck, 2011-07-11.22:45:32
Messages
msg6566 (view) Author: Pekka Klärck (pekka.klarck) Date: 2011-07-11.22:45:32
When using sax.parse, attributes passed to startElement are broken if an element has an attribute with name 'id'. This can be demonstrated with the attached script. Below is the output I get with Jython 2.5.2:

$ jython saxbug.py 
attr  : idx
value : 1
attr  : (u'i', u'd')
value :
Traceback (most recent call last):
  File "saxbug.py", line 11, in <module>
    sax.parse(StringIO('<tag id="1"/>'), Handler())
  File "/home/peke/Prog/jython2.5.2/Lib/xml/sax/__init__.py", line 34, in parse
    parser.parse(source)
  File "/home/peke/Prog/jython2.5.2/Lib/xml/sax/drivers2/drv_javasax.py", line 146, in parse
    self._parser.parse(JyInputSourceWrapper(source))
  File "/home/peke/Prog/jython2.5.2/Lib/xml/sax/drivers2/drv_javasax.py", line 187, in startElement
    self._cont_handler.startElement(qname, self._attrs)
  File "saxbug.py", line 8, in startElement
    print 'value :', attrs.values()[0]
  File "/home/peke/Prog/jython2.5.2/Lib/xml/sax/drivers2/drv_javasax.py", line 311, in values
    return map(self.getValue, self.getNames())
  File "/home/peke/Prog/jython2.5.2/Lib/xml/sax/drivers2/drv_javasax.py", line 266, in getValue
    value = self._attrs.getValue(_makeJavaNsTuple(name))
TypeError: getValue(): 1st arg can't be coerced to String, int
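
The attached saxbug.py is not reproduced on this page. As a rough sketch of the kind of check that exposes the problem, the snippet below parses one element with a three-character attribute name and one with a two-character name, then records what the handler receives. The names `AttrCollector` and `collect` are hypothetical, and the code is written for a current CPython rather than copied from the attachment; on an unaffected parser both cases should return the attribute name unchanged.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class AttrCollector(ContentHandler):
    """Hypothetical handler: records each element's attribute names and values."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.seen = []

    def startElement(self, name, attrs):
        # getNames() and getValue() are the accessors that appear in the traceback.
        pairs = [(n, attrs.getValue(n)) for n in attrs.getNames()]
        self.seen.append((name, pairs))


def collect(xml_bytes):
    """Parse a small document and return what the handler saw."""
    handler = AttrCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.seen


three_chars = collect(b'<tag idx="1"/>')
two_chars = collect(b'<tag id="1"/>')
```

On Jython 2.5.2 the second case is the one that fails, because the name comes back as a tuple instead of a string.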
msg6644 (view) Author: (samuel337) Date: 2011-09-11.05:58:12
I just encountered this bug as well. It actually affects all attribute names that are exactly 2 characters long. 

A dirty workaround is to use

    attrs._attrs.getValue('id')

instead to get values; remember that this will throw a KeyError exception if the attribute does not exist.

The bug is rather simple. In Lib/xml/sax/drivers2/drv_javasax.py, the function _fixTuple (line 241) checks the length of its first argument, nsTuple, and if the length is 2, assumes it is a (namespace, localName) tuple and proceeds to unpack it.

Unfortunately it doesn't check the type, and other functions (specifically _makeJavaNsTuple and _makePythonNsTuple) pass in a string instead. Because a string is a sequence type as well, a string of length 2 is mistaken for a (namespace, localName) tuple.
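
The failure mode described above can be shown in isolation. The function below is a reconstruction based only on the description in this message (a length test with no type test), not a copy of the code in drv_javasax.py, and the name `fix_tuple_buggy` is made up for illustration.

```python
def fix_tuple_buggy(ns_tuple, frm, to):
    """Reconstruction of the described behaviour: only the length is checked."""
    if len(ns_tuple) == 2:
        # Any 2-item sequence gets unpacked here, including a 2-character string.
        ns_uri, local_name = ns_tuple
        if ns_uri == frm:
            ns_uri = to
        return (ns_uri, local_name)
    return ns_tuple


from_tuple = fix_tuple_buggy((None, "id"), None, "")  # a genuine (namespace, localName) pair
from_two_chars = fix_tuple_buggy("id", None, "")      # a plain 2-character attribute name
from_three_chars = fix_tuple_buggy("idx", None, "")   # a plain 3-character attribute name
```

The two-character string is split into its characters, which matches the `(u'i', u'd')` attribute name in the original report, while the three-character name passes through untouched.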

The solution is to add a type check that specifically excludes strings, e.g.

def _fixTuple(nsTuple, frm, to):
    # NOTE: the isinstance check excludes strings; basestring covers both
    # str and unicode (the names seen here are unicode, e.g. u'id')
    if not isinstance(nsTuple, basestring) and len(nsTuple) == 2:
        nsUri, localName = nsTuple
        if nsUri == frm:
            nsUri = to
        return (nsUri, localName)
    return nsTuple

The alternative is to check for a tuple, but that would exclude other sequence and sequence-like types.
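
The trade-off between the two guards can be sketched side by side. Both functions below are illustrative only: the names `fix_exclude_strings`, `fix_tuple_only` and `STRING_TYPES` are assumptions, and neither is claimed to be the fix that was eventually checked in.

```python
# Covers str and, on Python 2, unicode as well (type(u"") is str on Python 3).
STRING_TYPES = (str, type(u""))


def fix_exclude_strings(ns_tuple, frm, to):
    """Guard 1: accept any 2-item sequence except a string."""
    if not isinstance(ns_tuple, STRING_TYPES) and len(ns_tuple) == 2:
        ns_uri, local_name = ns_tuple
        return (to if ns_uri == frm else ns_uri, local_name)
    return ns_tuple


def fix_tuple_only(ns_tuple, frm, to):
    """Guard 2: accept only real tuples of length 2."""
    if isinstance(ns_tuple, tuple) and len(ns_tuple) == 2:
        ns_uri, local_name = ns_tuple
        return (to if ns_uri == frm else ns_uri, local_name)
    return ns_tuple


# Both guards leave a 2-character string alone.
string_a = fix_exclude_strings(u"id", "a", None)
string_b = fix_tuple_only(u"id", "a", None)

# They differ on a 2-item list: guard 1 converts it, guard 2 passes it through.
pair_as_list = ["a", "id"]
list_a = fix_exclude_strings(pair_as_list, "a", None)
list_b = fix_tuple_only(pair_as_list, "a", None)
```

Either guard stops the two-character case from being unpacked; the only difference is whether non-tuple sequences are still treated as (namespace, localName) pairs.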

I hope this is enough information to perform the one-liner fix; I can provide a diff file if necessary.
msg6688 (view) Author: Alan Kennedy (amak) Date: 2011-10-30.13:15:19
Fix checked in at http://hg.python.org/jython/rev/936bd1b132eb
History
Date User Action Args
2011-10-30 13:15:19 amak set status: open -> closed
resolution: fixed
messages: + msg6688
2011-10-30 12:21:55 amak set assignee: amak
nosy: + amak
2011-09-11 05:58:12 samuel337 set nosy: + samuel337
messages: + msg6644
2011-07-11 22:45:33 pekka.klarck create