Message8081

Author omatz
Recipients omatz, santa4nt
Date 2013-08-14.08:39:25
Message-id <1376469566.46.0.0908001103803.issue2073@psf.upfronthosting.co.za>
In-reply-to
Content
The attached program reproduces the problem. Its output is pasted below; let us see what happens to the umlauts in the display.
Compare the two lines prefixed by java-hex, one in the actual output and one in the expected output.
The first reveals the error: the Java String has two characters for the German o-umlaut, obtained by padding each of the two bytes of its UTF-8 sequence with a zero byte.
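
The effect described above can be imitated in plain Python, without any Java involved. This is only a sketch under the assumption that each byte of the UTF-8 sequence is widened one-to-one into a character, which is equivalent to decoding the bytes as Latin-1; it does not claim to show what Jython does internally. The helper name hex_codes is hypothetical and exists only for this illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def hex_codes(text):
    """Join the code point of every character as hex, like the java-hex lines."""
    return ':'.join(format(ord(ch), 'x') for ch in text)


text = u'sch\xf6n'                   # 'schön' as a real unicode string, 5 characters
utf8_bytes = text.encode('utf-8')    # 6 bytes: the o-umlaut becomes c3 b6

# Assumption: padding each byte with a zero byte gives one character per byte,
# which is the same result as decoding the bytes as Latin-1.
widened = utf8_bytes.decode('latin-1')

# Decoding the same bytes as UTF-8 restores the five original characters.
decoded = utf8_bytes.decode('utf-8')

print('widened:', hex_codes(widened))   # 73:63:68:c3:b6:6e
print('decoded:', hex_codes(decoded))   # 73:63:68:f6:6e
```

The widened string has six characters and matches the java-hex line in the actual output, while the decoded string has five characters and matches the java-hex line in the expected output.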


Executing:
# -*- coding: utf-8 -*-
from org.python.core import codecs
import Issue2073Main
codecs.setDefaultEncoding('utf-8')
print 'schön'
print 'python-hex:', ':'.join(x.encode('hex') for x in 'schön')
Issue2073Main.javaPrint('schön')
--------------------------------------------
schön
python-hex: 73:63:68:c3:b6:6e
javaPrint: schön
java-hex: 73:63:68:c3:b6:6e
--------------------------------------------
expected output:
javaPrint: schön
java-hex: 73:63:68:f6:6e