Message3175

Author pekka.klarck
Recipients pekka.klarck
Date 2008-05-02.20:00:12
Message-id <1209758412.53.0.151611192402.issue1032@psf.upfronthosting.co.za>
In-reply-to
Content
Jython 2.2.1 on java1.6.0_03
Type "copyright", "credits" or "license" for more information.
>>> x = "%s" % u"\u00E4"
Traceback (innermost last):
 File "<console>", line 1, in ?
UnicodeError: ascii encoding error: ordinal not in range(128)

'\u00E4' is the Latin-1 character 'ä'. I can work around this problem as follows:

>>> from org.python.core import codecs
>>> codecs.setDefaultEncoding('iso-8859-1')
>>> x = "%s" % u"\u00E4"
>>> assert x == u"\u00E4"

But if I now try to use e.g. Cyrillic characters, I get a UnicodeError
again:

>>> x = "%s" % u"\u0420"
Traceback (innermost last):
 File "<console>", line 1, in ?
UnicodeError: latin-1 encoding error: ordinal not in range(256)
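
The two tracebacks above suggest why the workaround only moves the limit: 'ä' (code point 228) fits in Latin-1, while Cyrillic 'Р' (code point 1056) does not. A minimal sketch of that difference follows. The explicit encode() calls stand in for the implicit conversion that Jython 2.2.1 appears to perform, which is an assumption inferred from the error messages, and can_encode is a hypothetical helper name.

```python
# Sketch only: explicit encode() calls mimic the implicit conversion that
# Jython 2.2.1 appears to do during "%s" formatting (an assumption).

def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeError:
        return False

a_umlaut = u"\u00E4"     # code point 228: outside ASCII, inside Latin-1
cyrillic_er = u"\u0420"  # code point 1056: outside both ASCII and Latin-1

print(can_encode(a_umlaut, "ascii"))          # False
print(can_encode(a_umlaut, "iso-8859-1"))     # True
print(can_encode(cyrillic_er, "iso-8859-1"))  # False
print(can_encode(cyrillic_er, "utf-8"))       # True
```

Any single-byte default encoding would leave some characters failing in the same way, so changing the default encoding cannot be a general fix.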

Note that using a unicode format string (u'%s') doesn't help either:

Jython 2.2.1 on java1.6.0
Type "copyright", "credits" or "license" for more information.
>>> u'%s' % u'\u00E4'        
Traceback (innermost last):
  File "<console>", line 1, in ?
UnicodeError: ascii decoding error: ordinal not in range(128)


All of this works on both CPython (I've tested only with 2.5) and Jython
2.2 (tested on both Linux and Windows). With Jython 2.2 there's no
need to call setDefaultEncoding.
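
The expected behaviour can be written as a small check. This is only a sketch: check_formatting is a hypothetical name, and the check assumes that formatting a unicode value with either a plain or a unicode template should give back the same character without any change to the default encoding.

```python
def check_formatting(char):
    """Format char with a plain and a unicode template; both must round-trip."""
    assert "%s" % char == char
    assert u"%s" % char == char
    return True

# The Latin-1 and Cyrillic characters used in the report.
results = [check_formatting(c) for c in (u"\u00E4", u"\u0420")]
print(results)
```

On an affected interpreter the first assertion would not be reached, because the formatting itself raises UnicodeError.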