Message8044

Author santa4nt
Recipients oberstet, santa4nt
Date 2013-06-12.18:42:21
Message-id <1371062541.67.0.893121731222.issue2061@psf.upfronthosting.co.za>
In-reply-to
Content
A simplified, minimal reproduction that uses only the low-level pieces of the json module:


In CPython:

  Python 2.7.4 (default, Apr 19 2013, 18:28:01) 
  [GCC 4.7.3] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> from json.encoder import encode_basestring
  >>> s = '\xce\xba\xe1\xbd\xb9\xcf\x83\xce\xbc\xce\xb5\xed\xa0\x80edited'
  >>> o = s.decode('utf-8')
  >>> o
  u'\u03ba\u1f79\u03c3\u03bc\u03b5\ud800edited'
  >>> encode_basestring(o)
  u'"\u03ba\u1f79\u03c3\u03bc\u03b5\ud800edited"'


In Jython:

  Jython 2.7b1+ (default:3f971d6907b7+, Jun 12 2013, 11:30:15) 
  [Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_21
  Type "help", "copyright", "credits" or "license" for more information.
  >>> s = '\xce\xba\xe1\xbd\xb9\xcf\x83\xce\xbc\xce\xb5\xed\xa0\x80edited'
  >>> s.decode('utf-8')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/santa/Code/jython/dist/Lib/encodings/utf_8.py", line 16, in decode
      return codecs.utf_8_decode(input, errors, True)
  UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 11-13: illegal encoding
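
The difference between the two sessions comes down to bytes 11-13 of the input, `\xed\xa0\x80`, which are the UTF-8-style encoding of the lone surrogate U+D800. The following is only a rough sketch, written for Python 3 rather than the 2.7 interpreters shown above: Python 3's strict decoder rejects these bytes, while the `surrogatepass` error handler lets them through and yields the same text that CPython 2.7 printed. The helper names `decodes_strictly` and `decode_keeping_surrogates` are made up for illustration and are not part of any library.

```python
# Python 3 sketch; the sessions above are Python 2.7 and Jython 2.7.
from json.encoder import encode_basestring

# Same bytes as in the report: five Greek letters, then ED A0 80
# (the UTF-8-style encoding of the lone surrogate U+D800), then "edited".
RAW = b'\xce\xba\xe1\xbd\xb9\xcf\x83\xce\xbc\xce\xb5\xed\xa0\x80edited'


def decodes_strictly(data):
    """Hypothetical helper: True if the strict UTF-8 decoder accepts data."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


def decode_keeping_surrogates(data):
    """Hypothetical helper: decode UTF-8 but let encoded surrogates through."""
    return data.decode('utf-8', 'surrogatepass')


strict_ok = decodes_strictly(RAW)
text = decode_keeping_surrogates(RAW)
# encode_basestring only adds quotes and escapes quotes, backslashes and
# control characters, so the lone surrogate is passed through unchanged.
quoted = encode_basestring(text)

# ascii() is used because a lone surrogate cannot be written to a UTF-8 stdout.
print(strict_ok)
print(ascii(text))
print(ascii(quoted))
```

Under these assumptions, a caller that needs the CPython 2.7 behaviour has to opt in to it explicitly. Whether Jython should accept such input by default is the question this issue raises, and the sketch does not answer it.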
History
Date                 User      Action  Args
2013-06-12 18:42:21  santa4nt  set     messageid: <1371062541.67.0.893121731222.issue2061@psf.upfronthosting.co.za>
2013-06-12 18:42:21  santa4nt  set     recipients: + santa4nt, oberstet
2013-06-12 18:42:21  santa4nt  link    issue2061 messages
2013-06-12 18:42:21  santa4nt  create