Message8620

Author jeff.allen
Recipients jeff.allen, rpan, zyasoft
Date 2014-06-10.06:48:12
Message-id <1402382893.0.0.908553633541.issue2123@psf.upfronthosting.co.za>
In-reply-to
Content
On output, this happens:
>>> p = u"Java 蛇"
>>> p
u'Java \u86c7'
>>> print p
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding 'x-mswin-936'
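
As a rough illustration of the failing lookup, the sketch below asks the codec registry for both the Java-style name from the traceback and the name Python normally uses for code page 936. It assumes a stock CPython install, where `cp936` is an alias of `gbk`; the helper `find_codec` is hypothetical and is not part of the fix discussed in this issue.

```python
import codecs


def find_codec(name):
    """Return the registry's canonical codec name, or None if the name is unknown.

    Hypothetical helper for illustration only.
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Same exception type as in the traceback above.
        return None


# Java-style name reported by the console: not a name Python recognises.
java_style = find_codec("x-mswin-936")

# Python's usual name for the same code page (assumption: CPython alias table).
python_style = find_codec("cp936")

print(java_style, python_style)
```

Translating the name only gets as far as the lookup; the codec itself still has to exist on the running interpreter.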

Even with an acceptable encoding name here, the print would still fail because there is no suitable codec (awaiting #1066). The following works (after the fix) because the bytes in the string are already the bytes that the console expects:
>>> print "Java 蛇"
Java 蛇
There is still a difficulty when the code comes from a file. The bytes will be in the encoding used for the file, which may not match the encoding of the console. I think this is Python 2.x behaviour, best addressed by using Unicode, which brings us back to #1066.
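
A minimal sketch of that mismatch, assuming the source file is saved as UTF-8 and the console uses code page 936 (GBK). It relies on the `gbk` codec shipped with CPython, so it shows the byte-level problem rather than the Jython behaviour itself; all variable names are illustrative.

```python
# -*- coding: utf-8 -*-
text = u"Java \u86c7"

# Bytes as they would sit in a UTF-8 source file (assumed file encoding).
file_bytes = text.encode("utf-8")

# Bytes a code page 936 console expects for the same text (assumed console).
console_bytes = text.encode("gbk")

# The console interprets whatever bytes it receives as GBK.
# Feeding it the file's UTF-8 bytes garbles the Chinese character.
seen_from_file = file_bytes.decode("gbk", "replace")
seen_from_console = console_bytes.decode("gbk")

print(len(file_bytes), len(console_bytes))
print(seen_from_console == text, seen_from_file == text)
```

Keeping the text as Unicode and encoding only at the point of output avoids depending on the two encodings happening to agree.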
History
Date                 User        Action  Args
2014-06-10 06:48:13  jeff.allen  set     messageid: <1402382893.0.0.908553633541.issue2123@psf.upfronthosting.co.za>
2014-06-10 06:48:13  jeff.allen  set     recipients: + jeff.allen, zyasoft, rpan
2014-06-10 06:48:12  jeff.allen  link    issue2123 messages
2014-06-10 06:48:12  jeff.allen  create