Message11818

Author jeff.allen
Recipients fwierzbicki, jeff.allen, roskakori, sqxu, zyasoft
Date 2018-03-17.06:57:12
Message-id <1521269833.52.0.467229070634.issue1830@psf.upfronthosting.co.za>
In-reply-to
Content
With Jython 2.7.2a1 I get something like the opposite behaviour:

>>> sys.stderr.encoding, sys.stderr.errors
('ms936', 'backslashreplace')
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: 仮想イメージ「
>>> print u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
仮想イメージ「
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: \xe4\xbb\xae\xe6\x83\xb3\xe3\x82¤\xe3\x83\xa1\xe3\x83\xbc\xe3\x82\xb8\xe3\x80\x8c

I wouldn't expect a good result from emitting utf-8 bytes onto an ms936 console, but this is not much better:
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('ms936')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: \x81\xa2\xcf\xeb\xa5¤\xa5á\xa9`\xa5\xb8\xa1\xb8

Since we can cope with:
>>> print u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('ms936')
仮想イメージ「
I think error-message generation is defending itself too well against the decoding problems that would otherwise replace the message you wanted with one about codecs.
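
One possible reading of the two escaped AssertionError lines above, offered as an assumption and not checked against the Jython source: each byte of the message is treated as the character with the same code point, and that text is then written to stderr with the console encoding and the 'backslashreplace' handler. That would explain why ¤ and á come through unescaped, since U+00A4 and U+00E1 happen to exist in ms936 while the other byte values do not. A minimal sketch for CPython 3, where 'gbk' stands in for ms936 and render_like_traceback is a made-up helper name:

```python
# -*- coding: utf-8 -*-
# Sketch only: models one guess at how the traceback text might be produced.
# It is not taken from Jython; the function name is hypothetical.

def render_like_traceback(message_bytes, console_encoding="gbk"):
    # Assumption: every byte becomes the character with the same code point.
    as_chars = message_bytes.decode("latin-1")
    # Encode for the console, escaping whatever the console codec lacks.
    encoded = as_chars.encode(console_encoding, "backslashreplace")
    # Decode again to get the text a correctly configured console would show.
    return encoded.decode(console_encoding)

text = u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
shown_utf8 = render_like_traceback(text.encode("utf-8"))
shown_ms936 = render_like_traceback(text.encode("gbk"))
```

If the guess is right, shown_utf8 and shown_ms936 should match the escaped text in the two tracebacks, while print succeeds only because it sends the ms936 bytes to the console untouched.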
History
Date  User  Action  Args
2018-03-17 06:57:13  jeff.allen  set  messageid: <1521269833.52.0.467229070634.issue1830@psf.upfronthosting.co.za>
2018-03-17 06:57:13  jeff.allen  set  recipients: + jeff.allen, fwierzbicki, zyasoft, sqxu, roskakori
2018-03-17 06:57:13  jeff.allen  link  issue1830 messages
2018-03-17 06:57:12  jeff.allen  create