Issue1830

classification
Title: assert and print behave differently in rendering non-ascii text
Type: behaviour
Severity: normal
Components: Core
Versions: Jython 2.7, Jython 2.5
Milestone:
process
Status: open
Resolution: accepted
Dependencies:
Superseder:
Assigned To:
Nosy List: fwierzbicki, jeff.allen, roskakori, sqxu, zyasoft
Priority: normal
Keywords:

Created on 2012-01-06.05:45:12 by sqxu, last changed 2018-03-17.06:57:13 by jeff.allen.

Messages
msg6754 Author: Sheng qiang xu (sqxu) Date: 2012-01-06.05:45:11
I am using Jython 2.5.1 and see strange behavior from assert and print. Please see the following lines.

1.  assert False == True, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"

   Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
   Error in sys.excepthook:
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-6: ordinal not in range(128)

2. print u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"

   output:  仮想イメージ「

3. assert False == True, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('utf8')

   Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
   AssertionError: 仮想イメージ「

Cases 2 and 3 show the message correctly, but case 1 does not. Do assert and print use different mechanisms to display the message?
BTW, my sys.stdout.encoding is UTF-8.
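
The UnicodeEncodeError in case 1 matches what the ascii codec reports when asked to encode this string. That suggests (as an assumption, not something confirmed in this report) that the message is being encoded as ASCII somewhere on the traceback path. A minimal sketch reproducing just the codec behaviour, with `try_encode` as a hypothetical helper name:

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the codec error from case 1, not Jython's internals.
message = u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"


def try_encode(text, encoding):
    """Hypothetical helper: return (encoded_bytes, None) or (None, error)."""
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as err:
        return None, err


# The ascii codec rejects all seven characters (positions 0-6).
ascii_bytes, ascii_err = try_encode(message, "ascii")

# UTF-8 can represent the text, which is consistent with the pre-encoded
# message in case 3 displaying correctly on a UTF-8 console.
utf8_bytes, utf8_err = try_encode(message, "utf-8")
```

Here `ascii_err.reason` carries the same "ordinal not in range(128)" text seen in the traceback above, while the UTF-8 attempt succeeds.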
msg6755 Author: Frank Wierzbicki (fwierzbicki) Date: 2012-01-06.17:57:21
Confirmed on trunk. In fact the trouble is that exceptions can't handle unicode in the message.
msg8702 Author: Jim Baker (zyasoft) Date: 2014-06-19.04:46:27
Interestingly, this does not work on CPython 2.7, but does on CPython 3.4.

Should be an easy fix.

Target beta 4
msg8889 Author: Thomas Aglassinger (roskakori) Date: 2014-07-26.08:45:08
For the record: in CPython 2.7.6 under Ubuntu 14.04 with OpenJDK 1.7.0_55 you get an AssertionError without a message:

  >>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  AssertionError

In Python 3.4.0 you get an AssertionError with unicode characters.
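
Even when the traceback shows a bare AssertionError, the unicode message should still be attached to the exception object; as far as I can tell it is only the conversion to text for display that gives up. A small sketch that should run unchanged on Python 2.7 and 3. It uses `raise AssertionError(message)`, which is what a failing `assert False, message` amounts to when assertions are enabled:

```python
message = u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"

try:
    # Equivalent to a failing "assert False, message".
    raise AssertionError(message)
except AssertionError as exc:
    # The original unicode object is kept in exc.args, whatever the
    # traceback printer later manages to render.
    captured = exc.args[0]
```

So code that catches the exception can still read the full message from `exc.args`, independent of how the interpreter prints it.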
msg11818 Author: Jeff Allen (jeff.allen) Date: 2018-03-17.06:57:12
With Jython 2.7.2a1 I get something like the opposite behaviour:

>>> sys.stderr.encoding, sys.stderr.errors
('ms936', 'backslashreplace')
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: 仮想イメージ「
>>> print u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"
仮想イメージ「
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: \xe4\xbb\xae\xe6\x83\xb3\xe3\x82¤\xe3\x83\xa1\xe3\x83\xbc\xe3\x82\xb8\xe3\x80\x8c

I wouldn't expect a good result from emitting utf-8 bytes onto an ms936 console, but this is not much better:
>>> assert False, u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('ms936')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: \x81\xa2\xcf\xeb\xa5¤\xa5á\xa9`\xa5\xb8\xa1\xb8

Since we can cope with:
>>> print u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c".encode('ms936')
仮想イメージ「
I think error message generation is defending itself too well against the decoding problems that replace the message you wanted with one about codecs.
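
For reference, the 'backslashreplace' error handler reported for sys.stderr above escapes whatever the target encoding cannot represent instead of raising. A rough sketch of that behaviour using plain codec calls (`render` is a hypothetical helper, and this is not a claim about how Jython builds its traceback text):

```python
message = u"\u4eee\u60f3\u30a4\u30e1\u30fc\u30b8\u300c"


def render(text, encoding, errors="backslashreplace"):
    """Hypothetical helper: encode text for a stream that has the given
    encoding and error handler."""
    return text.encode(encoding, errors)


# ASCII cannot represent any of the characters, so each one becomes
# a \uXXXX escape rather than triggering UnicodeEncodeError.
escaped = render(message, "ascii")

# UTF-8 can represent all of them, so nothing is escaped.
utf8_bytes = render(message, "utf-8")
```

With this handler the reader gets either the real characters or a readable escape, never a codec error in place of the message.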
History
Date                 User         Action  Args
2018-03-17 06:57:13  jeff.allen   set     nosy: + jeff.allen
                                          title: assert and print behave differently to display the message in jython 2.5.1 -> assert and print behave differently in rendering non-ascii text
                                          messages: + msg11818
                                          versions: + Jython 2.7
2014-10-06 03:26:46  zyasoft      set     priority: high -> normal
2014-07-26 08:45:08  roskakori    set     nosy: + roskakori
                                          messages: + msg8889
2014-06-19 04:46:27  zyasoft      set     nosy: + zyasoft
                                          messages: + msg8702
2013-02-25 18:41:20  fwierzbicki  set     versions: + Jython 2.5, - 2.5.1
2012-01-06 17:58:17  fwierzbicki  set     components: + Core, - Any
2012-01-06 17:58:06  fwierzbicki  set     type: behaviour
2012-01-06 17:57:21  fwierzbicki  set     priority: high
                                          resolution: accepted
                                          messages: + msg6755
                                          nosy: + fwierzbicki
2012-01-06 05:45:12  sqxu         create