Issue1808214

classification
Title: Unicode strings, 2.2.1RC2
Type:
Severity: normal
Components: Core
Versions:
Milestone:
process
Status: closed
Resolution: invalid
Dependencies:
Superseder:
Assigned To:
Nosy List: cgroves, sysrmn
Priority: normal
Keywords:

Created on 2007-10-05.17:11:55 by sysrmn, last changed 2007-10-05.17:26:40 by cgroves.

Messages
msg1965 Author: Marcel Nepveu (sysrmn) Date: 2007-10-05.17:11:55
The following works with Jython 2.2 and 2.1, but is broken in 2.2.1:

Jython 2.2.1rc2 on java1.5.0_07
Type "copyright", "credits" or "license" for more information.
>>> print u"Num\u00e9ro requis."
Traceback (innermost last):
   File "<console>", line 1, in ?
UnicodeError: ascii encoding error: ordinal not in range(128)

It also breaks Reportlab. While printing an invoice containing  
accented characters, the following message is displayed:
   ...
   File "N:\DEV\TEST\reportlab-1.19.jar\reportlab/platypus/tables.py", line 357, in _calc_height
UnicodeError: ascii encoding error: ordinal not in range(128)

Lines 356 and 357 are:
                             if t is not StringType:
                                 v = v is None and '' or str(v)
msg1966 Author: Charlie Groves (cgroves) Date: 2007-10-05.17:26:40
It was actually a bug that this worked in earlier versions.  Jython would blindly dump a unicode object out to print when it should have run it through the default encoding (see sys.getdefaultencoding()) to turn it into a str first.  This doesn't work for me on CPython either:

Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print u"Num\u00e9ro requis."
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 3: ordinal not in range(128)
>>>

It may work for you in CPython because your default encoding there is different.  You can do the same thing for Jython: edit your site.py to set the default encoding to utf-8, or to whatever it is in your CPython install.
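
The site.py change could look roughly like the sketch below. It assumes Jython's site.py follows the CPython 2.x pattern, where sys.setdefaultencoding is only available during startup and is removed once site.py has run; the hasattr guard is an addition for illustration so the snippet also runs on interpreters that lack the function.

```python
import sys

# Assumption: this runs inside site.py (or sitecustomize.py). In CPython 2.x,
# site.py deletes sys.setdefaultencoding after startup, so calling it from
# ordinary application code normally fails.
if hasattr(sys, "setdefaultencoding"):
    sys.setdefaultencoding("utf-8")

# Check which encoding implicit unicode -> str conversions will use.
default_encoding = sys.getdefaultencoding()
```

Changing the default encoding affects every implicit conversion in the process, so an explicit encode at the point of output is usually the safer choice.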

I imagine Reportlab is expecting to have a defaultencoding that can handle unicode, or that you would handle the encoding on your own with an explicit encode call beforehand.
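
A minimal sketch of the explicit encode mentioned above, using the string from the report. The variable names are illustrative only and do not come from Reportlab or Jython.

```python
text = u"Num\u00e9ro requis."

# Converting with the ASCII codec fails on the accented character,
# which is what the implicit conversion in print and str() runs into.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 succeeds and gives a byte string
# that can be printed or handed to code expecting a plain str.
encoded = text.encode("utf-8")
```

The same idea applies before passing values to a library that calls str() on them: encode the unicode value first, with an encoding the output device or file format actually expects.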
History
Date User Action Args
2007-10-05 17:11:55 sysrmn create