Message1457
When you join a list of unicode strings using a normal string as the separator (e.g. "x = ' '.join([u'a',u'b'])"), Jython returns a normal string instead of a unicode string. If the separator is itself a unicode string, the result is unicode as expected. If you later use the string returned in the former case as a unicode string (e.g. "unicode(x)"), you get a UnicodeError whenever the original unicode strings contained non-ASCII characters. In CPython you get a unicode string in both cases, as you would expect.
See the examples below, run with Jython 2.2b1 and Python 2.4.3. Jython 2.2a1 behaves differently because of http://jython.org/bugs/1538001, which is fixed in the beta.
Jython 2.2b1 on java1.5.0_10 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> ul = [u'Hyv\u00E4',u'Good']
>>>
>>> x = ' '.join(ul)
>>> type(x)
<type 'str'>
>>> unicode(x)
Traceback (innermost last):
File "<console>", line 1, in ?
UnicodeError: ascii decoding error: ordinal not in range(128)
>>>
>>> y = u' '.join(ul)
>>> type(y)
<type 'unicode'>
>>> unicode(y)
u'Hyv\xE4 Good'
>>>
Python 2.4.3 (#1, May 18 2006, 07:40:45)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> ul = [u'Hyv\u00E4',u'Good']
>>>
>>> x = ' '.join(ul)
>>> type(x)
<type 'unicode'>
>>> unicode(x)
u'Hyv\xe4 Good'
>>>
>>> y = u' '.join(ul)
>>> type(y)
<type 'unicode'>
>>> unicode(y)
u'Hyv\xe4 Good'
>>>
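Until this is fixed, one possible workaround is to make sure the separator is a unicode string before joining, since the Jython session above shows that a unicode separator gives the expected result. The following is only a minimal sketch: the names join_text and text_type are hypothetical helpers, not part of Jython or CPython, and it assumes that a non-unicode separator contains only ASCII characters.

```python
# Hypothetical helper, not part of Jython or CPython.
try:
    text_type = unicode   # Python 2 / Jython 2.x
except NameError:
    text_type = str       # Python 3, where str is already the text type


def join_text(sep, items):
    """Join items with sep, coercing sep to the text type first.

    Assumes a non-text separator is plain ASCII.
    """
    if not isinstance(sep, text_type):
        sep = sep.decode('ascii')
    return sep.join(items)


ul = [u'Hyv\u00E4', u'Good']
x = join_text(' ', ul)   # the separator is coerced before the join
```

With this, "unicode(x)" should no longer need to decode a byte string, because the join goes through the unicode separator path that already works in the session above.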
Date | User | Action | Args
2008-02-20 17:17:43 | admin | link | issue1659819 messages
2008-02-20 17:17:43 | admin | create |