Message1697
At least when decoding an invalid UTF-8 byte to unicode, you get back a unicode object whose repr has a weird 'uu' prefix. The session below illustrates this.
Jython 2.2rc1 on java1.5.0_11
Type "copyright", "credits" or "license" for more information.
>>> u = '\xFF'.decode('utf-8', 'replace')
>>> u
uu'\uFFFD'
>>> type(u)
<type 'unicode'>
>>> print u
?
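The behavior I would expect can be written down as a small check. This is only a sketch, not a test from the Jython test suite. The variable names are made up for illustration, and the b'' literal assumes Python 2.6 or later, so on Jython 2.2 the b prefix would have to be dropped.

```python
# Decode a byte that is never valid in UTF-8, replacing it with U+FFFD.
# (b'' literals need Python 2.6+; on older interpreters use '\xff'.)
decoded = b'\xff'.decode('utf-8', 'replace')

# The decoded value should be exactly one replacement character.
is_replacement_char = (decoded == u'\ufffd')

# The repr should not start with the doubled 'uu' prefix reported above.
has_doubled_prefix = repr(decoded).startswith('uu')

print("replacement char: %s, doubled prefix: %s"
      % (is_replacement_char, has_doubled_prefix))
```

The check looks at the repr rather than the value because the decoded object itself appears to be fine in the session above (its type is unicode); only the way it is displayed looks wrong.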
There is also some discussion of this on the Jython users mailing list from the beginning of July 2007, in a sub-thread of the "character encoding issues" thread. The following link should point to my mail about it.
http://sourceforge.net/mailarchive/message.php?msg_name=f5f747f10707020428t479239cdsa139465fffdfc87%40mail.gmail.com
| Date | User | Action | Args |
| 2008-02-20 17:17:52 | admin | link | issue1746957 messages |
| 2008-02-20 17:17:52 | admin | create | |