Issue1746957

classification
Title: Weird 'uu' prefix for unicode
Type:
Severity: normal
Components: None
Versions:
Milestone:
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: cgroves, pekka.klarck
Priority: normal
Keywords:

Created on 2007-07-03.08:01:13 by pekka.klarck, last changed 2007-07-08.00:09:17 by cgroves.

Messages
msg1697 Author: Pekka Klärck (pekka.klarck) Date: 2007-07-03.08:01:13
At least when decoding an invalid UTF-8 byte to unicode, you get back a unicode object whose repr has a weird 'uu' prefix. The example below illustrates this.

Jython 2.2rc1 on java1.5.0_11
Type "copyright", "credits" or "license" for more information.
>>> u = '\xFF'.decode('utf-8', 'replace')
>>> u
uu'\uFFFD'
>>> type(u)
<type 'unicode'>
>>> print u
?
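
For comparison, the expected behaviour can be written down as a small check. This is only a sketch: `decode_replacing` and `has_doubled_prefix` are hypothetical helper names, not anything in Jython, and the `b'...'` literal assumes a CPython 2.6+ or 3.3+ interpreter (on Jython 2.2 the byte string would be written as `'\xFF'`, as in the session above).

```python
# Sketch only: both helpers are hypothetical names used for illustration.

def decode_replacing(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid input."""
    return raw.decode('utf-8', 'replace')


def has_doubled_prefix(value):
    """Return True if repr(value) starts with the doubled 'uu' prefix."""
    return repr(value).startswith('uu')


decoded = decode_replacing(b'\xff')

# A single invalid byte should become exactly one replacement character.
assert decoded == u'\ufffd'
assert len(decoded) == 1

# The repr should carry at most one 'u' prefix, never 'uu'.
assert not has_doubled_prefix(decoded)
```

The check looks only at the repr, since `type(u)` and the value itself were already correct in the session above.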

There's also some discussion about this on the Jython users mailing list from the beginning of July 2007, in a sub-thread of the "character encoding issues" thread. The following link should point to my mail about it.

http://sourceforge.net/mailarchive/message.php?msg_name=f5f747f10707020428t479239cdsa139465fffdfc87%40mail.gmail.com
msg1698 Author: Charlie Groves (cgroves) Date: 2007-07-08.00:09:17
Fixed in r3285.
History
Date User Action Args
2007-07-03 08:01:13 pekka.klarck create