Message4625
This line of code from the html5lib trunk file inputstream.py:
invalid_unicode_re = re.compile(u"[\u0001-\u0008\u000B\u000E-\u001F\u007F-\u009F\uD800-\uDFFF\uFDD0-\uFDEF\uFFFE\uFFFF\U0001FFFE\U0001FFFF\U0002FFFE\U0002FFFF\U0003FFFE\U0003FFFF\U0004FFFE\U0004FFFF\U0005FFFE\U0005FFFF\U0006FFFE\U0006FFFF\U0007FFFE\U0007FFFF\U0008FFFE\U0008FFFF\U0009FFFE\U0009FFFF\U000AFFFE\U000AFFFF\U000BFFFE\U000BFFFF\U000CFFFE\U000CFFFF\U000DFFFE\U000DFFFF\U000EFFFE\U000EFFFF\U000FFFFE\U000FFFFF\U0010FFFE\U0010FFFF]")
won't compile under Jython 2.5b3:
Sorry: UnicodeDecodeError: ('unicodeescape',
'u"[\\u0001-\\u0008\\u000B\\u000E-\\u001F\\u007F-\\u009F\\uD800-\\uDFFF\\uFDD0-\\uFDEF\\uFFFE\\uFFFF\\U0001FFFE\\U0001FFFF\\U0002FFFE\\U0002FFFF\\U0003FFFE\\U0003FFFF\\U0004FFFE\\U0004FFFF\\U0005FFFE\\U0005FFFF\\U0006FFFE\\U0006FFFF\\U0007FFFE\\U0007FFFF\\U0008FFFE\\U0008FFFF\\U0009FFFE\\U0009FFFF\\U000AFFFE\\U000AFFFF\\U000BFFFE\\U000BFFFF\\U000CFFFE\\U000CFFFF\\U000DFFFE\\U000DFFFF\\U000EFFFE\\U000EFFFF\\U000FFFFE\\U000FFFFF\\U0010FFFE\\U0010FFFF]"',
48, 55, 'illegal Unicode character')
It looks like Jython (via Java) is enforcing valid Unicode in the
literal, while standard Python is not.
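To narrow down which escape triggers the error, the offsets in the traceback can be checked against the quoted literal. The sketch below assumes that 48 and 55 are start and end indices into the quoted source text, as they are for CPython's UnicodeDecodeError (encoding, object, start, end, reason); whether Jython counts them the same way is an assumption. The variable names are illustrative only.

```python
# Start of the literal's source text, copied from the error message.
# A raw string keeps the backslash escapes as plain characters.
literal_source = r'u"[\u0001-\u0008\u000B\u000E-\u001F\u007F-\u009F\uD800-\uDFFF\uFDD0-\uFDEF\uFFFE\uFFFF'

# Offsets reported in the traceback (assumed to be start/end indices).
start, end = 48, 55

# Slice out the text the decoder complained about.
offending = literal_source[start:end]
print(offending)
```

Under that assumption the slice begins at the \uD800 escape, the first code point of the surrogate range, which would be consistent with the literal parser rejecting a lone surrogate.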
Date | User | Action | Args
2009-05-01 13:12:40 | dmbaggett | set | recipients: + dmbaggett
2009-05-01 13:12:39 | dmbaggett | set | messageid: <1241183559.86.0.907259040796.issue1335@psf.upfronthosting.co.za>
2009-05-01 13:12:38 | dmbaggett | link | issue1335 messages
2009-05-01 13:12:37 | dmbaggett | create |