Message8618
Got it. The parser uses a Java codec, so a literal string has already been decoded from the console by the Java x-mswin-936 codec. But a literal string should contain the bytes equivalent to it in the input encoding. So the parser has to reverse that decoding, and it was trying to do so with the (non-existent) Python codec of the same name. Using the Java codec for the reverse step too is more respectable, and it fixes the hang on input.
>dist\bin\jython -Dpython.console=
Jython 2.7b3+ (default:6cee6fef06f0+, Jun 9 2014, 23:22:52)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_51
Type "help", "copyright", "credits" or "license" for more information.
>>> "xx"
'xx'
>>> "畫蛇添足"
'\xae\x8b\xc9\xdf\xcc\xed\xd7\xe3'
>>> u"畫蛇添足"
u'\u756b\u86c7\u6dfb\u8db3'
>>> exit()
This doesn't work with the default JLineConsole as that seems to have no idea about multibyte characters.
Output is still failing, as that really does need the codecs from #1066.
I'll push this small change after tests, and then think how to avoid the non-Python name "x-mswin-936".
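The round trip described at the top of this message can be sketched in plain Python. This is only an illustration, not the parser code: it assumes Python's "gbk" codec as a stand-in for Java's x-mswin-936 (the two are closely related but not guaranteed identical for every character), and the variable names are made up for the example.

```python
# Sketch of the decode/re-encode round trip for a byte-string literal.
# Assumption: Python's "gbk" codec stands in for Java's x-mswin-936.
CONSOLE_ENCODING = "gbk"

# The characters typed at the prompt, written as escapes so that this
# file does not depend on its own source encoding.
text = u"\u756b\u86c7\u6dfb\u8db3"

# What the console delivers: bytes in the console encoding.
console_bytes = text.encode(CONSOLE_ENCODING)

# What the parser sees: the reader has already decoded those bytes.
decoded = console_bytes.decode(CONSOLE_ENCODING)

# What a plain (byte) string literal must hold: the original bytes,
# recovered by encoding again with the same codec.
literal_bytes = decoded.encode(CONSOLE_ENCODING)

print(repr(literal_bytes))
```

The point is that the reverse step has to use the same codec as the reader; if that codec cannot be found under the name the parser asks for, the literal cannot be rebuilt.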
On the wrapping issue, Jim: if someone defined a codec in Python and then used it as the source encoding, it would be necessary to be able to create a Java codec from it, since the parser has to use it as the decoder in a Reader. In the present design, that is.
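To make the scenario concrete, here is a pure-Python analogue of "a codec defined in Python, used to read source text". It does not show the Java wrapping itself. The codec name "demo_936" and the search function are hypothetical, and the codec simply delegates to the built-in "gbk" codec.

```python
import codecs
import io


def _search(name):
    # Hypothetical codec, registered from Python only for this sketch.
    if name == "demo_936":
        base = codecs.lookup("gbk")
        return codecs.CodecInfo(
            name="demo_936",
            encode=base.encode,
            decode=base.decode,
            incrementalencoder=base.incrementalencoder,
            incrementaldecoder=base.incrementaldecoder,
            streamreader=base.streamreader,
            streamwriter=base.streamwriter,
        )
    return None


codecs.register(_search)

# Source text stored as bytes in the Python-defined encoding.
source_bytes = u'x = u"\u86c7"\n'.encode("demo_936")

# A parser wants a decoding reader over those bytes. In CPython the
# codec registry supplies one directly; in the design discussed here
# the equivalent would be a Java Reader, which needs a Java codec.
reader = codecs.getreader("demo_936")(io.BytesIO(source_bytes))
source_text = reader.read()

print(repr(source_text))
```

In CPython the registry hands back a stream reader for any registered codec, which is the role the Java Reader plays in the parser, so a codec that exists only on the Python side would need a wrapper that presents it as a Java codec.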
Date | User | Action | Args
2014-06-09 23:03:26 | jeff.allen | set | messageid: <1402355006.53.0.238796594888.issue2123@psf.upfronthosting.co.za>
2014-06-09 23:03:26 | jeff.allen | set | recipients: + jeff.allen, zyasoft, rpan
2014-06-09 23:03:26 | jeff.allen | link | issue2123 messages
2014-06-09 23:03:26 | jeff.allen | create |