Message8640
Fixed in http://hg.python.org/jython/rev/6c718e5e9ae9, to the extent possible, by using java.nio.charset.Charset. The following codecs are still not available, more or less the set Philip identified in msg3880:
euc_jis_2004
euc_jisx0213
hz
iso2022_jp_1
iso2022_jp_2004
iso2022_jp_3
iso2022_jp_ext
shift_jis_2004
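For anyone who wants to check a given interpreter, here is a minimal sketch of a probe for the names above. The helper name missing_codecs is only illustrative and is not part of the commit; it relies on codecs.lookup() raising LookupError for an unknown encoding.

```python
import codecs


def missing_codecs(names):
    """Return the subset of names for which codecs.lookup() fails."""
    missing = []
    for name in names:
        try:
            codecs.lookup(name)
        except LookupError:
            # No codec is registered under this name on this interpreter.
            missing.append(name)
    return missing


# The codecs listed in this message.
CANDIDATES = [
    "euc_jis_2004", "euc_jisx0213", "hz", "iso2022_jp_1",
    "iso2022_jp_2004", "iso2022_jp_3", "iso2022_jp_ext", "shift_jis_2004",
]

print(missing_codecs(CANDIDATES))
```

The result depends on the interpreter and, on Jython, on which charsets the underlying JVM provides.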
hz could potentially be supported by preprocessing: it encodes each GB2312 character as two 7-bit bytes, with the two-byte runs delimited by the escapes ~{...~}. ICU4J might help as well.
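To make the preprocessing idea concrete, here is a rough sketch, not what the commit does. It rewrites HZ bytes as 8-bit GB2312 (EUC-CN) by setting the high bit on every byte between ~{ and ~}, then hands the result to a gb2312 codec. The function names hz_to_euc_cn and decode_hz are hypothetical, and the sketch assumes a gb2312 codec is available (on Jython that would have to come from the Java charset). It also handles the ~~ and ~ plus newline escapes of the HZ format and is strict about anything else.

```python
def hz_to_euc_cn(data):
    """Rewrite HZ-encoded bytes as EUC-CN (8-bit GB2312) bytes."""
    data = bytearray(data)
    out = bytearray()
    in_gb = False  # True while between ~{ and ~}
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        nxt = data[i + 1] if i + 1 < n else None
        if in_gb:
            if b == 0x7E and nxt == 0x7D:    # "~}" leaves GB mode
                in_gb = False
            elif nxt is None:
                raise ValueError("truncated two-byte sequence")
            else:                            # one GB2312 character
                out.append(b | 0x80)
                out.append(nxt | 0x80)
            i += 2
        elif b == 0x7E and nxt is not None:  # escape in ASCII mode
            if nxt == 0x7B:                  # "~{" enters GB mode
                in_gb = True
            elif nxt == 0x7E:                # "~~" is a literal tilde
                out.append(0x7E)
            elif nxt == 0x0A:                # "~\n" is a line continuation
                pass
            else:
                raise ValueError("invalid escape sequence")
            i += 2
        else:
            # Plain ASCII (a lone trailing "~" is passed through as-is).
            out.append(b)
            i += 1
    return bytes(out)


def decode_hz(data):
    """Decode HZ bytes via the preprocessing step (assumes a gb2312 codec)."""
    return hz_to_euc_cn(data).decode("gb2312")
```

For example, decode_hz(b"~{VPND~}") goes through the bytes D6 D0 CE C4 before the gb2312 decode. An incremental version would additionally need to carry the in_gb state and any pending byte across calls.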
We also potentially gain other encodings, such as cp1047 (EBCDIC), as needed by http://bugs.jython.org/issue550200.
The one remaining issue I see is a couple of minor corner cases in error reporting for trailing bytes when the decode is not final. It's not clear to me what can really be done about this, since it seems to be a property of the underlying decoder; at the very least our unit tests pick it up, so it's visible.
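For readers unfamiliar with the final flag, here is a small sketch of the behaviour an incremental decoder is normally expected to show, using shift_jis as an arbitrary multibyte example. The helper name decode_chunks is hypothetical, and this illustrates the general contract of codecs incremental decoders rather than the exact cases our tests flag: an incomplete trailing byte should be buffered while final is false and should only become an error once final is true.

```python
import codecs


def decode_chunks(chunks, encoding="shift_jis"):
    """Feed byte chunks to a strict incremental decoder, then finalize."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    for chunk in chunks:
        # Not final: an incomplete multibyte sequence at the end of a
        # chunk is held back rather than reported as an error.
        pieces.append(decoder.decode(chunk, final=False))
    # Final: any bytes still pending are now an error.
    pieces.append(decoder.decode(b"", final=True))
    return u"".join(pieces)
```

Here decode_chunks([b"\x82", b"\xa0"]) completes the character across two chunks, while decode_chunks([b"\x82"]) raises UnicodeDecodeError at the final step.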
Date | User | Action | Args
2014-06-14 00:46:29 | zyasoft | set | messageid: <1402706789.78.0.66327667621.issue1066@psf.upfronthosting.co.za>
2014-06-14 00:46:29 | zyasoft | set | recipients: + zyasoft, cgroves, fwierzbicki, pjenvey, yyamano, jeff.allen
2014-06-14 00:46:29 | zyasoft | link | issue1066 messages
2014-06-14 00:46:28 | zyasoft | create |