Message11227

Author jeff.allen
Recipients jeff.allen
Date 2017-03-14.07:48:32
Message-id <1489477712.73.0.00566124019931.issue2571@psf.upfronthosting.co.za>
Content
Aha! Python 3 agrees with Java 8:

>>> b'abc\x80\x80\xc1\xc4def'.decode('big5', 'replace')
'abc\ufffd\ufffd\u8b10def'

This is evidently the result of this change set:
https://hg.python.org/cpython/rev/16cbd84de848
and this issue:
http://bugs.python.org/issue12016
where, after a last-minute change of mind, the CPython developers decided not to back-port the fix to 2.7 and 3.2.

However, nothing in the Python documentation seems to guarantee one behaviour or the other. Given that we have good reasons for using the Java codec, I'll write a custom test for us that is either sensitive to the version or accepts both results.
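
A minimal sketch of the "accepts both" variant might look like the following. The names DATA, NEW_STYLE, OLD_STYLE, big5_replace_ok and Big5ReplaceTest are hypothetical, chosen only for illustration. NEW_STYLE is the result shown above; OLD_STYLE (a single replacement character for the two invalid bytes) is my assumption about what interpreters without the fix produce and should be checked against the actual 2.7 output before relying on it.

```python
import unittest

# Input from the example above: two invalid bytes followed by a valid
# Big5 pair (\xc1\xc4), surrounded by ASCII.
DATA = b'abc\x80\x80\xc1\xc4def'

# Result reported above for Python 3 (and Java 8): one U+FFFD per bad byte.
NEW_STYLE = u'abc\ufffd\ufffd\u8b10def'

# ASSUMPTION: result on interpreters without the fix, where the two
# invalid bytes collapse into a single U+FFFD.
OLD_STYLE = u'abc\ufffd\u8b10def'


def big5_replace_ok(decoded):
    """Return True if decoded matches either accepted behaviour."""
    return decoded in (NEW_STYLE, OLD_STYLE)


class Big5ReplaceTest(unittest.TestCase):
    """Hypothetical test case that tolerates both decoder behaviours."""

    def test_replace_accepts_either(self):
        decoded = DATA.decode('big5', 'replace')
        self.assertTrue(big5_replace_ok(decoded), repr(decoded))
```

A version-sensitive variant would instead pick exactly one expected string according to the interpreter in use, which is stricter but needs a reliable way to tell the two codec behaviours apart.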