Author jeff.allen
Recipients jeff.allen, stefan.richthofer, zyasoft
Date 2017-11-21.23:22:50
We now have fairly complete support for default encoding when mixing bytes and unicode, thanks to these two change sets:

People who call sys.setdefaultencoding should now have an experience closer to CPython. I'm hoping this will help with #2633 in these circumstances.
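
As a rough illustration of the coercion involved (a sketch, not code from the change sets): in Python 2, mixing str and unicode decodes the str using the default encoding. The helper concat_mixed below is hypothetical; it makes that decode explicit so the example also runs on Python 3, where the implicit coercion no longer exists.

```python
def concat_mixed(raw, text, default_encoding="ascii"):
    # Hypothetical helper: mimics what str + unicode does implicitly in
    # Python 2, i.e. decode the bytes with the default encoding, then join.
    return raw.decode(default_encoding) + text

raw = u"caf\u00e9".encode("utf-8")    # bytes containing a non-ASCII character

# With a UTF-8 default (as after sys.setdefaultencoding("utf-8")) this works.
joined = concat_mixed(raw, u" au lait", "utf-8")

# With the stock "ascii" default the same mix fails to decode.
try:
    concat_mixed(raw, u" au lait")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

The point is only that the outcome of a mixed operation depends on the default encoding in force, which is why calling sys.setdefaultencoding changes behaviour.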

Reading test_unicode_jy.DefaultDecodingTestCase gives a pretty good account of where Jython diverges from CPython, namely by doing more than CPython does. The test also passes on CPython 2.7.14, thanks to a few if-statements that test for Jython. I've made what I think is a reasonable compromise between CPython behaviour and consistency in comparisons and equality.

One loose end: str.find(unicode) returns an index into the decoded (unicode) string, not a byte offset in the original str. I think this is wrong, but it is what CPython does.
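
The difference between the two kinds of position can be seen in a small sketch. It assumes UTF-8 as the encoding and performs the decode explicitly so that it runs on Python 3 as well; in Python 2 the decode would happen implicitly when a str is searched for a unicode argument.

```python
raw = u"\u00e9x".encode("utf-8")   # b'\xc3\xa9x': the accented letter takes two bytes
text = raw.decode("utf-8")          # stand-in for the implicit coercion to unicode

byte_offset = raw.find(b"x")        # position counted in bytes of the original
char_index = text.find(u"x")        # position counted in characters after decoding

print(byte_offset, char_index)      # the two positions differ: 2 versus 1
```

Slicing the original byte string with the character index would land in the middle of the two-byte sequence, which is the practical hazard of the mismatch.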

Oh, and I may have broken shadowstring ... is there a test? I'd quite like to modify startswith.
Date                 User        Action  Args
2017-11-21 23:22:51  jeff.allen  set     messageid: <>
2017-11-21 23:22:51  jeff.allen  set     recipients: + jeff.allen, zyasoft, stefan.richthofer
2017-11-21 23:22:50  jeff.allen  link    issue2638 messages
2017-11-21 23:22:50  jeff.allen  create