Message9222

Author jeff.allen
Recipients Dolda2000, fwierzbicki, jeff.allen, zyasoft
Date 2014-11-30.08:58:50
Message-id <1417337931.1.0.311051156971.issue2037@psf.upfronthosting.co.za>
In-reply-to
Content
Issue #2234 reveals another example of "character smuggling" in the Jython str:
>>> from java.io import StringReader
>>> from org.python.core import PyFileReader
>>> r = StringReader(u"\u86c7\u541e\u8c61")
>>> pfr = PyFileReader(r)
>>> pfr.read(1)
'\u86c7'
>>> type(pfr.read(1))
<type 'str'>

The cost of scanning every character when a PyString is constructed nags a bit, but the scan does catch this kind of error, so it appears worthwhile. A constructor from byte[] (or ByteBuffer) that needs no checking would be handy when the client already has bytes, and is presumably the future.
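To make the trade-off concrete, here is a rough sketch in plain Python of the two construction paths discussed above. It is not Jython's PyString code: the function names check_bytes_only and from_bytes are invented for illustration, and the only assumption carried over is that a byte-oriented str may hold nothing above code point 0xFF.

```python
def check_bytes_only(chars):
    """Return chars unchanged if every code point fits in one byte.

    Hypothetical stand-in for the per-character scan at construction:
    it costs one pass over the input, but it rejects "smuggled"
    characters such as U+86C7 instead of letting them into a str.
    """
    for index, ch in enumerate(chars):
        if ord(ch) > 0xFF:
            raise ValueError(
                "character U+%04X at index %d cannot be stored in a byte string"
                % (ord(ch), index))
    return chars


def from_bytes(data):
    """Build the same kind of string directly from bytes.

    Hypothetical stand-in for a byte[]/ByteBuffer constructor: every
    byte maps to a code point in 0..255 (Latin-1), so no scan is needed.
    """
    return bytes(data).decode("latin-1")
```

With these definitions, check_bytes_only(u"\u86c7\u541e\u8c61") raises ValueError at index 0, while from_bytes(b"\x00\xff") can never fail the range check because its input is already bytes.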
History
Date User Action Args
2014-11-30 08:58:51  jeff.allen  set  messageid: <1417337931.1.0.311051156971.issue2037@psf.upfronthosting.co.za>
2014-11-30 08:58:51  jeff.allen  set  recipients: + jeff.allen, fwierzbicki, zyasoft, Dolda2000
2014-11-30 08:58:50  jeff.allen  link  issue2037 messages
2014-11-30 08:58:50  jeff.allen  create