Message9222

Author	jeff.allen
Recipients	Dolda2000, fwierzbicki, jeff.allen, zyasoft
Date	2014-11-30.08:58:50
Message-id	<1417337931.1.0.311051156971.issue2037@psf.upfronthosting.co.za>
In-reply-to

Content
Issue #2234 reveals another example of "character smuggling" in the Jython str:
>>> from java.io import StringReader
>>> from org.python.core import PyFileReader
>>> r = StringReader(u"\u86c7\u541e\u8c61")
>>> pfr = PyFileReader(r)
>>> pfr.read(1)
'\u86c7'
>>> type(pfr.read(1))
<type 'str'>

The cost of scanning every character during construction of a PyString nags a bit, but catching this kind of error (which it does) appears worthwhile. A constructor from byte[] (or ByteBuffer), which would need no checking, would be handy when the client already has bytes, and is presumably the future.
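
To illustrate the trade-off, here is a minimal Python 3 sketch. It is not Jython's actual code, and the names checked_str and from_bytes are hypothetical. The checked path scans every character and rejects any above 0xFF. The path that starts from bytes needs no scan, because every value is already in the range 0..255.

```python
def checked_str(s):
    """Return s unchanged if every character fits in one byte.

    Hypothetical stand-in for a checking constructor: the scan is O(len(s)).
    """
    for i, ch in enumerate(s):
        if ord(ch) > 0xFF:
            raise ValueError(
                "non-byte character U+%04X at index %d" % (ord(ch), i))
    return s


def from_bytes(b):
    """Hypothetical unchecked path: build a byte-like str from bytes.

    Latin-1 maps each byte 0..255 to the code point of the same value,
    so no character can fall outside the byte range.
    """
    return b.decode("latin-1")


if __name__ == "__main__":
    print(repr(checked_str("abc")))
    try:
        checked_str("\u86c7")  # the smuggled character from the example
    except ValueError as e:
        print("rejected:", e)
    print(repr(from_bytes("\u86c7".encode("utf-8"))))
```

Under these assumptions, a reader that produced u"\u86c7" would be caught by the scan, while a client that already holds the UTF-8 bytes could skip it.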

History
Date	User	Action	Args
2014-11-30 08:58:51	jeff.allen	set	messageid: <1417337931.1.0.311051156971.issue2037@psf.upfronthosting.co.za>
2014-11-30 08:58:51	jeff.allen	set	recipients: + jeff.allen, fwierzbicki, zyasoft, Dolda2000
2014-11-30 08:58:50	jeff.allen	link	issue2037 messages
2014-11-30 08:58:50	jeff.allen	create