Message9227

Author jeff.allen
Recipients jeff.allen
Date 2014-12-11.23:47:44
Message-id <1418341665.37.0.668127371237.issue2234@psf.upfronthosting.co.za>
In-reply-to
Content
OK, the behaviour of exec is exactly what PEP 263 requires:
>>> u
u'# Test encoding line\n# coding= iso-8859-15\nprint u"caf\xe9 du c\u0153ur"\n\n'
>>> exec(u)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 0
SyntaxError: encoding declaration in Unicode string

PythonInterpreter doesn't quite mirror this:

>>> from org.python.util import PythonInterpreter
>>> pi = PythonInterpreter()
>>> pi.exec(u)
café du c?ur

The reason is that both calls end up at PythonInterpreter.exec(String), which then treats the String as bytes.
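
As a rough illustration of what "treats the String as bytes" means, here is a sketch in plain CPython 3 rather than Jython. The helper name chars_as_bytes is made up for this example, and replacing unmappable characters with "?" is an assumption that happens to match the output above; it is not confirmed to be the actual code path inside PythonInterpreter.

```python
def chars_as_bytes(text):
    """Narrow each character to a single byte, as if char were byte.

    Hypothetical helper for illustration only. Characters above U+00FF
    have no single-byte form, so they are replaced with "?" (an
    assumption chosen to match the output shown in the message).
    """
    return text.encode("latin-1", errors="replace")


line = u"caf\xe9 du c\u0153ur"
narrowed = chars_as_bytes(line)

# U+00E9 fits in one byte and survives; U+0153 does not and is lost.
print(narrowed.decode("latin-1"))
```

The point of the sketch is only that U+00E9 survives the narrowing while U+0153 cannot, which is consistent with the "c?ur" seen from pi.exec(u).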

Much of this code accepts either a String or a Reader, but in a few places it quietly assumes that a char is a byte. Methods that accept an InputStream are unambiguously dealing with bytes, but it's not entirely clear how the encoding is remembered and used. I'm extending test_pythoninterpreter_jy to cover a range of non-ASCII cases.
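
For comparison, the bytes case can be sketched in CPython 3 (again, not Jython, and not the PythonInterpreter API): when the source arrives as bytes, the PEP 263 declaration is what tells the compiler how to decode it. The variable names here are illustrative, and the print statement from the original test string is swapped for an assignment so the result can be inspected.

```python
# Same shape as the test string above, with an assignment instead of print.
source = u'# Test encoding line\n# coding= iso-8859-15\ntext = u"caf\xe9 du c\u0153ur"\n'

# Encode with the encoding the declaration names, so bytes and cookie agree.
as_bytes = source.encode("iso-8859-15")

namespace = {}
exec(as_bytes, namespace)  # bytes input: the coding line drives the decoding

# The declared encoding recovers both non-ASCII characters.
print(namespace["text"] == u"caf\xe9 du c\u0153ur")

# Decoding the same bytes one char per byte (latin-1) gives different text,
# which is why the encoding has to be remembered along with the bytes.
print(as_bytes.decode("latin-1") == source)
```

In this sketch the bytes and the declaration together are enough to recover the text, whereas a one-char-per-byte reading of the same bytes gives a different string.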