Message2022
Hi,
My understanding of PEP 263 is that a "coding: utf-8" declaration in the
first or second line should determine how .py files are read.
Alas, the PEP says "Python-Version: 2.3",
whereas Jython 2.2 is documented as corresponding to Python 2.2.
http://www.python.org/dev/peps/pep-0263/
So this may be a feature request rather than a bug.
How can I use UTF-8 umlauts in my .py files with Jython?
# foo.py -*- coding: utf-8 -*- http://www.python.org/peps/pep-0263.html
inlineds = "zäöü!"
inlinedu = u"zäöü!"
explicits= "z\u00e4\u00f6\u00fc!"
explicitu= u"z\u00e4\u00f6\u00fc!"
all4=[inlineds,inlinedu,explicits,explicitu]
print all4, [len(s) for s in all4]
On a Red Hat 5 system this produces:
['z\xC3\xA4\xC3\xB6\xC3\xBC!', u'z\xC3\xA4\xC3\xB6\xC3\xBC!', 'z\\u00e4\\u00f6\\u00fc!', u'z\xE4\xF6\xFC!'] [8, 8, 20, 5]
Jython 2.2 on java1.6.0_05-ea
uname -a
Linux foo.xy 2.6.9-55.0.9.ELsmp #1 SMP Tue Sep 25 02:16:15 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
LANG=de_DE@UTF-8
Debian produces expected results:
['z\xE4\xF6\xFC!', u'z\xE4\xF6\xFC!', 'z\\u00e4\\u00f6\\u00fc!', u'z\xE4\xF6\xFC!'] [5, 5, 20, 5]
Jython 2.2 on java1.6.0_02
uname -a
Linux debianbasic 2.6.18-5-686 #1 ... i686 GNU/Linux
LANG=de_DE.UTF-8
However, even on the Debian system, changing $LANG changes the output:
LANG=C ./jython.sh foo.py
[u'z\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD!', u'z\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD!', 'z\\u00e4\\u00f6\\u00fc!', u'z\xE4\xF6\xFC!'] [8, 8, 20, 5]
Everything behaves as if Jython read the .py file using Java's default
encoding (which is influenced by $LANG but, AFAIK, cannot be set directly).
Either of
java.nio.charset.Charset.defaultCharset()
java.io.OutputStreamWriter(java.io.ByteArrayOutputStream()).getEncoding()
yields Java's default encoding.
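To illustrate why the reported lengths fit that hypothesis, here is a minimal sketch that decodes the UTF-8 bytes of the literal with three different charsets. It is a simulation only, written for a current CPython 3 rather than Jython 2.2, and it assumes that LANG=C makes Java's default charset US-ASCII and that undecodable bytes are replaced by U+FFFD. The names SOURCE_BYTES and read_as are made up for this example.

```python
# Simulation only: this does not run Jython, it just decodes the same bytes
# that a UTF-8 encoded foo.py would contain for the literal "zäöü!".
SOURCE_BYTES = "z\u00e4\u00f6\u00fc!".encode("utf-8")  # 8 bytes on disk


def read_as(charset):
    """Decode the literal's bytes the way a reader using `charset` would."""
    # errors="replace" substitutes U+FFFD for every byte the charset rejects.
    return SOURCE_BYTES.decode(charset, errors="replace")


as_utf8 = read_as("utf-8")         # what PEP 263 would ask for
as_latin1 = read_as("iso-8859-1")  # one character per byte
as_ascii = read_as("ascii")        # assumed charset under LANG=C

for name, text in [("utf-8", as_utf8),
                   ("iso-8859-1", as_latin1),
                   ("ascii", as_ascii)]:
    print(name, ascii(text), len(text))
```

The three cases give lengths 5, 8 and 8, matching the Debian result, the Red Hat result and the LANG=C result respectively.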
I've now installed 2.2.1 and the results change, although still
not satisfactorily. The Debian system now always yields:
['z\xC3\xA4\xC3\xB6\xC3\xBC!', u'z\xC3\xA4\xC3\xB6\xC3\xBC!', 'z\\u00e4\\u00f6\\u00fc!', u'z\xE4\xF6\xFC!'] [8, 8, 20, 5]
like Red Hat before, regardless of $LANG.
Thus Jython 2.2.1 seems to strictly assume ISO-8859-1 in .py files. At least the 2.2.1 behaviour is consistent between the
Red Hat and Debian systems I tested.
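If the file really is read as ISO-8859-1, every UTF-8 byte becomes exactly one character, so no information is lost and the intended text can in principle be recovered. The sketch below shows that round trip, and also why the \u-escaped unicode literal (explicitu above) is unaffected: its source text is pure ASCII. This is CPython 3 code, not something I have run on Jython, and repair is a hypothetical helper name.

```python
def repair(misread):
    """Undo a 'UTF-8 bytes read as ISO-8859-1' misdecoding (hypothetical helper)."""
    # Step 1: map each character back to the single byte it came from.
    # Step 2: decode those bytes with the encoding the file was written in.
    return misread.encode("iso-8859-1").decode("utf-8")


# What inlinedu looks like under the 2.2.1 behaviour: 8 characters.
misread = "z\u00c3\u00a4\u00c3\u00b6\u00c3\u00bc!"

# Like explicitu: only ASCII characters appear in the source file itself.
escaped = "z\u00e4\u00f6\u00fc!"

repaired = repair(misread)
print(len(misread), len(repaired), repaired == escaped)
```

Writing umlauts as \u escapes inside u"..." literals sidesteps the source encoding entirely, which is consistent with explicitu having length 5 in every run above.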
Regards,
Jörg Höhle

Date | User | Action | Args
2008-02-20 17:18:07 | admin | link | issue1840479 messages
2008-02-20 17:18:07 | admin | create |