Title: Unable to import optparse with non-ASCII character in LANGUAGE
Type:
Severity: normal
Components: Library
Versions: Jython 2.7
Status: open
Resolution: accepted
Dependencies:
Superseder:
Assigned To:
Nosy List: fwierzbicki, gsnedders, zyasoft
Priority: low
Keywords:

Created on 2013-04-07.15:44:11 by gsnedders, last changed 2015-03-10.18:48:57 by zyasoft.

msg7989 Author: (gsnedders) Date: 2013-04-07.15:44:10
Trying 2.7b1, I get:

gsnedders@vanveen:~$ ~/local/jython2.7b1/jython 
Jython 2.7b1 (default:ac42d59644e9, Feb 9 2013, 15:24:52) 
[OpenJDK 64-Bit Server VM (Oracle Corporation)] on java1.7.0_17
Type "help", "copyright", "credits" or "license" for more information.
>>> import optparse
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 418, in <module>
    _builtin_cvt = { "int" : (_parse_int, _("integer")),
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 567, in gettext
    return dgettext(_current_domain, message)
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 530, in dgettext
    t = translation(domain, _localedirs.get(domain, None),
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 530, in dgettext
    t = translation(domain, _localedirs.get(domain, None),
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 466, in translation
    mofiles = find(domain, localedir, languages, all=1)
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 438, in find
    for nelang in _expand_lang(lang):
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 133, in _expand_lang
    locale = normalize(locale)
  File "/home/gsnedders/local/jython2.7b1/Lib/", line 358, in normalize
    fullname = localename.translate(_ascii_lower_map)
TypeError: translate() only works for 8-bit character strings

Oddly, a quickly minimized version of it works fine:

>>> _ascii_lower_map = ''.join(
...     chr(x + 32 if x >= ord('A') and x <= ord('Z') else x)
...     for x in range(256)
... )
>>> "a".translate(_ascii_lower_map)

localename and _ascii_lower_map are both of type str, so the error message seems bogus.
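
For comparison, the same ASCII-only lowercasing can be sketched without a 256-entry translation table, so it does not depend on every character of the input fitting in 8 bits. This is only an illustration, not the code in Jython's library, and `ascii_lower` is a hypothetical helper name.

```python
def ascii_lower(name):
    """Lowercase only A-Z; every other character is passed through unchanged."""
    return ''.join(
        chr(ord(ch) + 32) if 'A' <= ch <= 'Z' else ch
        for ch in name
    )


# A plain locale name and one that starts with a non-ASCII character.
print(ascii_lower("en_GB.UTF-8"))
print(repr(ascii_lower(u"\ufffden_GB")))
```

Because it works character by character, a stray non-ASCII character is simply copied to the result instead of raising.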

Playing about further I found I had a non-ASCII character in LANGUAGE in my environment (how on earth did *that* get there!?), so the following fails, giving the above exception:

gsnedders@vanveen:~$ export LANGUAGE="$(echo -n -e '\xffen_GB')"
gsnedders@vanveen:~$ ~/local/jython2.7b1/jython -m optparse
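
To check whether an environment is affected, the locale-related variables can be scanned for non-ASCII characters. The sketch below looks at the four variables that gettext consults when no language list is given (LANGUAGE, LC_ALL, LC_MESSAGES, LANG); the helper names `non_ascii_chars` and `check_locale_env` are hypothetical.

```python
import os

# Variables gettext consults when no explicit language list is passed.
LOCALE_VARS = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def non_ascii_chars(value):
    """Return (index, repr(char)) pairs for characters outside ASCII."""
    return [(i, repr(ch)) for i, ch in enumerate(value) if ord(ch) > 127]


def check_locale_env(environ=None):
    """Map each locale variable holding non-ASCII characters to its offenders."""
    environ = os.environ if environ is None else environ
    report = {}
    for name in LOCALE_VARS:
        bad = non_ascii_chars(environ.get(name, ""))
        if bad:
            report[name] = bad
    return report


if __name__ == "__main__":
    for name, bad in sorted(check_locale_env().items()):
        print("%s contains non-ASCII characters: %s" % (name, bad))
```

Run with the LANGUAGE value from the reproduction above, this should report an offending character at index 0.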

An odd setup, but probably not something that should break an import just because of a non-ASCII character. There is probably some underlying bug around translate, however.

msg8701 Author: Jim Baker (zyasoft) Date: 2014-06-19.04:41:25
It's an unlikely scenario, as gsnedders mentions, but we should fix it. I think this is one of the cases where the underlying Java environment is providing us with unicode; see what appears to be a somewhat related bug, #1841.

msg9613 Author: Jim Baker (zyasoft) Date: 2015-03-10.18:48:57
Now fails differently:

UnicodeEncodeError: 'ascii' codec can't encode character u'\ufffd' in position 0: ordinal not in range(128)

This is because we are passing through arguments and environment variables as unicode as necessary. See
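
The new error can be reproduced in isolation. The sketch below assumes the invalid byte is replaced with U+FFFD when the environment is decoded, which is modelled here with the 'replace' error handler; how the JVM actually decodes the variable is not shown in this report.

```python
# Bytes as set by: export LANGUAGE="$(echo -n -e '\xffen_GB')"
raw = b'\xffen_GB'

# Assumption: the undecodable byte becomes U+FFFD, as with errors='replace'.
decoded = raw.decode('utf-8', 'replace')

try:
    # Any implicit conversion back to an ASCII byte string fails the same way.
    decoded.encode('ascii')
except UnicodeEncodeError as exc:
    message = str(exc)
    print(message)
```

The replacement character sits at position 0, which matches the position reported in the error above.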
Date                 User         Action  Args
2015-03-10 18:48:57  zyasoft      set     messages: + msg9613
2014-06-19 04:41:25  zyasoft      set     priority: low; resolution: accepted; messages: + msg8701; nosy: + zyasoft
2013-04-07 18:57:27  fwierzbicki  set     nosy: + fwierzbicki
2013-04-07 15:44:11  gsnedders    create