Issue2295

classification
Title: Importation of modules with str names with non-ASCII characters fails
Type:
Severity: normal
Components: Core
Versions: Jython 2.7
Milestone:
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, jeff.allen, zyasoft
Priority: normal
Keywords:

Created on 2015-03-19.12:48:40 by Arfrever, last changed 2018-03-16.08:35:00 by jeff.allen.

Messages
msg9677 Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) Date: 2015-03-19.12:48:40
Importing a module whose name contains non-ASCII characters works in CPython 2.7 when the module name is given as a str object (e.g. "ćśź"), but not as a unicode object (e.g. u"ćśź"). In Jython 2.7 the str form fails as well:

$ touch /tmp/ćśź.py
$ cat test.py
# coding: utf-8
import sys
sys.path.append("/tmp")
module = __import__("ćśź")
print(module)
$ python2.7 test.py
<module 'ćśź' from '/tmp/ćśź.py'>
$ jython2.7 test.py
Traceback (most recent call last):
  File "test.py", line 4, in <module>
    module = __import__("ćśź")
ImportError: No module named ćśź
msg11808 Author: Jeff Allen (jeff.allen) Date: 2018-03-16.08:34:59
I poked around this in 2.7.2a1 and it still fails. However, I think a fix is within reach.

>>> import os
>>> print os.listdir('iss2295')
['tell$py.class', 'tell.py', '\xc4\x87\xc5\x9b\xc5\xba.py', '\xe5\x9b\xb0\xe9\x9a\xbe.py']
>>> for f in os.listdir('iss2295'):
...     name, ext = f.split('.')
...     if ext=='py': print name.decode('utf-8')
...
tell
困难

We need to be clear about the encoding of the name that __import__ receives as an argument. Here it is a bytes object, so which encoding applies: that of the source file, or the file-system encoding? A unicode argument should be acceptable (and is).
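One way to frame the question in code is a small helper that turns a byte-string name into text before the import is attempted. The sketch below is only an illustration of the ambiguity, not a proposed fix: decode_module_name is a hypothetical name (it exists in neither Jython nor CPython), and trying the source encoding first and the file-system encoding second is an assumption, since that order is exactly what is undecided here.

```python
import sys


def decode_module_name(name, source_encoding="utf-8"):
    """Hypothetical helper: return a text (unicode) module name.

    A byte string is decoded with the assumed source encoding first and
    the file-system encoding second; text is returned unchanged.
    """
    if not isinstance(name, bytes):
        return name  # already text, so there is nothing to decide
    # Candidate encodings, in an assumed (not established) order.
    candidates = [source_encoding, sys.getfilesystemencoding() or "ascii"]
    for encoding in candidates:
        try:
            return name.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("cannot decode module name %r" % (name,))


# Bytes as listed by os.listdir above, without the .py extension.
raw = b"\xc4\x87\xc5\x9b\xc5\xba"
text = decode_module_name(raw)
```

With a UTF-8 source file the first candidate already succeeds, so the file-system encoding is never consulted; the two would only disagree on a system where they differ.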

>>> for f in os.listdir('iss2295'):
...     name, ext = f.split('.')
...     if ext=='py': print __import__(name.decode('utf-8'))
...
<module 'tell' from 'iss2295\tell.py'>

... but then it fails. The unicode argument provides the true file name and the module is loaded. However, the print statement then fails with:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
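
This kind of error can be reproduced without any import, since it is what strict ASCII encoding of a non-ASCII unicode string raises. A minimal sketch follows; ascii_safe is a hypothetical helper for producing something printable on an ASCII-only console. Whether the failure above comes from the module name, its repr, or the console encoding is not established here.

```python
def ascii_safe(text):
    """Hypothetical helper: escape non-ASCII characters so the result is pure ASCII."""
    return text.encode("ascii", "backslashreplace").decode("ascii")


name = u"\u0107\u015b\u017a"  # the module name from this issue

try:
    name.encode("ascii")  # strict encoding, as an ASCII-only output stream would do
    failed = False
except UnicodeEncodeError:
    failed = True

# Escaped form that can be written to any ASCII stream.
escaped = ascii_safe(name)
```

Escaping only sidesteps the symptom when displaying the name; it does not address how the module's repr should be encoded on output.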
History
Date                 User        Action  Args
2018-03-16 08:35:00  jeff.allen  set     priority: normal; nosy: + jeff.allen; messages: + msg11808
2015-03-19 12:50:39  Arfrever    set     title: Importation of modules with non-ASCII characters fails -> Importation of modules with str names with non-ASCII characters fails
2015-03-19 12:48:40  Arfrever    create