Message12746

Author pekka.klarck
Recipients jeff.allen, pekka.klarck
Date 2019-11-04.11:19:46
Message-id <1572866387.24.0.938679469644.issue2820@roundup.psfhosted.org>
In-reply-to
Content
I did some further investigation and compared how Jython and CPython behave on Linux and Windows.

CPython:

1. When you set `PYTHONPATH=föö` and run Python, `sys.path` contains `föö` as a byte string encoded using the system encoding (UTF-8 on Linux, Windows-1252 on my Windows machine).

2. Programmatically it's possible to add new items to `sys.path` as Unicode strings.

3. System-encoded byte strings also work when added programmatically. This is understandable because the interpreter itself uses that format when `PYTHONPATH` is set externally.

4. `sys.path` entries that are byte strings in some encoding other than the system encoding seem to be ignored. At least they don't cause any problems.
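
To make the byte-level difference concrete, here is a minimal sketch of what the same directory name looks like in the two system encodings mentioned above. It only encodes the name and does not touch `sys.path` or the import system; the variable names are just for illustration.

```python
# The directory name from the examples above, written with escapes so the
# snippet does not depend on the source file encoding.
name = u"f\u00f6\u00f6"  # föö

# What a system-encoded entry would look like in each case.
utf8_bytes = name.encode("utf-8")      # typical system encoding on Linux
cp1252_bytes = name.encode("cp1252")   # Windows-1252

print(repr(utf8_bytes))
print(repr(cp1252_bytes))
```

The two byte strings differ, so a byte string that is valid in one system encoding is generally not the same path in the other.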


Jython 2.7.2b2:

1. When you set `JYTHONPATH=föö` and run Jython, `sys.path` contains `föö` as a Unicode string regardless of the operating system. This is different to CPython, but I don't think it matters because CPython also accepts Unicode entries in `sys.path`.

2. Programmatically it's possible to use Unicode strings. This is the same as with CPython.

3. Byte strings work only if you use UTF-8, regardless of the operating system. This is different to CPython and, in my opinion, it would be better to support system-encoded byte strings the same way CPython does. In practice this only affects Windows because other OSes generally use UTF-8. On Windows it's also possible to use Unicode strings, so at least there's a workaround.

4. Byte strings that aren't UTF-8 cause `UnicodeDecodeError`. When such entries are at the end of `sys.path`, the error occurs with non-existing modules, but when they are at the beginning it occurs *always*. This is, in my opinion, pretty severe. Combined with point 3 above, this means that a system-encoded `sys.path` entry that works with CPython can break *all* imports with Jython.
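
As a rough illustration of the failure and of the Unicode workaround, the sketch below decodes a Windows-1252 encoded entry as UTF-8, which raises `UnicodeDecodeError`, and then converts it to a Unicode string instead. It does not reproduce Jython's import machinery. `decode_path_entry` is a hypothetical helper, not part of Jython or CPython, and falling back to `sys.getfilesystemencoding()` as the "system encoding" is an assumption.

```python
import sys


def decode_path_entry(entry, encoding=None):
    """Return a sys.path entry as Unicode text (hypothetical helper).

    Byte strings are decoded with `encoding`; if none is given, the
    file system encoding is assumed, with UTF-8 as a last resort.
    """
    if isinstance(entry, bytes):
        return entry.decode(encoding or sys.getfilesystemencoding() or "utf-8")
    return entry


# A Windows-1252 encoded entry, as CPython would see it on such a machine.
cp1252_entry = u"f\u00f6\u00f6".encode("cp1252")

# Treating those bytes as UTF-8 fails, because 0xF6 is not valid UTF-8.
try:
    cp1252_entry.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# Workaround: hand over Unicode instead of system-encoded bytes.
unicode_entry = decode_path_entry(cp1252_entry, "cp1252")
```

Applying something like `sys.path = [decode_path_entry(p) for p in sys.path]` before any imports might serve as a stopgap, though I haven't verified that on every platform.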
History
Date User Action Args
2019-11-04 11:19:47 pekka.klarck set messageid: <1572866387.24.0.938679469644.issue2820@roundup.psfhosted.org>
2019-11-04 11:19:47 pekka.klarck set recipients: + pekka.klarck, jeff.allen
2019-11-04 11:19:47 pekka.klarck link issue2820 messages
2019-11-04 11:19:46 pekka.klarck create