Message9278

Author zyasoft
Recipients Arfrever, jeff.allen, zyasoft
Date 2014-12-30.18:31:04
Message-id <1419964264.61.0.289657775165.issue2239@psf.upfronthosting.co.za>
Content
@Jeff,

It's pretty straightforward - all paths are Unicode in Java. Apparently so are environment variables and their values. So if we look at PosixModule, it's trying to replicate what's specified (more or less) in https://docs.python.org/2/library/os.html#os.listdir by intercepting the returned String and making a PyString/PyUnicode as appropriate.

I believe the right choice for us is this: if the path is a PyString, return a PyString only when the result is ASCII, and a PyUnicode otherwise, because we don't actually support encoded strings anyway.
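To make that rule concrete, here is a rough model of it in plain Python 3. This is only a sketch, not the PosixModule code: `bytes` stands in for PyString, `str` stands in for PyUnicode, and the function name `listdir_result` and its `path_was_byte_string` flag are made up for illustration.

```python
def listdir_result(name, path_was_byte_string):
    """Pick the result type for one directory entry.

    name: the entry as text (Java always hands back Unicode).
    path_was_byte_string: True if the caller passed a byte-string path.
    Both names are hypothetical; bytes models PyString, str models PyUnicode.
    """
    if not path_was_byte_string:
        return name  # unicode path in, unicode entries out
    try:
        return name.encode("ascii")  # pure ASCII: safe to hand back as a byte string
    except UnicodeEncodeError:
        return name  # non-ASCII: fall back to unicode rather than guess an encoding


print(ascii(listdir_result(u"README.txt", True)))
print(ascii(listdir_result(u"\u9996\u9875", True)))
```

The point of the fallback is that no encoding is ever chosen on the caller's behalf: a byte string comes back only when every codec that matters would agree on its bytes.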

Likewise, we have a similar problem in os.environ, as supported by PosixModule.getEnviron:

    private static PyObject getEnviron() {
        PyObject environ = new PyDictionary();
        Map<String, String> env;
        try {
            env = System.getenv();
        } catch (SecurityException se) {
            return environ;
        }
        for (Map.Entry<String, String> entry : env.entrySet()) {
            environ.__setitem__(Py.newString(entry.getKey()), Py.newString(entry.getValue()));
        }
        return environ;
    }

https://github.com/jythontools/jython/blob/master/src/org/python/modules/posix/PosixModule.java#L896

Note that Python 3 separates out os.environ and os.environb:
https://docs.python.org/3/library/os.html#os.environ

So when I run on Python 3, os.environ has this entry:
'PWD': '/Users/jbaker/test/unicode/首页'

whereas on Python 2.7:
'PWD': '/Users/jbaker/test/unicode/\xe9\xa6\x96\xe9\xa1\xb5'

Compare Java: 
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html#getenv()

And using Jython 2.5 in this same directory:
>>> System.getenv().get("PWD")
u'/Users/jbaker/test/unicode/\u9996\u9875'
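For what it's worth, the three values above are the same path seen at different levels. A quick check in Python 3, assuming the shell environment is UTF-8 encoded (the variable names are just for illustration):

```python
# The byte string Python 2.7 reports for PWD.
py27_value = b"/Users/jbaker/test/unicode/\xe9\xa6\x96\xe9\xa1\xb5"

# The Unicode value that Java (and so Jython) reports.
java_value = u"/Users/jbaker/test/unicode/\u9996\u9875"

# Decoding the 2.7 bytes as UTF-8 gives exactly the Java string,
# and encoding the Java string gives back the 2.7 bytes.
decoded = py27_value.decode("utf-8")
print(decoded == java_value)
print(java_value.encode("utf-8") == py27_value)
```

So Python 2.7 shows the raw UTF-8 bytes, while Python 3 and Java show the decoded text.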

And it's all related to that curious entity, surrogateescape.
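For anyone who hasn't run into it, here is a small Python 3 sketch of what the surrogateescape error handler does. The byte string is a made-up example of a name that is not valid UTF-8:

```python
# Hypothetical path containing bytes that are not valid UTF-8.
raw = b"/tmp/\xff\xfe"

# With surrogateescape, each undecodable byte 0xNN is mapped to the
# lone surrogate U+DCNN instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))

# Encoding with the same handler restores the original bytes exactly.
round_trip = text.encode("utf-8", "surrogateescape")
print(round_trip == raw)
```

This is how Python 3 on Unix can keep os.environ as text and still round-trip arbitrary bytes back to the operating system.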