Message11480

Author jeff.allen
Recipients jeff.allen
Date 2017-07-18.20:57:16
Message-id <1500411437.26.0.565395982237.issue2608@psf.upfronthosting.co.za>
In-reply-to
Content
I have reproduced this with the host name: 先知_MICAH. In fact we have two places to fix, at least.

> dist\bin\jython
Jython 2.7.1 (default:0df7adb1b397, Jul 18 2017, 21:24:53)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_60
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, platform
>>> os.uname()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
        at org.python.core.PyString.<init>(PyString.java:57)
        at org.python.core.PyString.<init>(PyString.java:70)
        at org.python.core.PyString.<init>(PyString.java:74)
        at org.python.core.Py.newString(Py.java:647)
        at org.python.modules.posix.PosixModule.uname(PosixModule.java:1169)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Cannot create PyString with non-byte value
>>> platform.uname()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 1212, in uname
    node = _node()
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 990, in _node
    return socket.gethostname()
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 990, in _node
    return socket.gethostname()
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
    return method_or_function(*args, **kwargs)
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
    return method_or_function(*args, **kwargs)
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
    return method_or_function(*args, **kwargs)
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 1875, in gethostname
    return str(InetAddress.getLocalHost().getHostName())
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> '\xcf\xc8\xd6\xaa_MICAH'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: invalid data
>>> '\xcf\xc8\xd6\xaa_MICAH'.decode('cp936')
u'\u5148\u77e5_MICAH'

It is interesting that platform.uname chokes down in _socket.py. It makes me think we should look suspiciously at wherever we str-ingify a Java String, to see if we should be FS-encoding that (or something else). I attempted this throughout our Java source, but not the Python.
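As a rough sketch of what "FS-encoding instead of str-ingifying" could look like on the Python side: the helper name to_fs_bytes is hypothetical (it is not an existing Jython function), and cp936 is passed explicitly only so the example matches the host name used in this report. In real code the encoding would presumably come from sys.getfilesystemencoding().

```python
import sys

def to_fs_bytes(text, encoding=None):
    """Hypothetical helper: return bytes in the file-system encoding.

    Byte strings pass through untouched; text is encoded explicitly
    rather than relying on the implicit ASCII conversion done by str().
    """
    if isinstance(text, bytes):
        return text
    if encoding is None:
        # Assumption: fall back to UTF-8 if the platform reports nothing.
        encoding = sys.getfilesystemencoding() or "utf-8"
    return text.encode(encoding)

# The host name from this report, written with escapes.
host = u"\u5148\u77e5_MICAH"

# An ASCII encode (what str() attempts in Python 2) cannot represent it.
try:
    host.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with the file-system encoding of the affected machine works.
encoded = to_fs_bytes(host, "cp936")
```

Here encoded should hold the same bytes that decode successfully with cp936 in the session above, while the ASCII attempt fails in the same way as the str() call in _socket.py.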


For interest, the behaviour of CPython is:

------------------------------------------------ 2
> python
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, platform
>>> platform.uname()
('Windows', '\xcf\xc8\xd6\xaa_MICAH', '10', '10.0.14393', 'AMD64', 'AMD64 Family 16 Model 5 Stepping 2, AuthenticAMD')
>>> print platform.uname()[1].decode(sys.getfilesystemencoding())
先知_MICAH

So in Python 2, the host name should appear as bytes in file-system encoding.
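The decoding step in that session can be wrapped up as below. This is only an illustration: node_to_text is a hypothetical name, and the cp936 argument stands in for whatever sys.getfilesystemencoding() reports on the machine in question.

```python
import sys

def node_to_text(node, encoding=None):
    """Hypothetical helper: return the host name as text.

    Bytes (the Python 2 convention) are decoded with the file-system
    encoding; a value that is already text is returned unchanged.
    """
    if not isinstance(node, bytes):
        return node
    if encoding is None:
        # Assumption: fall back to UTF-8 if the platform reports nothing.
        encoding = sys.getfilesystemencoding() or "utf-8"
    return node.decode(encoding)

# The node value CPython 2 reports on the affected machine.
raw = b"\xcf\xc8\xd6\xaa_MICAH"
text = node_to_text(raw, "cp936")
```

With the encoding given explicitly, text is the two-character name followed by _MICAH, matching the cp936 decode shown in the Jython session.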

------------------------------------------------ 3
> python
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.uname()
uname_result(system='Windows', node='先知_MICAH', release='10', version='10.0.14393', machine='AMD64', processor='AMD64 Family 16 Model 5 Stepping 2, AuthenticAMD')
>>> platform.uname().node
'先知_MICAH'
History
Date                 User        Action  Args
2017-07-18 20:57:17  jeff.allen  set     messageid: <1500411437.26.0.565395982237.issue2608@psf.upfronthosting.co.za>
2017-07-18 20:57:17  jeff.allen  set     recipients: + jeff.allen
2017-07-18 20:57:17  jeff.allen  link    issue2608 messages
2017-07-18 20:57:16  jeff.allen  create