Issue2608
Created on 2017-07-18.07:47:03 by jeff.allen, last changed 2017-09-05.20:51:20 by zyasoft.
| Messages | |||
|---|---|---|---|
| msg11478 (view) | Author: Jeff Allen (jeff.allen) | Date: 2017-07-18.07:47:02 | |
Within https://github.com/jythontools/jython/issues/83 the user encounters an error that I tentatively identify as our failure to handle his host name correctly here: https://hg.python.org/jython/file/tip/src/org/python/modules/posix/PosixModule.java#l1174 . I think we should be FS-encoding the String(s). I have not reproduced this myself yet. (Need to change the host name to include a character >255.) |
|||
| msg11480 (view) | Author: Jeff Allen (jeff.allen) | Date: 2017-07-18.20:57:16 | |
I have reproduced this with the host name: 先知_MICAH. In fact we have two places to fix, at least.
> dist\bin\jython
Jython 2.7.1 (default:0df7adb1b397, Jul 18 2017, 21:24:53)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_60
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, platform
>>> os.uname()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
at org.python.core.PyString.<init>(PyString.java:57)
at org.python.core.PyString.<init>(PyString.java:70)
at org.python.core.PyString.<init>(PyString.java:74)
at org.python.core.Py.newString(Py.java:647)
at org.python.modules.posix.PosixModule.uname(PosixModule.java:1169)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Cannot create PyString with non-byte value
>>> platform.uname()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 1212, in uname
node = _node()
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 990, in _node
return socket.gethostname()
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\platform.py", line 990, in _node
return socket.gethostname()
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
return method_or_function(*args, **kwargs)
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
return method_or_function(*args, **kwargs)
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 382, in handle_exception
return method_or_function(*args, **kwargs)
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py", line 1875, in gethostname
return str(InetAddress.getLocalHost().getHostName())
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> '\xcf\xc8\xd6\xaa_MICAH'.decode('utf-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\encodings\utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: invalid data
>>> '\xcf\xc8\xd6\xaa_MICAH'.decode('cp936')
u'\u5148\u77e5_MICAH'
It is interesting that platform.uname chokes down in _socket.py. It makes me think we should look suspiciously at wherever we str-ingify a Java String, to see if we should be FS-encoding that (or something else). I attempted this throughout our Java source, but not the Python.
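To illustrate what FS-encoding would mean at such a call site, here is a minimal sketch. It is not the code from the Jython source: `fs_encode` is a hypothetical helper invented for this example, and the UTF-8 fallback is an assumption. It contrasts the ASCII conversion that fails in the traceback above with an explicit encoding of the same host name.

```python
import sys


def fs_encode(text, encoding=None):
    """Encode text to bytes in the file-system encoding.

    Hypothetical helper for illustration only; not part of Jython.
    """
    if encoding is None:
        # Assumption: fall back to UTF-8 if the platform reports nothing.
        encoding = sys.getfilesystemencoding() or "utf-8"
    return text.encode(encoding)


host = u"\u5148\u77e5_MICAH"  # the host name used in this report

# Encoding with the ASCII codec is what str() amounts to in Python 2,
# and it is the step that raises UnicodeEncodeError in _socket.py.
try:
    host.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# An explicit encoding yields a byte string instead of an error.
as_cp936 = fs_encode(host, "cp936")  # bytes as seen in the cp936 session
as_utf8 = fs_encode(host, "utf-8")   # bytes if the FS encoding is UTF-8
```

Called with no `encoding` argument, the helper uses whatever `sys.getfilesystemencoding()` reports, so the resulting bytes depend on the platform.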
For interest, the behaviour of CPython is:
------------------------------------------------ 2
> python
Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:53:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, platform
>>> platform.uname()
('Windows', '\xcf\xc8\xd6\xaa_MICAH', '10', '10.0.14393', 'AMD64', 'AMD64 Family 16 Model 5 Stepping 2, AuthenticAMD')
>>> print platform.uname()[1].decode(sys.getfilesystemencoding())
先知_MICAH
So in Python 2, the host name should appear as bytes in file-system encoding.
------------------------------------------------ 3
> python
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.uname()
uname_result(system='Windows', node='先知_MICAH', release='10', version='10.0.14393', machine='AMD64', processor='AMD64 Family 16 Model 5 Stepping 2, AuthenticAMD')
>>> platform.uname().node
'先知_MICAH'
|
|||
| msg11483 (view) | Author: Jeff Allen (jeff.allen) | Date: 2017-07-20.07:49:21 | |
I claim this is fixed at: https://hg.python.org/jython/rev/c3e2799ef812
>>> import os, platform
>>> os.uname()
('Windows', '\xe5\x85\x88\xe7\x9f\xa5_MICAH', '8.1', '10.0.14393', 'AMD64')
>>> platform.uname()
('Java', '\xe5\x85\x88\xe7\x9f\xa5_MICAH', '1.7.0_60', 'Java HotSpot(TM) 64-Bit Server VM, 24.60-b09, Oracle Corporation', 'AMD64', 'AMD64 Family 16 Model 5 Stepping 2, AuthenticAMD')
>>> print platform.uname()[1].decode(sys.getfilesystemencoding())
先知_MICAH
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2019-01-01 17:02:03 | jeff.allen | unlink | issue2726 superseder |
| 2018-12-31 11:58:15 | jeff.allen | link | issue2726 superseder |
| 2017-09-05 20:51:20 | zyasoft | set | status: pending -> closed |
| 2017-07-20 07:49:22 | jeff.allen | set | status: open -> pending messages: + msg11483 |
| 2017-07-18 20:57:17 | jeff.allen | set | messages: + msg11480 |
| 2017-07-18 07:47:03 | jeff.allen | create | |