Message9283

Author Arfrever
Recipients Arfrever, jeff.allen, zyasoft
Date 2014-12-31.20:33:00
Message-id <1420057981.08.0.712968993109.issue2239@psf.upfronthosting.co.za>
In-reply-to
Content
How will Jython handle filenames containing byte sequences that cannot be decoded to unicode?
(Only the NULL byte and "/" are disallowed in filenames on filesystems native to GNU/Linux.)
CPython 2.7 keeps such names as bytes even when a unicode string is passed to os.listdir().
CPython >=3.1 decodes the bytes to strings using the surrogateescape error handler.

$ mkdir /tmp/some_dir
$ touch /tmp/some_dir/$'\x80'
$ python2.7 -c 'import os; print(os.listdir("/tmp/some_dir"))'
['\x80']
$ python2.7 -c 'import os; print(os.listdir(u"/tmp/some_dir"))'
['\x80']
$ python3.5 -c 'import os; print(os.listdir(b"/tmp/some_dir"))'
[b'\x80']
$ python3.5 -c 'import os; print(os.listdir("/tmp/some_dir"))'
['\udc80']
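
For reference, the round trip that the CPython 3 output above relies on can be sketched as follows. This is only an illustration of the surrogateescape error handler under an assumed UTF-8 filesystem encoding; the helper names decode_name and encode_name are hypothetical and are not part of CPython or Jython.

```python
# Sketch of the surrogateescape round trip (Python 3).
# Assumption: the filesystem encoding is UTF-8; the helper names are
# hypothetical and exist only for this illustration.

def decode_name(raw):
    """Decode a filename from bytes, mapping undecodable bytes to lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def encode_name(name):
    """Encode a filename back to the original bytes."""
    return name.encode("utf-8", "surrogateescape")


raw = b"\x80"             # the undecodable filename from the session above
name = decode_name(raw)   # byte 0x80 becomes the lone surrogate U+DC80
restored = encode_name(name)

# The original bytes are recovered exactly, so the file can still be opened.
assert restored == raw

# Encoding without the error handler fails, because lone surrogates
# are not valid in strict UTF-8.
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The point of the sketch is that the decoded string is lossless with respect to the original bytes, which is what allows a str returned by os.listdir() to be passed back to other os functions.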