Issue1658
Created on 2010-09-26.21:23:44 by pekka.klarck, last changed 2015-02-26.06:29:57 by pekka.klarck.
Messages | |||
---|---|---|---|
msg6097 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2010-09-26.21:23:42 | |
Here's a simple example demonstrating the problem: import os for c in 254, 255, 256: f = unichr(c)+'.txt' open(f, 'w').close() print repr(f), 'exists', os.path.exists(f) os.stat(f) When the above code is run on Linux, it reports all created files as existing and exits cleanly. On Windows you get this: D:\>jython os_stat_bug.py u'\xfe.txt' exists True u'\xff.txt' exists True u'\u0100.txt' exists False Traceback (most recent call last): File "os_stat_bug.py", line 6, in <module> os.stat(f) File "C:\jython2.5.1\Lib\os.py", line 478, in stat return stat_result.from_jnastat(_posix.stat(abs_path)) File "C:\jython2.5.1\Lib\os.py", line 103, in error raise OSError(err, strerror(err), asPyString(msg)) OSError: [Errno 2] No such file or directory: 'D:\\\u0100.txt' As the failing os.path.exists in the above code already illustrates, a bug in os.stat is pretty annoying because so many other methods depend on it. I originally noticed this problem when shutil.rmtree didn't work. The error occurs when the stat method calls _posix.stat. This _posix is returned by a factory, and in my case it was WindowsPOSIX. I was able to make our project's acceptance tests pass with this workaround: if sys.platform.startswith('java') and os.sep == '\\': os._posix = os.JavaPOSIX(os.PythonPOSIXHandler()) os._native_posix = False Could someone who knows Jython internals better comment on whether this workaround is valid? Hopefully this can be fixed in 2.5.2. |
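For readers on a current interpreter, the reproduction above can be approximated in Python 3 as follows. This is a sketch, not the original script: `unichr` becomes `chr`, the helper name `check_unicode_filenames` and the temporary directory are assumptions, and `os.stat` is assumed to run inside the loop because the original indentation was lost.

```python
import os
import tempfile


def check_unicode_filenames(code_points=(254, 255, 256)):
    """Create one file per code point and report what the os module sees.

    Returns a list of (filename, exists, stat_ok) tuples.
    """
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        for cp in code_points:
            name = chr(cp) + '.txt'
            path = os.path.join(tmp, name)
            open(path, 'w').close()
            exists = os.path.exists(path)
            try:
                os.stat(path)
                stat_ok = True
            except OSError:
                stat_ok = False
            results.append((name, exists, stat_ok))
    return results


if __name__ == '__main__':
    for name, exists, stat_ok in check_unicode_filenames():
        print(repr(name), 'exists', exists, 'stat ok', stat_ok)
```

On a platform without the bug, every tuple should report the file as existing and stat-able.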
|||
msg6098 (view) | Author: Philip Jenvey (pjenvey) | Date: 2010-09-26.23:04:38 | |
I rewrote the posix module for 2.5.2 and the stat function was overhauled. Can you try this there? It may already be fixed. |
|||
msg6099 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2010-09-26.23:09:51 | |
I forgot to mention earlier that I tested this, and the bug also appears with 2.5.2 beta 2. Philip, do you think the workaround I presented is valid? Even if this gets fixed in 2.5.2, we need to support people who have 2.5.1 installed. |
|||
msg6129 (view) | Author: Philip Jenvey (pjenvey) | Date: 2010-10-03.22:05:47 | |
This is stat's fault; open creates the file with the correct filename. The underlying stat impl is Windows _stat64 via jnr-posix. The fix for this might be to use _wstat64 instead. Though CPython apparently uses a different API call for win32 stat: GetFileAttributesExW. Another solution might be to support sys.getfilesystemencoding and encode the filename first, but the JVM doesn't even seem to support the 'mbcs' encoding?? (correct me if I'm wrong) Jython's os.listdir with a str arg also differs from what CPython returns for this file. I guess this should be expected since Jython lacks a sys.getfilesystemencoding() value on Windows. We return: >>> os.listdir('.') ['\xfe.txt', '\u0100.txt', '\xff.txt'] >>> os.listdir(u'.') [u'\xfe.txt', u'\u0100.txt', u'\xff.txt'] CPython 2.5: >>> os.listdir('.') ['\xfe.txt', 'A.txt', '\xff.txt'] >>> os.listdir(u'.') [u'\xfe.txt', u'\u0100.txt', u'\xff.txt'] You get 'A.txt' from u'\u0100.txt'.encode('mbcs') ('mbcs' being sys.getfilesystemencoding()) |
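Why 254 and 255 survive while 256 does not can be illustrated with a byte-oriented code page. The sketch below assumes the ANSI code page is cp1252 (the thread does not say which one was in use) and uses Python 3's `str.encode`; the helper name `fits_code_page` is hypothetical. It does not reproduce the Windows best-fit mapping that turns U+0100 into 'A'; it only shows that U+0100 has no cp1252 byte, which is consistent with a narrow (non-wide) stat call failing on that name.

```python
def fits_code_page(name, code_page='cp1252'):
    """Return True if every character in name has a byte in code_page."""
    try:
        name.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


for cp in (254, 255, 256):
    name = chr(cp) + '.txt'
    print(repr(name), fits_code_page(name))
```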
|||
msg6150 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2010-10-06.22:49:06 | |
I don't have enough knowledge to comment on what the right way to fix this is. If there's no better solution, using JavaPOSIX instead of WindowsPOSIX seems to work. Apparently the latter provides more functionality, but I think this bug is too severe to be left unfixed. os.listdir returning different bytes on Jython than on CPython is also discussed in issue #1593. |
|||
msg6156 (view) | Author: Philip Jenvey (pjenvey) | Date: 2010-10-07.21:20:38 | |
So CPython has the 'mbcs' encoding as a generic name for the current Windows code page (CP_ACP) -- meaning mbcs could be one of many encodings depending on your locale. It also uses Windows system APIs for the encoding/decoding. I'm not sure why it works this way -- maybe it's so CPython doesn't have to formally map all the various Windows encodings (including some of the odd Windows-specific ones) to real encodings. Or maybe some of those encodings aren't supported on all Windows platforms. The JVM's file.encoding property is derived from the current user's locale. The JVM maps the locale to one of its internal encodings. However, it looks like it may fall back to UTF-8 in some cases. So the JVM's file.encoding property could potentially be our filesystemencoding value on Windows. Would it be 100% reliable though? And maybe we'd want to emulate the mbcs encoding for compatibility's sake? |
|||
msg6252 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2010-11-17.11:11:19 | |
The same problem also appears with Jython 2.5.2rc2. Unfortunately my earlier workaround doesn't work anymore with this release, because `JavaPOSIX` and friends are not exposed in the `os` module anymore. Is there any chance that the underlying bug is fixed before 2.5.2 final, or should I try to find another workaround? |
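A guarded form of the earlier workaround avoids an AttributeError on releases where these names are gone. This is only a sketch: `os.JavaPOSIX`, `os.PythonPOSIXHandler`, `os._posix` and `os._native_posix` are the Jython 2.5.1 internals quoted in the first message, the function name `apply_stat_workaround` is hypothetical, and on releases that no longer expose those names the function does nothing rather than fixing the bug.

```python
import os
import sys


def apply_stat_workaround():
    """Swap in JavaPOSIX on Jython/Windows if the 2.5.1 internals exist.

    Returns True if the swap was made, False otherwise.
    """
    on_jython_windows = sys.platform.startswith('java') and os.sep == '\\'
    has_internals = (hasattr(os, 'JavaPOSIX')
                     and hasattr(os, 'PythonPOSIXHandler'))
    if not (on_jython_windows and has_internals):
        return False
    # Same two assignments as the original workaround.
    os._posix = os.JavaPOSIX(os.PythonPOSIXHandler())
    os._native_posix = False
    return True
```

Checking the return value lets calling code decide whether it still needs another workaround.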
|||
msg6780 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2012-02-13.07:38:52 | |
Based on my experimentation, java.lang.System.getProperty('file.encoding') returns the correct encoding to use. I submitted a separate issue, #1839, about implementing sys.getfilesystemencoding() using it. |
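The idea can be sketched as a small helper that prefers the JVM property when running on Jython and otherwise falls back to `sys.getfilesystemencoding()`. The helper name `guess_filesystem_encoding` and the 'utf-8' last resort are assumptions for illustration, not what issue #1839 implemented.

```python
import sys


def guess_filesystem_encoding():
    """Return a name for the encoding used for file names."""
    try:
        # Only importable on Jython, where Java classes are visible.
        from java.lang import System
    except ImportError:
        encoding = sys.getfilesystemencoding()
    else:
        encoding = System.getProperty('file.encoding')
    # Assumption: fall back to UTF-8 when nothing is reported.
    return encoding or 'utf-8'
```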
|||
msg9358 (view) | Author: Jim Baker (zyasoft) | Date: 2015-01-08.04:50:42 | |
Still a problem on Windows, but not Linux, despite the fixes we have made re Unicode paths in #2239. The problem is likely that the underlying C stat function called by JNR Posix mixes up Unicode and bytes. But we are now on Java 7. Although BasicFileAttributes doesn't give us stuff like inode and device on Unix-like systems (or even the more extended Posix attributes), we don't have them anyway with JNR. Might as well use BasicFileAttributes then when running on Windows. |
|||
msg9408 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2015-01-16.13:35:32 | |
This seems to be worse in 2.7 than in 2.5. I just reproduced this scenario with 2.7b4 preview on Windows 7: 1) Have a directory `xxx` with a subdirectory `日本語`. 2) Run `jython -c "import shutil; shutil.rmtree(u'xxx')"` 3) End result: Traceback (most recent call last): File "<string>", line 1, in <module> File "C:\jython2.7b4-soft\Lib\shutil.py", line 252, in rmtree onerror(os.remove, fullname, sys.exc_info()) File "C:\jython2.7b4-soft\Lib\shutil.py", line 250, in rmtree os.remove(fullname) OSError: [Errno 21] Is a directory: u'xxx\\\u65e5\u672c\u8a9e' With Jython 2.5.3 the above works just fine. It also works with Python 2.7, but interestingly it fails with WindowsError if I give the path as str and not unicode. With Jython versions, str vs. unicode doesn't seem to make any difference. |
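The scenario above can be scripted so that it is repeatable. The sketch below is written for Python 3 and builds the tree in a temporary directory; the helper name `rmtree_with_unicode_subdir` is hypothetical, and the directory names `xxx` and `日本語` are taken from the report.

```python
import os
import shutil
import tempfile


def rmtree_with_unicode_subdir():
    """Create xxx/<Japanese name>, remove xxx with rmtree, report success."""
    base = tempfile.mkdtemp()
    try:
        top = os.path.join(base, 'xxx')
        # u'\u65e5\u672c\u8a9e' is the subdirectory name from the report.
        os.makedirs(os.path.join(top, '\u65e5\u672c\u8a9e'))
        shutil.rmtree(top)
        return not os.path.exists(top)
    finally:
        # Clean up the scratch directory even if rmtree raised.
        shutil.rmtree(base, ignore_errors=True)
```

On an affected interpreter the `shutil.rmtree(top)` call would be expected to raise OSError instead of returning.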
|||
msg9460 (view) | Author: Jim Baker (zyasoft) | Date: 2015-01-28.19:10:35 | |
Blocker for beta 4 |
|||
msg9474 (view) | Author: Jim Baker (zyasoft) | Date: 2015-02-02.20:29:35 | |
Fixed as of https://hg.python.org/jython/rev/e04fa277ce19 |
|||
msg9563 (view) | Author: Pekka Klärck (pekka.klarck) | Date: 2015-02-26.06:29:56 | |
FWIW, our acceptance tests that used to fail on 2.5 now pass with 2.7b4. Great work Jim and everyone else involved! |
History | |||
---|---|---|---|
Date | User | Action | Args |
2015-02-26 06:29:57 | pekka.klarck | set | messages: + msg9563 |
2015-02-09 23:31:17 | zyasoft | set | status: pending -> closed |
2015-02-02 20:29:35 | zyasoft | set | status: open -> pending resolution: accepted -> fixed messages: + msg9474 |
2015-01-28 19:10:35 | zyasoft | set | priority: high -> urgent messages: + msg9460 |
2015-01-16 13:35:32 | pekka.klarck | set | messages: + msg9408 |
2015-01-08 04:50:50 | zyasoft | set | resolution: accepted |
2015-01-08 04:50:42 | zyasoft | set | priority: high assignee: zyasoft messages: + msg9358 nosy: + zyasoft |
2013-03-01 00:22:18 | amak | set | nosy: + amak |
2013-02-26 17:36:22 | fwierzbicki | set | nosy: + fwierzbicki |
2012-02-13 07:38:52 | pekka.klarck | set | messages: + msg6780 |
2010-11-17 11:11:19 | pekka.klarck | set | messages: + msg6252 |
2010-10-07 21:20:39 | pjenvey | set | messages: + msg6156 |
2010-10-06 22:49:07 | pekka.klarck | set | messages: + msg6150 |
2010-10-03 22:05:47 | pjenvey | set | messages: + msg6129 |
2010-09-26 23:09:52 | pekka.klarck | set | messages: + msg6099 |
2010-09-26 23:04:38 | pjenvey | set | nosy: + pjenvey messages: + msg6098 |
2010-09-26 21:23:44 | pekka.klarck | create |