Issue1648449

classification
Title: os.path.normcase broken in Windows
Type: Severity: normal
Components: Library Versions:
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: cgroves, hsk0, pekka.klarck, pjenvey
Priority: normal Keywords:

Created on 2007-01-31.05:15:26 by cgroves, last changed 2008-02-24.04:17:32 by pjenvey.

Messages
msg1404 (view) Author: Charlie Groves (cgroves) Date: 2007-01-31.05:15:26
From laupke's bug #1534547

C:\>python
Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os.path.normcase('C:\\Temp')
'c:\\temp'
>>>

C:\>jython
Jython 2.2a2952 on java1.5.0_06 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> import os.path
>>> os.path.normcase('C:\\Temp')
'C:\\Temp'
>>>
msg1405 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-01-31.18:36:27
Charles's comment in the original bug report:

"""I'm not sure how to detect if a filesystem is case sensitive in java, so I opened a new bug(#1648449) for normpath."""

If that's the case, I'd recommend the following approach.

1) First check how CPython does this. If their solution is also possible in Jython, adopt it.

2) I think it would be good enough if normcase worked correctly on all major platforms (Posix, Mac, Windows), where the correct behaviour is known.

3) For other platforms I see the following two possibilities.

3.1) Just leave the path unchanged and document this behaviour.

3.2) When installing Jython or running it for the first time, test case sensitivity by creating a file like 'JYTHONTEST' in the system temp directory and trying to read it with the name 'jythontest'. If reading succeeds, set some property indicating that the platform is case-insensitive to true; if reading fails (or there is any exception anywhere), set it to false.
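
Option 3.2 could look roughly like the following. This is only a sketch in plain CPython 3 rather than Jython, and the function name `filesystem_is_case_insensitive` is made up for illustration. It uses a fresh temporary directory so that the lower-case lookup cannot accidentally match an unrelated file.

```python
import os
import tempfile


def filesystem_is_case_insensitive(directory=None):
    """Create an upper-case file and check whether it is visible in lower case.

    Hypothetical helper: returns False on any error, as proposed in 3.2.
    """
    try:
        # A fresh, empty directory guarantees that 'jythontest' can only
        # exist if the file system folds case.
        probe_dir = tempfile.mkdtemp(dir=directory)
    except OSError:
        return False
    upper = os.path.join(probe_dir, "JYTHONTEST")
    lower = os.path.join(probe_dir, "jythontest")
    try:
        with open(upper, "w"):
            pass
        return os.path.exists(lower)
    except OSError:
        return False
    finally:
        # Clean up the probe file and directory in every case.
        if os.path.exists(upper):
            os.remove(upper)
        os.rmdir(probe_dir)


print(filesystem_is_case_insensitive())
```

Note that the answer only describes the file system holding the temp directory; other mounted file systems may behave differently.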
msg1406 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-02-17.01:17:07
The following, only slightly hackish, implementation seems to fix this issue, at least based on manual tests on Windows. I'll submit it as part of a patch for the os.path.abspath bug http://jython.org/bugs/1661700 unless separate patches are somehow better. Before that I need to get some automated tests done anyway.


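# Cached result of the case-sensitivity probe; None means "not probed yet".
_CASE_INSENSITIVE = None

def _case_insensitive_system():
    # The system is treated as case-insensitive if the lower- and upper-cased
    # spellings of the current directory refer to the same file.
    global _CASE_INSENSITIVE
    if _CASE_INSENSITIVE is None:
        path = abspath(os.curdir)
        _CASE_INSENSITIVE = samefile(path.lower(), path.upper())
    return _CASE_INSENSITIVE

def _tostr(s, method):
    # Accept only str/unicode arguments; anything else raises TypeError.
    if isinstance(s, basestring):
        return s
    import org
    raise TypeError, "%s() argument must be a str or unicode object, not %s" % (
                method, org.python.core.Py.safeRepr(s))

def _tofile(s, method):
    return File(_tostr(s, method))

def normcase(path):
    # Lower-case the path only on case-insensitive systems.
    path = _tofile(path, "normcase").getPath()
    if _case_insensitive_system():
        path = path.lower()
    return path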
msg1407 (view) Author: howard kapustein (hsk0) Date: 2007-02-18.11:40:06
The problem is that javaos.py always unconditionally loads javapath:
Line 31: import javapath as path
This is despite the comment at the top:
Line 6: - os.path is one of the modules posixpath, ntpath, macpath, or dospath
To 'fix' this, i.e. to make os.path act like the native underlying OS-isms rather than 'java-isms', replace the unconditional import statement with a conditional one:

...
import java
from java.io import File
from UserDict import UserDict

# Load the right flavor of os.path
_osname = None
_jythonospath = java.lang.System.getProperty('jython.os.path')
if _jythonospath:
    _jythonospath = _jythonospath.lower()
    if _jythonospath in ['java', 'dos', 'mac', 'nt', 'posix']:
        _osname = _jythonospath
if _osname is None:
    _osname = java.lang.System.getProperty('os.name').lower()
if _osname == 'java':
    # Explicit override: without this branch 'java' would fall into the
    # posix test below on any system whose path separator is ':'
    import javapath as path
elif _osname == 'nt' or _osname.startswith('windows'):
    import ntpath as path
elif _osname.startswith('mac') and not _osname.endswith('x'):
    import macpath as path
elif _osname == 'dos':
    import dospath as path
elif (_osname == 'posix') or ((File.pathSeparator == ':') and ('vms' not in _osname) and (not _osname.startswith('mac') or _osname.endswith('x'))):
    import posixpath as path
else:
    import javapath as path

class stat_result:
...


This auto-detects the OS unless explicitly overridden. The hunt sequence:
IF System.property jython.os.path = 'java', 'dos', 'mac', 'nt' or 'posix'
    osname = that value
ELSE
    osname = java.lang.System.getProperty('os.name').lower()
ENDIF
IF osname == 'java'
    import javapath as path
ELSEIF osname == 'nt' OR osname startswith 'windows'
    import ntpath as path
ELSEIF osname startswith 'mac' AND NOT osname endswith 'x'
    import macpath as path
ELSEIF osname == 'dos'
    import dospath as path
ELSEIF osname == 'posix' OR (pathsep == ':' AND NOT VMS AND (osname != Mac(classic) OR osname endswith 'x'))
    import posixpath as path
ELSE
    import javapath as path
ENDIF

That autodetect for Unix looks funky, but it's right.
If the path separator is not ":", it's not Unix
If it's VMS, it's not Unix
If it's not MacOS OR it ends with 'X', it's Unix

That last part's a head bender at first.
MacOS falls thru.
MacOSX gets trapped as Unix
AIX, HP-UX, Linux and Irix are trapped as Unix
Solaris and SunOS are trapped as Unix

That last one is because Solaris is not Mac; if you've got a ":" path separator, you're probably Unix, MacOS and VMS being the (known) exceptions.

Voila.

Run with -v and you can see what gets loaded is what you'd expect.
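
To make the hunt sequence easier to check against a list of os.name values, here is a rough sketch of the same decision as a pure function in plain Python. The function name `pick_path_flavor` and its parameters are invented for illustration, and it returns module names as strings instead of importing anything. Treating an explicit 'java' override as "use javapath" is an assumption about the intent of the override.

```python
def pick_path_flavor(os_name, path_separator, override=None):
    """Return the name of the os.path flavor module for a platform description.

    os_name        -- value of the Java 'os.name' property
    path_separator -- value of java.io.File.pathSeparator (':' or ';')
    override       -- optional value of the 'jython.os.path' property
    """
    name = None
    if override and override.lower() in ("java", "dos", "mac", "nt", "posix"):
        name = override.lower()
    if name is None:
        name = os_name.lower()

    if name == "java":
        return "javapath"
    if name == "nt" or name.startswith("windows"):
        return "ntpath"
    if name.startswith("mac") and not name.endswith("x"):
        return "macpath"  # classic Mac OS
    if name == "dos":
        return "dospath"
    # Classic Mac OS was handled above, so only VMS has to be excluded here.
    if name == "posix" or (path_separator == ":" and "vms" not in name):
        return "posixpath"
    return "javapath"


for os_name, sep in [("Windows XP", ";"), ("Mac OS", ":"), ("Mac OS X", ":"),
                     ("Linux", ":"), ("SunOS", ":"), ("OpenVMS", ":")]:
    print(os_name, "->", pick_path_flavor(os_name, sep))
```

Keeping the decision in one function like this would also make it straightforward to unit test without touching the real system properties.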


And for those wondering if ntpath or posixpath is the right answer instead of javapath, yes, it is. It has to be. Java doesn't hide all underlying file system details; new File("C:\\foo.bar") is what's required to work on Windows and so forth. Java only generalized the API function calls, not the 'data' like filenames or pathing and so forth, so javapath is interesting but, ultimately, not the right answer consistent with other JVM behavior (at least, not for nt, mac, posix and dos).
msg1408 (view) Author: howard kapustein (hsk0) Date: 2007-02-18.11:44:57
NOTE: My posted fix corrects the broader problem: javapath is always used instead of the appropriate OS-specific os.path implementation. This fixes more than just normcase(); it also covers isabs(), ismount(), split() and so forth, not to mention that ntpath has functions not even defined in javapath (e.g. splitunc).
msg1409 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-02-18.23:04:03
First of all, I have to say I'm such a newbie with Jython internals that what I write here may not be 100% true. Anyway, my understanding is that javapath acts like any other platform-specific os.path implementation -- in Jython's case the platform is not the actual OS but the JVM. I believe the line in javaos that says "os.path is one of the modules posixpath, ntpath, macpath, or dospath" was just copied from the CPython version and never updated.

The problem with Jython is that even though the JVM is the platform, people using Jython expect it to work correctly (i.e. the same way as CPython) on any particular OS. Using OS-specific path modules is of course one way to solve this problem, but unfortunately using the CPython modules without modifications is probably not an option. CPython modules like posixpath and ntpath depend heavily on os.stat which, due to JVM limitations, only sets file size and mtime correctly in Jython. Thus the current approach is probably the simplest, because java.io.File provides most of the functionality needed to implement a CPython-compatible os.path. There are cases where the Java and CPython implementations differ (e.g. os.path.abspath normalizes the returned path but java.io.File.getAbsolutePath doesn't), but in my opinion they can be handled inside javapath.
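
As a small illustration of the abspath difference mentioned above, this is what the CPython side does. It is a sketch using CPython 3's posixpath module directly, with a made-up example path, so that it runs the same on any OS.

```python
import posixpath

# A made-up absolute path containing '..' and '.' segments.
raw = "/tmp/a/../b/./c"

# CPython's abspath also normalizes, so the redundant segments disappear.
normalized = posixpath.abspath(raw)
print(normalized)  # /tmp/b/c
```

java.io.File.getAbsolutePath would, as noted, leave the '..' and '.' segments in place, so javapath has to normalize the result itself.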
msg1410 (view) Author: howard kapustein (hsk0) Date: 2007-02-20.00:28:54
>javapath acts as any other platform specific os.path implementation

Yes, I thought about that. The problem is the JVM is both "a tool" _and_ "a platform". Depending on your particular definition and viewpoint, javapath is or is not the right answer.

I can certainly see the rationale to say "Jython runs on the Java Platform" and thus os.path should always be javapath.

However that creates several problems, as it's inconsistent with the view "Jython runs on [Windows/Unix/...]".

Fundamentally, the JVM itself is schizophrenic.

In this case, I think it's more correct to make os.path the same as CPython does - the 'native' flavor (ntpath, posixpath, etc) with an option to use a 'java-centric' flavor as an alternative.

This seems historically consistent (if you can use such a phrase regarding this), witness file system notation (values for that String parameter to the File constructor). If the JVM were truly its own platform, you'd use one notation regardless of the underlying system (see Cygwin for an example of how to ignore the native notation in favor of your own dogmatically consistent world view; right or wrong is irrelevant, at least it picks a single world view and sticks to it).

More significantly (IMO), using the 'native' *path.py is in line with recent trends in Sun's JVM (1.4 and, esp., 1.5+). Witness the XP and GTK visual themes for Swing, the File.getFreeSpace API (finally) in 1.6 (which one could argue merely continues the java.io not-hiding-the-underlying-platform pattern since Java 1.0) and so forth.

And philosophy aside, here's a more concrete example: os.path.splitunc()
Should that exist in Jython's os.path module?

CPython says yes, it exists, on Windows (and nowhere else).
What would a Jython user expect?

Is a Jython developer building a Windows application using Java?
Or is a Jython developer building a Java application that happens to run on Windows?

The true underlying crux of the matter.

IMO there are far more people interested in writing Windows apps that happen to run in the JVM than vice versa.

And CPython fundamentally takes this approach.
One of the things beloved of Python: you *can* write portable code, but you don't *have* to.

This is one of Python's huge *strengths*, and Java's weakness.
Want to access the registry on Windows? Curses on Unix? ... Go ahead. Python not only allows it, Python will help you.
You're not required to write platform-specific code, but you're also not only not-precluded, you're outright aided and abetted. If you so choose.

If you so choose.

That's one hugely compelling benefit I see in Python vs. Java.

Python gives you the power to choose, what best fits YOUR needs.
Java dictates and limits your choices. You're not allowed to write non-portable code, because that would be...bad. So not only does Java avoid helping you, it actively hinders you.

[Witness the GetFreeDiskSpace scenario...]

It's only in recent years that Java's started to loosen up and (start to) embrace The Python Way <g>



This is a long-winded way of saying yes, it's reasonable for some people to want to code to the os.path=javapath model, but os.path=nativepath is more useful and expected as a default, while not precluding those who want the javapath model (hence the reason I added the System.getProperty() check).

And it's not only more useful and expected, but os.path=nativepath is also the right answer because it's consistent with The Python Way -- portability is nice, but not at the cost of freedom and functionality. If you wanted your functionality constrained and dictated, you can always use Java :->
msg1411 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-02-20.11:29:11
I agree that dividing javapath into separate modules would be The Right Thing, but in the end that's pretty much an implementation detail and doesn't really matter to os.path users, assuming that everything works.

Because dividing javapath would require some extra work, I'm not personally planning to do it (at least at the moment), as I'm more interested in getting actual bugs in javapath fixed. I'm also planning to write comprehensive unit tests so that further optimization is possible afterwards.

This of course doesn't prevent others from doing a bigger refactoring of javapath even now. If you think dividing is something that must be done, the best way to actually get it done is to create a patch. Setting up a Jython development environment is pretty simple using the guide at [1] -- I just set it up last weekend myself.

[1] http://wiki.python.org/jython/JythonDeveloperGuide

msg1412 (view) Author: howard kapustein (hsk0) Date: 2007-02-20.13:57:38
>I agree dividing javapath into separate modules would be The Right Thing
I didn't say that, or at least didn't mean to, though it did cross my mind.

Problem: Today, import ntpath does not work, so even if you wanted access to the various non-javapath goodies, you can't do it. And to be honest, I'm not sure why -- import posixpath and so forth work fine in CPython 2.4.1; need to try it in 2.2 but methinks this is a bug in Jython. Will explore.

>but in the end that's pretty much an implementation detail and doesn't
>really matter for os.path users assuming that everything works.
And that's my point. javapath provides a "Java-centric" view of os.path.
Is that what people expect?

Again, this comes back to the fundamental question:

  Are people using Java to write Windows/AIX/Mac/... applications,
  or are people using Windows/... to write Java applications?

os.path=javapath is right for the latter, and wrong for the former.

I am in the former camp. I write code that routinely has to run on AIX, HP-UX, Solaris, Linux and Windows, so portability is a good thing, but I also prefer Python's model over Java's -- I like having access to platform-specific details, when I so choose. I've dealt with os.name and sys.platform and the rest often enough, when needed, and find the Python portability model to be simple and effective - and equally simple and unobtrusive when I *don't* care about writing portable code.


Assuming import ntpath and import javapath not working today are simply bugs to fix and not a feature, 'twould appear best to make os.path=nativepath:
1. Default behavior is comparable to CPython
2. Default behavior is what people using-Java-to-write-Win/etc-apps expect and prefer
3. People using-Win/etc-to-write-Java-apps can get _their_ desired behavior by 'import javapath'
4. This also happens to work much like IronPython; IP provides its own libraries (.NET Framework), but if you point it to the CPython libraries *you get CPython library behavior*. IronPython actually provides *better* Python compatibility than Jython does using os.path=javapath.

#4 particularly bothers me.
There is no 'spec' for Python; the closest thing we have is CPython's documentation and, to a lesser degree, CPython's implementation (including source code comments). IronPython does it right, IMO - a module with the same name as CPython's should act like CPython's (as much as possible). Would you find it bothersome or troubling if IronPython provided the string or _winreg modules with different behaviors and footprints than CPython's? What if IronPython provided a module called "re" that supported regular expressions, but not the same notation as CPython's? Would that bother you?


Jython should be compatible with CPython as much as possible.
Provide *extensions*, yes, excellent idea, but not gratuitous incompatibilities.

I see no problem - quite the opposite - with Jython providing *additional* modules, so whether it's called javapath or jython.os.path or whatever, it is, IMO, a *good* idea.

But Jython should not provide the same symbols as CPython with different behavior.


>If you think dividing is something that must be done
I don't. I have no qualms with javapath as-is, actually.
Well, it'd be nicer if it was _richer_, actually, but no, I see no need to change or splinter javapath.
Just don't call it os.path.

>Setting up Jython development environment is pretty simple using guide at [1]
Thanks for the tip, I hadn't seen that but yes it doesn't look too painful.

The os.path fix is quite easy actually, since it's just 1 source file.
I'll submit a patch for it as per [1]
msg1413 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-05-11.22:40:53
I have a patch for this but it depends on http://jython.org/patches/1716709 so I'll wait until that one is applied before submitting it.

The patch is implemented so that it does

    path = abspath(os.curdir)
    _CASE_INSENSITIVE = samefile(path.lower(), path.upper())

and normcase returns the path in lowercase if _CASE_INSENSITIVE is true. There doesn't seem to be any direct way to ask the JVM whether paths are case-insensitive, so this approach should be ok.
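
For anyone who wants to try the same idea outside Jython, here is a rough CPython 3 equivalent. It is a sketch only: the function name `case_insensitive_system` is invented here, and unlike the patch it returns None when the probe path contains no letters, because in that case the lower- and upper-case spellings are identical and the comparison says nothing.

```python
import os


def case_insensitive_system(path=None):
    """Guess case-insensitivity by comparing the lower- and upper-cased
    spellings of an existing directory (the current directory by default).

    Returns True or False, or None when the path has no letters to vary.
    """
    path = os.path.abspath(path or os.curdir)
    if path.lower() == path.upper():
        # e.g. "/" on Unix: both spellings are the same string.
        return None
    try:
        return os.path.samefile(path.lower(), path.upper())
    except OSError:
        # At least one spelling does not exist, so case matters here.
        return False


print(case_insensitive_system())
```

On a case-sensitive file system where both spellings happen to exist as different directories, samefile simply reports False, which is still the right answer.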
msg1414 (view) Author: Pekka Klärck (pekka.klarck) Date: 2007-05-14.22:40:03
Patch is up at http://jython.org/patches/1718975
msg3043 (view) Author: Philip Jenvey (pjenvey) Date: 2008-02-24.04:17:31
fixed in r4171: we use the platform dependent path module (ntpath on Windows)
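
For comparison, CPython ships both path flavors on every platform, so the difference this report is about can be reproduced without a Windows machine. A minimal sketch, assuming a CPython 3 interpreter rather than Jython:

```python
import ntpath
import posixpath

path = "C:\\Temp"

# The Windows flavor folds case (and would also turn '/' into '\\').
windows_result = ntpath.normcase(path)
# The POSIX flavor returns the path unchanged.
posix_result = posixpath.normcase(path)

print(windows_result)  # c:\temp
print(posix_result)    # C:\Temp
```

The second result matches what Jython printed in the original report.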
History
Date User Action Args
2008-02-24 04:17:32 pjenvey set status: open -> closed
nosy: + pjenvey
resolution: fixed
messages: + msg3043
components: + Library, - None
2007-01-31 05:15:26 cgroves create