Issue2654

classification
Title: Imported modules allow access to submodules
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7
Milestone: Jython 2.7.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jeff.allen Nosy List: fwierzbicki, jamesmudd, jeff.allen, stefan.richthofer
Priority: high Keywords:

Created on 2018-02-08.16:55:32 by jamesmudd, last changed 2019-03-16.16:40:35 by jeff.allen.

Messages
msg11701 (view) Author: James Mudd (jamesmudd) Date: 2018-02-08.16:55:30
If you have a python module eg.

test_mod/
├── hello.py      <<<<< Contains only foo = 4
└── __init__.py   <<<<< Contains only _version__ = '1.1.1'

If you import this in cpython you will not have access to hello e.g.
>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__']
>>> test_mod.hello
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'hello'

Which is correct because you didn't import test_mode.hello and the __init__.py doesn't import it.

In Jython (master)

>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__'] <<<<< Looks good at this point
>>> test_mod.hello
<module 'test_mod.hello' from '/scratch/test_mod/hello.py'> <<<<< This is wrong shouldn't be able to see hello
>>> test_mod.hello.foo
4
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__', 'hello'] <<<<< And now its added  to the module

I think this happens because of a change made to fix #2455 which means if you lookup an attribute on a Pyhton module and don't find it try to import it. The bug described in that ticket is when you have a mixed Python/Java package. However i'm not sure it should have been fixed and now the behaviour is wrong for Python. To fix this I think the support for the mixed Python/Java package needs to be removed which I don't see as an issue.
msg11723 (view) Author: Jeff Allen (jeff.allen) Date: 2018-02-27.22:34:41
Oh great: automatic imports. This is better than plain old Python! But seriously ... I confirm the observation and agree this is incorrect behaviour.

PS issue2654> dir test_mod
...
-a----       27/02/2018     21:55            270 Bad.class
-a----       27/02/2018     21:50             69 Bad.java
-a----       27/02/2018     07:56             28 hello.py
-a----       27/02/2018     07:56             28 __init__.py

PS issue2654> jython
Jython 2.7.2a1+ (default:d74f8c2cd56f, Feb 24 2018, 17:18:53)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_151
Type "help", "copyright", "credits" or "license" for more information.
>>> import test_mod
Executed:  test_mod\__init__.py
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> test_mod.Bad.x          ### This magical import we want (I think)
42
>>> dir(test_mod)
['Bad', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> type(test_mod.hello)    ### This magical import is not Python :(
Executed:  test_mod\hello.py
<type 'module'>

It's not as bad as at first I thought. It would still have my vote to be fixed before 2.7.2 goes out.

The business of package/module import is difficult enough in plain Python. We add Java packages to the mix, which are syntactically similar and semantically different, so it is bound to be difficult. The first thing is to be clear what behaviour we want. The Jython Book has something useful to say here:
http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#sys-path
The word "disastrous" should give us pause.

On the other hand, the Java semantics are a lot like Python 3 namespace packages (PEP 420). There the first __init__.py still defines a regular package exclusively at that location.
msg12330 (view) Author: Jeff Allen (jeff.allen) Date: 2019-01-20.23:37:08
The desired feature is that we can have Python and Java packages by the same name, in the same place. The history of this is quite amusing to dig up, although the serious worry is that while it seems reasonable to expect, it might not be feasible without unacceptable side-effects.

The feature first appears in 2000 in https://hg.python.org/jython/file/7d2145df96de/org/python/core/PyModule.java#l17 where PyModule gets a __findattr__ that looks up missing attributes as sub-modules. The way this works morphs soon after: impAttr is invented roughly as it is now and __findattr__ calls it indirectly via impHook() and builtin.__import__(), see: https://hg.python.org/jython/file/e65ab08ab34e/org/python/core/PyModule.java#l72. (How short and simple everything seems.) This is relased in Jython 2.0.

A lot later (2008) we "remove the magic that automatically imports a module": https://hg.python.org/jython/rev/6d306db1e625. __findattr__ is simplified. This is how it looks in Jython 2.5.

__findattr__ is superseded by __findattr_ex__ for reasons we needn't go into, but PyModule doesn't need to override it, so what's left of PyModule.__findattr__ is simply removed.

In 2010, Jim adds a custom __findattr_ex__  to PyModule that calls impAttr to "re-enable mixing Java and Python code in the same directory" in https://hg.python.org/jython/rev/51ec6962b521. This was issue #1653. It is released in Jython 2.5.2.

Then Philip removes __findattr_ex__ while fixing #1811 (https://hg.python.org/jython/rev/4c672dbbcdb2#l4.8), although it is not clear that it is necessary to the fix. We release it this way in 2.5.3 and 2.7.0.

Darjus puts it back in https://hg.python.org/jython/rev/ac01360b4252 to fix #2455, and this is released in Jython 2.7.1. That re-introduces the automagical import of Python modules that this issue is about.

So, now we find ourselves considering removal again, but clearly we've been there and it's more subtle than that. It almost seems we don't quite know what we want. I imagine what we want is:

1. Python 2 semantics apply to Python packages and modules defined by __init__.py files and modules in packages as .py files. (All applies equally to $py.class and .pyc equivalents, and inside archives.)

2. A compiled Java packages is treated like a Python module (which may be a Python package), except a __init__ is not necessary to mark a package, and classes in the Java package define attributes of the Python module and are not valid targets for import. (This applies equally to file systems inside archives.)

3. These rules apply even when the directory in question is both a Python package and a Java package co-located (on the open file system or in an archive).

I believe this is possible in principle, but delicate. There is probably a rule missing too, that excepts Java classes in a package from rule 2, if they are intended to implement Python modules. A Java class implementing a module should not appear to be an attribute until explicitly imported.

I think this means that the return from __import__() ought always to be a PyModule, whereas at the moment it can return a module, javapackage or even a type object. Those modules containing or defined by Java classes may have values of __path__ and other attributes that CPython would not produce, but would still be modules.

This may be too complicated to get in 2.7.2, but I don't see an acceptable compromise position at the moment other than the divergent automagical behaviour.
msg12332 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2019-01-21.19:04:04
Jeff, thanks for digging this up! It seems to me that such a development loops occur regularly in Jython, probably in other sufficiently complex projects too. A central question popping up is: Why did testing not prevent reintroduction of old bugs?
The answer is obvious: Fixing the issue had much higher priority than writing a test. Often enough a proper test is difficult to implement. I also found myself unable to allocate additional time to proper test-writing in the phase before 2.7.1 release. This is really a symptom of lacking resources.

Nevertheless we should reason about these lessons learned and consider ways to prevent this in the future. Maybe this story should somehow go into developer guide..., at least as a link?
I imagine it was some work to dig this up and it shouldn't be lost (i.e. de-facto invisible once this issue gets closed).
msg12334 (view) Author: Jeff Allen (jeff.allen) Date: 2019-01-22.23:19:37
I have begun to write notes in a sharable form, about some of the difficult aspects of Jython, including one about this: https://github.com/jeff5/inside-jython/blob/master/source/import.rst

I think a sufficient answer here, short of the "ideal" revision of modules and import, is to keep (something like) impAttr, but ensure it only finds Java classes.

Writing regression tests for import is difficult as one has to set up temporary folders on sys.path, and change to the interpreter is permanent. The tests we have from CPython test that things that should work, do work. Less often do they test that things that shouldn't happen (like magical imports) do not happen. The same is true of the tests for #2455.
msg12344 (view) Author: Jeff Allen (jeff.allen) Date: 2019-02-08.22:21:05
Oh the horror, the horror!

I have spent a couple of weeks in imp.java with a debugger, and I can safely say there is some weird stuff here. Did we know that every time a module imports a module, we enumerate all the packages in the JVM? I didn't.

I made some notes. https://inside-jython.readthedocs.io/en/latest/import.html#sequence-of-events-in-jython-2-7-1

I would really like to re-write this based on a modern Python 2 import.c but unlike the comparable (and beneficial) re-write of jython.java, the bits specific to Java packages are mysterious and fragile-looking.

I'm seriously contemplating leaving this as it is in 2.7.2.
msg12350 (view) Author: Jeff Allen (jeff.allen) Date: 2019-03-03.16:32:07
I may have this one cornered now. It has taken me a while to understand how this all works and the proper role of impAttr in particular (as opposed to how it gets used), and what other parts of the imp module/class actually do, and don't do.

We have tussled over whether or not to make a call to impAttr within PyModule.__findattr_ex__. When we do, we get the auto-import of Python sub-modules that James complains of, and is contrary to Python semantics. When we don't, we fail to import of Java packages and classes when the file system represents both. One solution seems to be reassuringly simple. Instead of impAttr at that point, we need just that part of impAttr that deals with Java packages and classes.

I still think there is a case for re-working all this fundamentally. But I'm hoping this simpler change will fix things for 2.7.2.
msg12352 (view) Author: Jeff Allen (jeff.allen) Date: 2019-03-07.23:24:52
I claim success at https://hg.python.org/jython/rev/22c19e5a77ad

I think it is debatable what the semantics should actually be for attribute access to Java sub-packages and classes. The test cases fix a particular choice from which regressions should now be detectable when they occur by accident.
msg12360 (view) Author: Jeff Allen (jeff.allen) Date: 2019-03-16.16:40:35
This has been rather long, but at least I learned a lot about import. I worked through the examples again and have tweaked my notes a bit.
https://inside-jython.readthedocs.io/en/latest/import.html#sequence-of-events-in-jython-2-7-2a1

In there I note some other anaomalous behaviour, but I think it has always beem there. Good enough for 2.7.2, I claim.
History
Date User Action Args
2019-03-16 16:40:35jeff.allensetstatus: pending -> closed
messages: + msg12360
2019-03-07 23:24:52jeff.allensetstatus: open -> pending
resolution: fixed
messages: + msg12352
2019-03-03 16:32:07jeff.allensetmessages: + msg12350
2019-02-08 22:21:05jeff.allensetassignee: jeff.allen
messages: + msg12344
2019-01-22 23:19:37jeff.allensetmessages: + msg12334
2019-01-21 19:04:04stefan.richthofersetnosy: + stefan.richthofer
messages: + msg12332
2019-01-20 23:37:08jeff.allensetmessages: + msg12330
2018-02-27 22:34:42jeff.allensetpriority: high
nosy: + jeff.allen, fwierzbicki
messages: + msg11723
2018-02-08 16:55:32jamesmuddcreate