Issue2654

classification
Title: Imported modules allow access to submodules
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7
Milestone: Jython 2.7.2
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: jeff.allen Nosy List: fwierzbicki, jamesmudd, jeff.allen, stefan.richthofer
Priority: high Keywords:

Created on 2018-02-08.16:55:32 by jamesmudd, last changed 2019-02-08.22:21:05 by jeff.allen.

Messages
msg11701 (view) Author: James Mudd (jamesmudd) Date: 2018-02-08.16:55:30
If you have a python module eg.

test_mod/
├── hello.py      <<<<< Contains only foo = 4
└── __init__.py   <<<<< Contains only _version__ = '1.1.1'

If you import this in cpython you will not have access to hello e.g.
>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__']
>>> test_mod.hello
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'hello'

Which is correct because you didn't import test_mode.hello and the __init__.py doesn't import it.

In Jython (master)

>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__'] <<<<< Looks good at this point
>>> test_mod.hello
<module 'test_mod.hello' from '/scratch/test_mod/hello.py'> <<<<< This is wrong shouldn't be able to see hello
>>> test_mod.hello.foo
4
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__', 'hello'] <<<<< And now its added  to the module

I think this happens because of a change made to fix #2455 which means if you lookup an attribute on a Pyhton module and don't find it try to import it. The bug described in that ticket is when you have a mixed Python/Java package. However i'm not sure it should have been fixed and now the behaviour is wrong for Python. To fix this I think the support for the mixed Python/Java package needs to be removed which I don't see as an issue.
msg11723 (view) Author: Jeff Allen (jeff.allen) Date: 2018-02-27.22:34:41
Oh great: automatic imports. This is better than plain old Python! But seriously ... I confirm the observation and agree this is incorrect behaviour.

PS issue2654> dir test_mod
...
-a----       27/02/2018     21:55            270 Bad.class
-a----       27/02/2018     21:50             69 Bad.java
-a----       27/02/2018     07:56             28 hello.py
-a----       27/02/2018     07:56             28 __init__.py

PS issue2654> jython
Jython 2.7.2a1+ (default:d74f8c2cd56f, Feb 24 2018, 17:18:53)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_151
Type "help", "copyright", "credits" or "license" for more information.
>>> import test_mod
Executed:  test_mod\__init__.py
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> test_mod.Bad.x          ### This magical import we want (I think)
42
>>> dir(test_mod)
['Bad', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> type(test_mod.hello)    ### This magical import is not Python :(
Executed:  test_mod\hello.py
<type 'module'>

It's not as bad as at first I thought. It would still have my vote to be fixed before 2.7.2 goes out.

The business of package/module import is difficult enough in plain Python. We add Java packages to the mix, which are syntactically similar and semantically different, so it is bound to be difficult. The first thing is to be clear what behaviour we want. The Jython Book has something useful to say here:
http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#sys-path
The word "disastrous" should give us pause.

On the other hand, the Java semantics are a lot like Python 3 namespace packages (PEP 420). There the first __init__.py still defines a regular package exclusively at that location.
msg12330 (view) Author: Jeff Allen (jeff.allen) Date: 2019-01-20.23:37:08
The desired feature is that we can have Python and Java packages by the same name, in the same place. The history of this is quite amusing to dig up, although the serious worry is that while it seems reasonable to expect, it might not be feasible without unacceptable side-effects.

The feature first appears in 2000 in https://hg.python.org/jython/file/7d2145df96de/org/python/core/PyModule.java#l17 where PyModule gets a __findattr__ that looks up missing attributes as sub-modules. The way this works morphs soon after: impAttr is invented roughly as it is now and __findattr__ calls it indirectly via impHook() and builtin.__import__(), see: https://hg.python.org/jython/file/e65ab08ab34e/org/python/core/PyModule.java#l72. (How short and simple everything seems.) This is relased in Jython 2.0.

A lot later (2008) we "remove the magic that automatically imports a module": https://hg.python.org/jython/rev/6d306db1e625. __findattr__ is simplified. This is how it looks in Jython 2.5.

__findattr__ is superseded by __findattr_ex__ for reasons we needn't go into, but PyModule doesn't need to override it, so what's left of PyModule.__findattr__ is simply removed.

In 2010, Jim adds a custom __findattr_ex__  to PyModule that calls impAttr to "re-enable mixing Java and Python code in the same directory" in https://hg.python.org/jython/rev/51ec6962b521. This was issue #1653. It is released in Jython 2.5.2.

Then Philip removes __findattr_ex__ while fixing #1811 (https://hg.python.org/jython/rev/4c672dbbcdb2#l4.8), although it is not clear that it is necessary to the fix. We release it this way in 2.5.3 and 2.7.0.

Darjus puts it back in https://hg.python.org/jython/rev/ac01360b4252 to fix #2455, and this is released in Jython 2.7.1. That re-introduces the automagical import of Python modules that this issue is about.

So, now we find ourselves considering removal again, but clearly we've been there and it's more subtle than that. It almost seems we don't quite know what we want. I imagine what we want is:

1. Python 2 semantics apply to Python packages and modules defined by __init__.py files and modules in packages as .py files. (All applies equally to $py.class and .pyc equivalents, and inside archives.)

2. A compiled Java packages is treated like a Python module (which may be a Python package), except a __init__ is not necessary to mark a package, and classes in the Java package define attributes of the Python module and are not valid targets for import. (This applies equally to file systems inside archives.)

3. These rules apply even when the directory in question is both a Python package and a Java package co-located (on the open file system or in an archive).

I believe this is possible in principle, but delicate. There is probably a rule missing too, that excepts Java classes in a package from rule 2, if they are intended to implement Python modules. A Java class implementing a module should not appear to be an attribute until explicitly imported.

I think this means that the return from __import__() ought always to be a PyModule, whereas at the moment it can return a module, javapackage or even a type object. Those modules containing or defined by Java classes may have values of __path__ and other attributes that CPython would not produce, but would still be modules.

This may be too complicated to get in 2.7.2, but I don't see an acceptable compromise position at the moment other than the divergent automagical behaviour.
msg12332 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2019-01-21.19:04:04
Jeff, thanks for digging this up! It seems to me that such a development loops occur regularly in Jython, probably in other sufficiently complex projects too. A central question popping up is: Why did testing not prevent reintroduction of old bugs?
The answer is obvious: Fixing the issue had much higher priority than writing a test. Often enough a proper test is difficult to implement. I also found myself unable to allocate additional time to proper test-writing in the phase before 2.7.1 release. This is really a symptom of lacking resources.

Nevertheless we should reason about these lessons learned and consider ways to prevent this in the future. Maybe this story should somehow go into developer guide..., at least as a link?
I imagine it was some work to dig this up and it shouldn't be lost (i.e. de-facto invisible once this issue gets closed).
msg12334 (view) Author: Jeff Allen (jeff.allen) Date: 2019-01-22.23:19:37
I have begun to write notes in a sharable form, about some of the difficult aspects of Jython, including one about this: https://github.com/jeff5/inside-jython/blob/master/source/import.rst

I think a sufficient answer here, short of the "ideal" revision of modules and import, is to keep (something like) impAttr, but ensure it only finds Java classes.

Writing regression tests for import is difficult as one has to set up temporary folders on sys.path, and change to the interpreter is permanent. The tests we have from CPython test that things that should work, do work. Less often do they test that things that shouldn't happen (like magical imports) do not happen. The same is true of the tests for #2455.
msg12344 (view) Author: Jeff Allen (jeff.allen) Date: 2019-02-08.22:21:05
Oh the horror, the horror!

I have spent a couple of weeks in imp.java with a debugger, and I can safely say there is some weird stuff here. Did we know that every time a module imports a module, we enumerate all the packages in the JVM? I didn't.

I made some notes. https://inside-jython.readthedocs.io/en/latest/import.html#sequence-of-events-in-jython-2-7-1

I would really like to re-write this based on a modern Python 2 import.c but unlike the comparable (and beneficial) re-write of jython.java, the bits specific to Java packages are mysterious and fragile-looking.

I'm seriously contemplating leaving this as it is in 2.7.2.
History
Date User Action Args
2019-02-08 22:21:05jeff.allensetassignee: jeff.allen
messages: + msg12344
2019-01-22 23:19:37jeff.allensetmessages: + msg12334
2019-01-21 19:04:04stefan.richthofersetnosy: + stefan.richthofer
messages: + msg12332
2019-01-20 23:37:08jeff.allensetmessages: + msg12330
2018-02-27 22:34:42jeff.allensetpriority: high
nosy: + jeff.allen, fwierzbicki
messages: + msg11723
2018-02-08 16:55:32jamesmuddcreate