Issue2654

classification
Title: Imported modules allow access to submodules
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7
Milestone: Jython 2.7.2
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: fwierzbicki, jamesmudd, jeff.allen
Priority: high Keywords:

Created on 2018-02-08.16:55:32 by jamesmudd, last changed 2019-01-20.23:37:08 by jeff.allen.

Messages
msg11701 (view) Author: James Mudd (jamesmudd) Date: 2018-02-08.16:55:30
If you have a python module eg.

test_mod/
├── hello.py      <<<<< Contains only foo = 4
└── __init__.py   <<<<< Contains only _version__ = '1.1.1'

If you import this in cpython you will not have access to hello e.g.
>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__']
>>> test_mod.hello
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'hello'

Which is correct because you didn't import test_mode.hello and the __init__.py doesn't import it.

In Jython (master)

>>> import test_mod
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__'] <<<<< Looks good at this point
>>> test_mod.hello
<module 'test_mod.hello' from '/scratch/test_mod/hello.py'> <<<<< This is wrong shouldn't be able to see hello
>>> test_mod.hello.foo
4
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__', '__version__', 'hello'] <<<<< And now its added  to the module

I think this happens because of a change made to fix #2455 which means if you lookup an attribute on a Pyhton module and don't find it try to import it. The bug described in that ticket is when you have a mixed Python/Java package. However i'm not sure it should have been fixed and now the behaviour is wrong for Python. To fix this I think the support for the mixed Python/Java package needs to be removed which I don't see as an issue.
msg11723 (view) Author: Jeff Allen (jeff.allen) Date: 2018-02-27.22:34:41
Oh great: automatic imports. This is better than plain old Python! But seriously ... I confirm the observation and agree this is incorrect behaviour.

PS issue2654> dir test_mod
...
-a----       27/02/2018     21:55            270 Bad.class
-a----       27/02/2018     21:50             69 Bad.java
-a----       27/02/2018     07:56             28 hello.py
-a----       27/02/2018     07:56             28 __init__.py

PS issue2654> jython
Jython 2.7.2a1+ (default:d74f8c2cd56f, Feb 24 2018, 17:18:53)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_151
Type "help", "copyright", "credits" or "license" for more information.
>>> import test_mod
Executed:  test_mod\__init__.py
>>> dir(test_mod)
['__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> test_mod.Bad.x          ### This magical import we want (I think)
42
>>> dir(test_mod)
['Bad', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '__path__']
>>> type(test_mod.hello)    ### This magical import is not Python :(
Executed:  test_mod\hello.py
<type 'module'>

It's not as bad as at first I thought. It would still have my vote to be fixed before 2.7.2 goes out.

The business of package/module import is difficult enough in plain Python. We add Java packages to the mix, which are syntactically similar and semantically different, so it is bound to be difficult. The first thing is to be clear what behaviour we want. The Jython Book has something useful to say here:
http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html#sys-path
The word "disastrous" should give us pause.

On the other hand, the Java semantics are a lot like Python 3 namespace packages (PEP 420). There the first __init__.py still defines a regular package exclusively at that location.
msg12330 (view) Author: Jeff Allen (jeff.allen) Date: 2019-01-20.23:37:08
The desired feature is that we can have Python and Java packages by the same name, in the same place. The history of this is quite amusing to dig up, although the serious worry is that while it seems reasonable to expect, it might not be feasible without unacceptable side-effects.

The feature first appears in 2000 in https://hg.python.org/jython/file/7d2145df96de/org/python/core/PyModule.java#l17 where PyModule gets a __findattr__ that looks up missing attributes as sub-modules. The way this works morphs soon after: impAttr is invented roughly as it is now and __findattr__ calls it indirectly via impHook() and builtin.__import__(), see: https://hg.python.org/jython/file/e65ab08ab34e/org/python/core/PyModule.java#l72. (How short and simple everything seems.) This is relased in Jython 2.0.

A lot later (2008) we "remove the magic that automatically imports a module": https://hg.python.org/jython/rev/6d306db1e625. __findattr__ is simplified. This is how it looks in Jython 2.5.

__findattr__ is superseded by __findattr_ex__ for reasons we needn't go into, but PyModule doesn't need to override it, so what's left of PyModule.__findattr__ is simply removed.

In 2010, Jim adds a custom __findattr_ex__  to PyModule that calls impAttr to "re-enable mixing Java and Python code in the same directory" in https://hg.python.org/jython/rev/51ec6962b521. This was issue #1653. It is released in Jython 2.5.2.

Then Philip removes __findattr_ex__ while fixing #1811 (https://hg.python.org/jython/rev/4c672dbbcdb2#l4.8), although it is not clear that it is necessary to the fix. We release it this way in 2.5.3 and 2.7.0.

Darjus puts it back in https://hg.python.org/jython/rev/ac01360b4252 to fix #2455, and this is released in Jython 2.7.1. That re-introduces the automagical import of Python modules that this issue is about.

So, now we find ourselves considering removal again, but clearly we've been there and it's more subtle than that. It almost seems we don't quite know what we want. I imagine what we want is:

1. Python 2 semantics apply to Python packages and modules defined by __init__.py files and modules in packages as .py files. (All applies equally to $py.class and .pyc equivalents, and inside archives.)

2. A compiled Java packages is treated like a Python module (which may be a Python package), except a __init__ is not necessary to mark a package, and classes in the Java package define attributes of the Python module and are not valid targets for import. (This applies equally to file systems inside archives.)

3. These rules apply even when the directory in question is both a Python package and a Java package co-located (on the open file system or in an archive).

I believe this is possible in principle, but delicate. There is probably a rule missing too, that excepts Java classes in a package from rule 2, if they are intended to implement Python modules. A Java class implementing a module should not appear to be an attribute until explicitly imported.

I think this means that the return from __import__() ought always to be a PyModule, whereas at the moment it can return a module, javapackage or even a type object. Those modules containing or defined by Java classes may have values of __path__ and other attributes that CPython would not produce, but would still be modules.

This may be too complicated to get in 2.7.2, but I don't see an acceptable compromise position at the moment other than the divergent automagical behaviour.
History
Date User Action Args
2019-01-20 23:37:08jeff.allensetmessages: + msg12330
2018-02-27 22:34:42jeff.allensetpriority: high
nosy: + jeff.allen, fwierzbicki
messages: + msg11723
2018-02-08 16:55:32jamesmuddcreate