Issue2846

classification
Title: Main module __name__ is not "__main__" under Java Scripting API
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7.1
Milestone: Jython 2.7.2
process
Status: pending Resolution: fixed
Dependencies: Superseder:
Assigned To: jeff.allen Nosy List: alexgobbo, jeff.allen, zyasoft
Priority: normal Keywords:

Created on 2019-12-11.13:18:40 by alexgobbo, last changed 2020-01-30.08:16:46 by jeff.allen.

Messages
msg12822 Author: A. Gobbo (alexgobbo) Date: 2019-12-11.13:18:39
In both Jython 2.7.1 and 2.7.2b2, the name of the top-level scope is wrong when running Jython under the Java Scripting API.

If you execute:
   org.python.util.PythonInterpreter interp = new org.python.util.PythonInterpreter();
   interp.exec("print __name__");

You get the familiar:
   __main__

However, executing: 
   javax.script.ScriptEngine engine = new javax.script.ScriptEngineManager().getEngineByName("python");
   engine.eval("print __name__");
You get:
   __builtin__

This of course creates a lot of problems.
Many people add local workarounds in their code, such as:
  if __name__ in ['__builtin__', '__main__']:

I use a more global workaround to cope with library code that checks __name__.
Just after initialising the engine I do:
   engine.put("__name__", "__main__");
   engine.eval("import sys");
   engine.eval("sys.modules['__main__']=sys.modules['__builtin__']"); 

This solves some issues, but not all.
In particular, it is a show stopper for unittest, which does not work at all under the Java Scripting API.


Note: This is an old bug, which has been reported in other places, but I didn't find it in bugs.jython:
   https://github.com/scijava/scripting-jython/issues/9
   https://forum.image.sc/t/jython-isssue-with-if---name-----main--/5544
msg12830 Author: A. Gobbo (alexgobbo) Date: 2019-12-13.08:08:15
This seems to be deliberate. In the constructor of PythonInterpreter, PyModule("__main__"...) is only created if !useThreadLocalState. This parameter is set to true only by /jsr223/PyScriptEngine.java
msg12832 Author: Jeff Allen (jeff.allen) Date: 2019-12-13.23:25:56
Ah, I see. It's not that we set __name__ to this surprising value. Rather, __name__ is resolved by look-up in locals, globals and builtins, and found in the last.
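
The look-up order described above can be reproduced without Jython. The following is a minimal sketch in CPython 3, where the builtins module is named `builtins` rather than `__builtin__` as in Python 2 and Jython 2.7. The variable names are illustrative only and do not come from the Jython code base.

```python
import builtins

# An empty dict stands in for a namespace that was never given a __name__.
empty_globals = {}
name_seen = eval("__name__", empty_globals)

# eval() inserts a __builtins__ entry so that look-up can fall through to it,
# but it does not add a __name__ binding to the dict itself.
has_builtins_key = "__builtins__" in empty_globals
has_own_name = "__name__" in empty_globals

# When the globals do bind __name__, that binding wins over the builtin one.
main_like_globals = {"__name__": "__main__"}
name_with_binding = eval("__name__", main_like_globals)

print(name_seen, has_builtins_key, has_own_name, name_with_binding)
```

So the surprising value is simply the name of the builtins module, reached because nothing earlier in the chain binds `__name__`.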

The difficulty is to be sure of the intent of the design. It looks like ScriptEngine.eval() is the rough equivalent of the exec statement with a user-supplied dictionary (CPython 2.7.16):

>>> exec "print globals().keys()" in {}
['__builtins__']

Jython 2.7.2b3:

>>> from javax.script import ScriptEngineManager
>>> engine = ScriptEngineManager().getEngineByName("python")
>>> engine.eval("print globals().keys()")
['__builtins__']

You're evidently expecting it to be more like executing a file using the python command. Hard to say who's right.
msg12872 Author: Jeff Allen (jeff.allen) Date: 2019-12-22.07:58:35
Having poked around a bit, it looks harmless to meet these expectations, but I wish I understood the thread-dependent logic here and (what seems to be) a deliberate choice not to create a module.
msg12934 Author: Jeff Allen (jeff.allen) Date: 2020-01-26.20:57:26
Code archeology shows that the thread-local state is a response to #1426. This all stems from a desire to avoid re-creating the ScriptEngine, but instead serve multiple (concurrent?) purposes by the client code keeping a context for each. This suggests the context should hold all the interpreter state.

It's a complicated discussion.

A context is not a thread, but the solution seems to be that we swap local variables (or are they global?) according to the current *thread*. I don't see how that achieves the aim, but it seemed to satisfy those involved at the time.

It's not the time to re-work it. So I guess I should avoid changing this too much. In particular, restoring a __main__ module might be going too far.

I'll spend some time tracing this through.
msg12942 Author: Jeff Allen (jeff.allen) Date: 2020-01-28.22:38:34
At first I was surprised by:

>>> eval("__name__")
'__main__'
>>> eval("__name__",{})
'__builtin__'

but of course in the first eval() the global dictionary is inherited from the REPL, and so contains a binding for __name__, while in the second case the global dictionary is empty, and __name__ is found in the __builtins__ module. There may be a clue here for how our JSR-223 implementation should behave.

To judge by org.python.jsr223.ScriptEngineTest.testThreadLocalBindings() and the discussion on #1426, the use case is:
1. Get a "Python" engine e from the manager.
2. Optionally use e with one or more calls e.eval(script).
3. Get an empty Bindings object b (essentially an empty dictionary) and optionally populate it.
4. Use the engine with one or more calls e.eval(script, b).

b acts as the module namespace (globals) for all eval() calls that mention it. You can use as many distinct name spaces as you like with one engine. The engine has one PySystemState, with its sys.modules list, but the default context, and each b, contains a distinct module state.
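
As a rough analogy for this use case, plain dictionaries can play the role of the Bindings objects. This is a sketch in CPython 3, not the JSR-223 API: the helper `run` is a hypothetical stand-in for e.eval(script, b), and the single Python process stands in for the one engine with its one sys.modules.

```python
import sys


def run(script, namespace):
    """Hypothetical stand-in for e.eval(script, b): run script with namespace as globals."""
    exec(script, namespace)


# Two independent "Bindings", each starting empty.
b1 = {}
b2 = {}

# The same variable name holds different values in each namespace.
run("x = 1", b1)
run("x = 2", b2)

# Both namespaces import the same sys module and see the same sys.modules.
run("import sys; mods = sys.modules", b1)
run("import sys; mods = sys.modules", b2)

print(b1["x"], b2["x"], b1["mods"] is b2["mods"])
```

The globals are isolated per dictionary, while anything reached through sys is shared by every namespace.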

These states might be used from different threads, but they don't have to be. I still don't quite get the idea behind namespace juggling being done using a ThreadLocal, or why a change of Bindings isn't enough. It's also bugging me that the calls involved are setLocals() and getLocals(), while it is clearly interp.globals that they manipulate (sometimes).


I think it is reasonable that the calls at step 2 should find an environment like the REPL, with __name__ set, while the calls at step 4 should see an empty dictionary, by analogy with eval(). This ought to be possible, but I'm finding the thread-based juggling obscures the logic.
msg12944 Author: Jeff Allen (jeff.allen) Date: 2020-01-30.08:16:45
If I perform the module creation (on engine creation) unconditionally, I get a behaviour quite similar to Python script execution from the command-line and standard eval in the REPL:

>>> from javax.script import *
>>> engine = ScriptEngineManager().getEngineByName("python")
>>> c = SimpleScriptContext()
>>> engine.eval("print globals().keys()")
['__builtins__', '__doc__', '__name__', '__package__']
>>> engine.eval("print __name__")
__main__

So __name__ == "__main__" and a module with the engine's name space is added to sys.modules.

This does not otherwise interfere with the thread-bound globals/locals, so although I don't understand the use case for it, I feel this can't have broken it.

I *do* understand the use case for swapping the name space, when a Bindings object (or a ScriptContext) is given explicitly. These behave (as before) in parallel with exec/eval when explicitly given a dictionary:

>>> engine.eval("print globals().keys()", c)
['__builtins__']
>>> engine.eval("print __name__", c)
__builtin__

The isolation of name spaces supports re-use of the engine, but there is no security value in it. With an explicit Bindings or ScriptContext, the namespace of the engine is reachable via sys.modules:

>>> engine.eval("import sys; print sys.modules['__main__'].__dict__.keys()", c)
['__builtins__', '__doc__', '__name__', '__package__']
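
The same reachability can be modelled in CPython 3 without Jython. This is a sketch under assumptions: the module name `engine_main_demo` is hypothetical (using `__main__` would replace the real main module of the running process), and exec() with a module's `__dict__` stands in for an engine.eval() call in the default context.

```python
import sys
import types

# Hypothetical module name standing in for the engine's "__main__".
ENGINE_MODULE = "engine_main_demo"

# Create the module up front and register it, as the fix does on engine creation.
engine_module = types.ModuleType(ENGINE_MODULE)
sys.modules[ENGINE_MODULE] = engine_module

# Default-context call: the script runs in the module's own namespace,
# so __name__ resolves to the module's name.
exec("secret = 42; seen_name = __name__", engine_module.__dict__)

# Call with an explicit, initially empty namespace: __name__ falls through
# to the builtins module, and 'secret' is not directly visible.
isolated = {}
exec("name_here = __name__", isolated)

# The engine's namespace is still reachable through sys.modules.
exec("import sys; leaked = sys.modules['engine_main_demo'].secret", isolated)

print(engine_module.seen_name, isolated["name_here"], isolated["leaked"])

# Remove the demo module again so the process is left as it was found.
del sys.modules[ENGINE_MODULE]
```

The separate dictionary keeps names from colliding, but it is a convenience rather than a sandbox.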

Now in the code base at https://hg.python.org/jython/rev/58d45a33c32f .
History
Date                 User        Action  Args
2020-01-30 08:16:47  jeff.allen  set     status: open -> pending
                                         resolution: accepted -> fixed
                                         messages: + msg12944
                                         title: Wrong name of the top-level scope when running Jython under Java Scripting API -> Main module __name __ is not "__main__" under Java Scripting API
2020-01-28 22:38:34  jeff.allen  set     assignee: jeff.allen
                                         messages: + msg12942
2020-01-26 20:57:26  jeff.allen  set     messages: + msg12934
2019-12-22 07:58:35  jeff.allen  set     messages: + msg12872
2019-12-13 23:25:56  jeff.allen  set     nosy: + jeff.allen, zyasoft
                                         messages: + msg12832
2019-12-13 08:08:15  alexgobbo   set     messages: + msg12830
2019-12-13 07:58:08  jeff.allen  set     priority: normal
                                         resolution: accepted
                                         milestone: Jython 2.7.2
2019-12-11 13:18:40  alexgobbo   create