Issue2686

classification
Title: Re-align the logic of util.jython.run with CPython 2.7 equivalent
Type: behaviour Severity: normal
Components: Core Versions: Jython 2.7
Milestone: Jython 2.7.2
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jeff.allen Nosy List: jeff.allen, zyasoft
Priority: normal Keywords: console

Created on 2018-05-17.19:50:41 by jeff.allen, last changed 2019-01-02.09:12:26 by jeff.allen.

Messages
msg11991 (view) Author: Jeff Allen (jeff.allen) Date: 2018-05-17.19:50:40
The logic of Jython's main run() program has become quite tortured, especially regarding the treatment of interactive mode and when to install a line-editing console.

We have a reliable indication (from the launcher) whether our standard input is in fact an interactive console. However, we may not be making full use of it. It would be good not to depend on isatty here as that depends on reflective access to private data, and as if that were not enough, we are promised it will break in a future version of Java.

When a tty stream is given as an argument ("jython /dev/stdin" or "jython CON"), so that isatty returns true, run() adapts its method of reading to the result of isatty(). However, the behaviour is still not console friendly, since it attempts to read 8192 bytes at once and seems not to recognise line ends.

The logic of CPython's main.c, which has the same purpose as Jython's run(), is also a little complex but is easier to follow. Moreover, it provides the expected behaviour. We should model run() more closely on the reference implementation, diverging only where Java requires it.

Against this, we have made some changes to run() specifically to support ipython, and change risks breaking that. (https://hg.python.org/jython/rev/f08249d267c7) Possibly the requirement is simply to let a client program "type at the prompt" -- there's no .py file to execute, so it is "interactive" in that sense, yet should be line-oriented. It's not clear we do the right thing anymore in these circumstances, which are incidentally also those of #2525.
msg12088 (view) Author: Jeff Allen (jeff.allen) Date: 2018-08-20.08:10:31
I've been trying to port CPython Modules/main.c::main() into util/jython.java, using 2.7.15 as my reference, as the best way to sort out our logic. Along the way I've written an options scanner based on CPython Python/getopt.c that has potential for wider use (if we ever needed another command-line tool).
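
For readers unfamiliar with the style of scanner in CPython's Python/getopt.c, here is a minimal sketch of the technique in Python. It is not the scanner written for Jython: the function name, the return shape and the use of ValueError are all assumptions made for illustration, and long options such as --help are not handled.

```python
def scan_options(args, optstring):
    """Scan leading short options in the manner of a getopt-style scanner.

    optstring lists the accepted option letters; a letter followed by ':'
    takes an argument. Returns (options, index), where options is a list of
    (letter, argument-or-None) and index is the position of the first
    unconsumed argument. Names and error handling here are hypothetical.
    """
    options = []
    index = 0
    while index < len(args):
        arg = args[index]
        # Stop at the first non-option or at a lone "-" (meaning stdin).
        if not arg.startswith("-") or arg == "-":
            break
        index += 1
        if arg == "--":
            break  # explicit end of options: "--" itself is consumed
        cluster = arg[1:]  # e.g. "-iv" is the cluster "iv"
        pos = 0
        while pos < len(cluster):
            opt = cluster[pos]
            pos += 1
            spec = optstring.find(opt)
            if opt == ":" or spec < 0:
                raise ValueError("unknown option: -" + opt)
            if optstring[spec + 1:spec + 2] != ":":
                options.append((opt, None))
                continue
            # The option takes an argument: the rest of this word if there
            # is any, otherwise the whole of the next word.
            if pos < len(cluster):
                value = cluster[pos:]
            elif index < len(args):
                value = args[index]
                index += 1
            else:
                raise ValueError("argument expected for -" + opt)
            options.append((opt, value))
            break
    return options, index
```

A caller would normally stop scanning once it has seen -c or -m, since everything after the option's argument belongs to the program being run. This sketch leaves that decision to the caller.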

My main aim is to make our choices about interactivity and buffering the same as CPython's. This doesn't directly remove our dependence on isatty(), but I have noticed that Py_FdIsInteractive does more than just call isatty(), which may be a clue to how we handle not being able to use it under Java 9 without being told off. This, and the handling of buffering, are still the trickiest aspects to handle AFAIK, even after looking at CPython.
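
As a reading aid, the extra test that Py_FdIsInteractive applies in CPython 2.7 can be paraphrased in Python roughly as follows. This is a sketch of the C logic, not Jython code: the function and parameter names are hypothetical, and interactive_flag stands in for CPython's Py_InteractiveFlag (set by -i).

```python
import os


def fd_is_interactive(fd, filename, interactive_flag):
    """Python paraphrase of the decision made by CPython 2.7's Py_FdIsInteractive.

    fd is an OS-level file descriptor, filename is the name the source was
    given (None if it has none), and interactive_flag mirrors the -i option.
    """
    # A real terminal is always interactive.
    if os.isatty(fd):
        return True
    # Otherwise only the -i flag can make the stream count as interactive ...
    if not interactive_flag:
        return False
    # ... and then only when the "file" is really standard input.
    return filename is None or filename in ("<stdin>", "???")
```

The point of interest is the second half: with -i given, a stream that is not a tty is still treated as interactive when it stands for standard input, which is the case of a client program "typing at the prompt" through a pipe.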

There's an interesting watershed in the middle of CPython main() where Py_Initialize() is called to bring up the type system and create the first interpreter. Before this point, CPython is careful (in 2.7) not to use any PyObjects, and after this point PyObjects and the interpreter may be handled reliably. It is also careful to have main() deal with settings that are command-line only issues, and initialize() deal with those from environment variables that also affect embedded Python. Not all the landmark structures (sys.argv, sys.path) have their proper values until main() is ready to run whatever code has been indicated.

We do not achieve this clean switch (which itself follows a controlled sequence in CPython): some actions are performed using methods in Py and PySystemState before PySystemState.initialize() is called, so the static initialisation that entails is what brings up at least the Python type system. This might be my fault. But even if these calls did not occur, the very fact that it is PySystemState.initialize() we call causes the static initialisation of that class before initialize() runs. When working on PyType I noticed creation was triggered from different places for embedded and "jython" start-up.

It would stabilise this situation if we were to defer all static initialisation of PySystemState until invoked by initialize(), and to create a bolt-hole for utility methods, that is not Py, if they might be needed before the watershed. I believe I can see the shreds of this design principle in PySystemState, but if it was documented anywhere, then I missed it.

C doesn't have this static initialisation property, but it is interesting to note that Py_Initialize() and its relatives are in pythonrun.c not the sys module.

We have a couple of extra things to deal with: the Jython registry and running Python code from a JAR, but I think CPython shows us how to incorporate these by analogy with environment variables and the runnable zip.

---
A couple of interesting references:
https://www.python.org/dev/peps/pep-0432/ PEP 432 Restructuring the CPython startup sequence (only applies directly to 3.x).
https://wiki.python.org/moin/CPythonInterpreterInitialization Python wiki contains an interesting analysis of the sequence of events during startup, based on 3.x before PEP 432, but close to what's observable in 2.7.15.
msg12089 (view) Author: Jeff Allen (jeff.allen) Date: 2018-08-21.08:00:19
Here's an interesting way to find out whether stdin (and stdout) are interactive, that I hadn't noticed before:

PS jython-trunk> dist\bin\jython -c "from java.lang import System; print repr(System.console())"
java.io.Console@5a2fa51f
PS jython-trunk> echo hello | dist\bin\jython -c "from java.lang import System; print repr(System.console())"
None
PS jython-trunk> dist\bin\jython -c "from java.lang import System; print repr(System.console())" > x.tmp
PS jython-trunk> type x.tmp
None

Although it is a little difficult to follow internally, through SharedSecrets, it is clear that in the end Java calls an internal private istty() that determines whether the static console variable will be null or an instance of Console.class, so it is not simply a quirk of one platform or version.
msg12090 (view) Author: Jeff Allen (jeff.allen) Date: 2018-08-27.06:47:41
There are quite a few wrinkles here that this potentially irons out. E.g. if you run a script with the -i flag, you're supposed to end up in an interactive session, even if an exception is raised. This is clearly useful, and not what happens in Jython.

However, deeper difficulties are not so much about the structure of jython.run() itself as about the relationships amongst interpreter, thread and system states, which are somewhat different from CPython. It seems (Jython) PySystemState does the job of both (CPython) PyInterpreterState and the sys module, so initialize() doesn't create a distinct primary interpreter. Elsewhere we make a PythonInterpreter that looks a bit like (CPython) PyInterpreterState, but is also roughly equivalent to the PyRun_* C-API.

Mainly I had hoped to sort out the isatty/interactive muddle, avoiding isatty(fd) (to satisfy Java 9), and maybe fix #2525 and #2305 as a side effect. I'm getting there, but just copying CPython isn't really possible while we diverge in our lifecycle objects. I'm gradually finding a synthesis of the CPython main() and current Jython run().
msg12091 (view) Author: Jeff Allen (jeff.allen) Date: 2018-09-03.19:41:57
Getting there, one of the signs being that "inspect" is now done right:

Current tip:
PS jython-trunk> dist\bin\jython -i -c"a=6*7; exit()"
PS jython-trunk>

My uncommitted work:
PS jython-jvm9> dist\bin\jython -i -c"a=6*7; exit()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Jeff\Documents\Eclipse-O\jython-jvm9\dist\Lib\site.py", line 394, in __call__
    raise SystemExit(code)
SystemExit: None
>>> a
42
>>> exit()
PS jython-jvm9>

Which is identical to CPython. What I have is a curious synthesis of CPython and the original code, but it seems to work and I believe will be comprehensible to us next time we have to touch it. I still need to try this with a fancy console (JLine) and support -jar.
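
The behaviour in the transcript above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the code in jython.java: run_command and its return convention are hypothetical names chosen here.

```python
import traceback


def run_command(source, namespace, inspect):
    """Run source as -c would; inspect mirrors the -i flag.

    Returns an exit status, or None when the caller should go on to an
    interactive session that uses the same namespace.
    """
    status = 0
    try:
        exec(compile(source, "<string>", "exec"), namespace)
    except SystemExit:
        if not inspect:
            raise  # normal case: let the exit request end the process
        traceback.print_exc()  # with -i: report the exit, then carry on
    except Exception:
        traceback.print_exc()
        status = 1
    # With -i we always fall through to the prompt, whatever happened.
    return None if inspect else status
```

A caller that receives None would then start the read-eval-print loop on the same namespace (in plain Python, code.interact(local=namespace) does this), which is why a is still 42 at the prompt.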
msg12115 (view) Author: Jeff Allen (jeff.allen) Date: 2018-09-20.09:11:48
I have this in a good state now and think I've shaken out a lot of latent bugs and divergences from CPython. Hopefully, next time we have to work on this logic, we'll understand it easily, and have a comparator in CPython. (It's confused me from the start, and I know I made it subtly worse.)

I'll commit as it now is, but I have the following to-do list:

1. Actually remove SystemRestart (as discussed on jython-dev). What I did was restore it, half-working, but only so I can kill it in one easily-identifiable change.

2. Factor out (into PyOS.java?) utility and OS-like things it is safe to do before Python types exist. I am influenced here by https://www.python.org/dev/peps/pep-0432/ . At the moment, too many things are in Py.java and PySystemState, which, as soon as you touch them, bring the whole Python type system prematurely to life.

3. Implement os.environ and the registry in pre-Python space so we can use them for configuration before starting Python.

4. Reconcile the several ways of deciding if the console is interactive into one (pre-Python) strategy.

5. Check I've not broken Jython on Linux.

6. Check all this in an environment with non-ascii installation and user paths.
msg12119 (view) Author: Jeff Allen (jeff.allen) Date: 2018-09-23.14:01:34
From my to-do list, #3 turns out to be a bad idea (although centralising access to System properties is good), and I'm ready for #5, which I find easiest to do by pushing the changes passing on Windows, and pulling them down on Linux.

5 change sets leading up to this one: https://hg.python.org/jython/rev/fbe8e11c24c8

The remaining annoyance relates to JYTHONPATH. I have raised #2706 to give us a chance to review.
msg12122 (view) Author: Jeff Allen (jeff.allen) Date: 2018-09-25.07:37:16
#5 Yes, I have broken Jython on Linux. There's a bug in my handling of -Dkey=value.

This is easily fixed. It doesn't emerge on Windows because the launcher jython.py collects -D and -J-D options and makes them all Java -D definitions. That design also changes the semantics slightly, relative to the shell script, as *Java* -D definitions (supplied by -J-D) become pre-properties in initialisation, while -D definitions given to *Jython* directly become post-properties, so take ultimate precedence in the registry.
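
The precedence described here can be pictured as layered dictionaries, later layers overriding earlier ones. In this sketch the function name is hypothetical, and placing the registry file between the two sets of properties is an assumption; only the statement that post-properties win comes from the description above.

```python
def build_registry(pre_properties, registry_file, post_properties):
    """Combine configuration sources, lowest precedence first.

    pre_properties:  Java -D definitions (e.g. supplied through -J-D).
    registry_file:   settings read from the registry file (assumed middle layer).
    post_properties: -D definitions given to Jython directly.
    """
    registry = dict(pre_properties)
    registry.update(registry_file)    # assumed to override pre-properties
    registry.update(post_properties)  # takes ultimate precedence
    return registry
```

So a key given as -J-Dkey=a can be overridden further up, while the same key given as -Dkey=b to Jython itself cannot. A launcher that turns every -D into a Java definition therefore moves the setting from the top layer to the bottom one.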

However, jython.py appears to have been broken on Linux for some time by the shebang line. (#2707 raised.)

jython.py is also slightly broken on both Linux and Windows in the way it handles -c: it fails always to preserve the option argument (the code) as a single string by quoting. If I work around #2707 and fix the -D bug

For me, in a developer-built Jython, dist/bin/jython means the shell script, but in test_jython_launcher, jython.py is explicitly used. Then you can see the command string, having lost its quotes, is treated as a series of arguments not as a single option to -c. I'd observed some weirdness on Windows, but it varied between posh and cmd, and I put it down to quote-handling in the shell. I may have broken this when refactoring the launcher for unicode.

Grrr. Every time I get close to closing a bug I seem to have to open two more.
msg12124 (view) Author: Jeff Allen (jeff.allen) Date: 2018-09-27.08:11:49
The problem with quoting the argument to -c only occurred in composing the output from --print.
So the test failed, but Jython actually worked. In a previous round of change I had simplified processing, thinking I was only interested in making a windows launcher. However, this is now fixed at:
https://hg.python.org/jython/rev/38824a8816a8
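
To illustrate the kind of failure involved (this is not the code in jython.py, and format_command is a hypothetical name), printing a command line by joining the arguments with spaces loses the boundaries of the -c argument, whereas quoting each argument first keeps them. The sketch uses shlex.quote from Python 3, which follows POSIX shell rules, so quoting for cmd or PowerShell would need different treatment.

```python
import shlex


def format_command(args):
    """Render an argument list as one POSIX shell command line.

    Each argument is quoted separately, so an argument containing spaces or
    shell metacharacters (such as the code given to -c) stays a single word.
    """
    return " ".join(shlex.quote(arg) for arg in args)


# A command of the kind a launcher might print (the contents are illustrative).
command = ["java", "org.python.util.jython", "-c", "a=6*7; print(a)"]

naive = " ".join(command)          # the code argument falls apart at the space
quoted = format_command(command)   # the code argument survives as one word
```

Splitting the quoted form with shlex.split gives back the original four arguments, while splitting the naive form gives five.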

This now works on my Linux machine, even when launched as:

$ dist/bin/jython.py -m test.regrtest -e

That way one ends up with sys.executable = 'dist/bin/jython.py' and so it gets used all over the place, not just the launcher test.

I've fixed #2707 in the same change, or at least I've removed the -E for want of anything clever. Circle CI likes us again, but the other CI bots are sulking. I still have to check this with non-ascii paths.
msg12207 (view) Author: Jeff Allen (jeff.allen) Date: 2018-12-16.15:21:31
I have now checked this with non-ascii paths (a Chinese user name), on Windows, where we had trouble before. Limitations in ANTLR (I think, maybe Ant) prevent us building so I did this via the installer.

Neither the launcher nor the JLine console works correctly unless the PC is properly localised through the control panel and a restart. (I notice TEMP and TMP are invalid.)

However, properly localised this all goes fine and any test failures I'm seeing are attributable to other things (like callbacker.jar missing). So I declare success on this.
History
Date User Action Args
2019-01-02 09:12:26 jeff.allen set status: pending -> closed
2018-12-16 15:21:31 jeff.allen set status: open -> pending
resolution: accepted -> fixed
messages: + msg12207
2018-09-27 08:11:49 jeff.allen set messages: + msg12124
2018-09-25 07:37:17 jeff.allen set messages: + msg12122
2018-09-23 14:01:34 jeff.allen set messages: + msg12119
2018-09-20 09:11:49 jeff.allen set resolution: accepted
messages: + msg12115
milestone: Jython 2.7.2
2018-09-03 19:41:59 jeff.allen set messages: + msg12091
2018-08-27 06:47:43 jeff.allen set messages: + msg12090
2018-08-21 08:00:20 jeff.allen set messages: + msg12089
2018-08-20 08:10:32 jeff.allen set nosy: + zyasoft
messages: + msg12088
2018-08-13 19:58:07 jeff.allen link issue2656 dependencies
2018-05-17 19:50:41 jeff.allen create