Issue2701

classification
Title: JVM seg Faulting when running standalone jython under a child first class loader
Type: crash Severity: normal
Components: Core Versions: Jython 2.7
Milestone: Jython 2.7.0
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Hardy, stefan.richthofer
Priority: Keywords:

Created on 2018-08-09.22:09:26 by Hardy, last changed 2018-08-10.22:25:34 by stefan.richthofer.

Files
File name Uploaded Description
hs_err_pid14832.log.txt Hardy, 2018-08-09.22:10:18
hs_err_pid10516.log.txt Hardy, 2018-08-09.22:10:49
Messages
msg12074 Author: Hardy Bhatia (Hardy) Date: 2018-08-09.22:09:43
Hello,

We have an application that hosts plugins (a plugin is a mixture of Jython scripts and some jars containing Java classes). Plugin code can bring in Java types with the same names but different versions, and PythonInterpreter instances are not 100% independent of each other, even with different PySystemState objects. We therefore load each plugin, along with Jython.jar, under a dedicated custom child-first class loader instance derived from URLClassLoader. It resolves types from these jars, or from other child-first class loader instances associated with the plugin, before delegating to class loaders higher up the hierarchy.
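For readers unfamiliar with the pattern, here is a minimal sketch of a child-first loader built on URLClassLoader. It is not the reporter's actual class: the name ChildFirstClassLoader is borrowed from later in this report, and the delegation rule (everything except `java.*` is looked up locally first) is an assumption. A production loader would usually also override resource lookup, which is omitted here.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Illustrative child-first loader (hypothetical, not the reporter's code). */
class ChildFirstClassLoader extends URLClassLoader {
    static {
        // Allow concurrent loading of different class names.
        ClassLoader.registerAsParallelCapable();
    }

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: search this loader's own URLs.
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // Not found locally: fall back to parent-first lookup.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With this arrangement, two plugins that ship different versions of the same class each see their own copy, because each loader consults its own jars before asking its parent.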

The child class loaders for the plugins are constructed on a thread we will call Thread A.

We then load a Java class using one of these child-first class loaders (not the Jython one). This class creates a preset type and a dedicated worker thread for Jython interpretation (we will call this the worker thread).

On the worker thread we call PySystemState.initialize (with some of the Java jars on the syspath) and create a PythonInterpreter instance. We then start executing our scripts.

To unload a plugin, I do the following in order:
Close the interpreter.
Set the interpreter locals to Py.newStringMap().
Close every SyspathArchive found in systemState.path.
Call Py.getSystemState().close().
Wait for the worker thread to exit.

Once the worker thread has exited, we either call close on the ChildFirstClassLoader (which just invokes URLClassLoader's implementation) or let it be garbage collected; I have experimented with both.
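The ordering described above (cleanup on the worker thread, wait for the thread to exit, then close the class loader) can be sketched with the standard library alone. This is only an illustration of the sequence, not the application's code: PluginUnloadSketch and its unload method are hypothetical names, and the Jython calls appear only as comments, quoted from this report and not verified here.

```java
import java.io.IOException;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the unload order; records each step as it happens. */
class PluginUnloadSketch {

    static List<String> unload(URLClassLoader pluginLoader)
            throws InterruptedException, IOException {
        List<String> steps = Collections.synchronizedList(new ArrayList<>());

        Thread worker = new Thread(() -> {
            // In the real application, the steps from the report would run here:
            //   close the interpreter
            //   set the interpreter locals to Py.newStringMap()
            //   close every SyspathArchive found in systemState.path
            //   Py.getSystemState().close()
            steps.add("interpreter cleanup on worker");
        }, "jython-worker");
        worker.setContextClassLoader(pluginLoader);
        worker.start();

        // Do not touch the loader until the worker thread has fully exited.
        worker.join();
        steps.add("worker exited");

        // Same effect as the report's close(): URLClassLoader's implementation.
        pluginLoader.close();
        steps.add("loader closed");
        return steps;
    }
}
```

Note that this sequence only releases Java-level resources. It does not by itself unload native libraries that were loaded through the plugin's class loader, which is the area the crash dumps point at.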

We run an instance of the classloader leak prevention library to host the ChildFirstClassLoader, so that it performs the various cleanups without which the class loader instances might leak.
https://github.com/mjiderhamn/classloader-leak-prevention

For the most part this works quite well, but every now and then the process segfaults, apparently because of something related to Jython's use of the jnr-ffi library.
 
It appears JRuby suffered from the same problem due to this library.
https://github.com/jruby/jruby/issues/4506

I am attaching the JVM crash dumps from our experiments. I am using Jython 2.7.0.

Any suggestions on how to address this issue, or on what is causing it?

PS: Any information or explanation of how the native library life cycles are expected to work in Jython's implementation would be a great help.
msg12075 Author: Stefan Richthofer (stefan.richthofer) Date: 2018-08-10.12:54:31
According to https://github.com/jruby/jruby/issues/4506 it seems a solution is on the way. Since jnr-ffi is an external library, you may be better off pushing for a solution in their issue tracker. Our best bet is to update Jython's jnr-ffi version as soon as a fix for this issue is available.

That said, you should in general work with Jython 2.7.1 or whatever the newest available version is. There certainly won't be a fix in 2.7.0. However, once a fixed jnr-ffi is available, it is a rather small modification to build the old 2.7.0 with just this dependency updated. Still, 2.7.1 or newer is recommended for its various other improvements.
msg12077 Author: Hardy Bhatia (Hardy) Date: 2018-08-10.21:07:30
Hello Stefan,
Thank you for your suggestion. I did try dropping in Jython 2.7.1 in place of 2.7.0 and ran some tests, and it appears that a couple of our client plugins that work correctly against 2.7.0 were broken by the move. This may be due to the presence of cached/compiled $pyc files in some directories. I am a little nervous about migrating up.
I am wondering, though, whether it would be better to recompile 2.7.0 myself with some additional hooks/methods that allow releasing the native jffi libraries deterministically before the plugin class loader is closed, perhaps something I can invoke after closing the interpreter and the PySystemState object. Any thoughts or suggestions?
msg12080 Author: Stefan Richthofer (stefan.richthofer) Date: 2018-08-10.22:25:33
First a hint on upgrading a single dependency:
To upgrade jnr, replace the jar in the extlibs folder with the version you intend to use. Then search build.xml for the name of the old jar and replace each occurrence with the new name. Then run ant build, ant jar-standalone, ant installer, or whatever target you need.

Regarding hooks:
I don't have the capacity to get into jnr internals right now. If you know what hook you'd like to add, and which method you need to call at which point, we can iterate again.

Regarding Jython 2.7.1:
I recommend that you keep investigating an update path. File issues here if it is Jython's fault that your plugins break. If it has to do with pyc files (I'm surprised you rely on pyc files), an important change is that Jython 2.7.1 now supports (and requires) Python 2.7 bytecode, while Jython 2.7.0 still used Python 2.5 bytecode. So maybe you just need to rebuild your pyc files with CPython 2.7. The pyc bytecode format changed in a backwards-incompatible way between Python 2.5 and 2.7.
History
Date User Action Args
2018-08-10 22:25:34 stefan.richthofer set messages: + msg12080
2018-08-10 21:07:31 Hardy set messages: + msg12077
2018-08-10 12:54:32 stefan.richthofer set nosy: + stefan.richthofer; messages: + msg12075
2018-08-09 22:10:50 Hardy set files: + hs_err_pid10516.log.txt
2018-08-09 22:10:22 Hardy set files: + hs_err_pid14832.log.txt
2018-08-09 22:09:51 Hardy set files: - bug statement.txt
2018-08-09 22:09:43 Hardy set messages: + msg12074
2018-08-09 22:09:26 Hardy create