Issue1522

classification
Title: repeated execution of external python scripts causing PermGen out of memory exception
Type: crash Severity: critical
Components: Core Versions: 2.5.1, 2.5b1, 2.5b0, 25rc4, 2.5.0
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: pjenvey Nosy List: adam.spiers, amak, bpedman, matt_brinkley, pjenvey, rkurin, zyasoft
Priority: Keywords: patch

Created on 2009-12-08.21:03:09 by rkurin, last changed 2010-07-14.16:23:39 by zyasoft.

Files
File name Uploaded Description Edit Remove
issue1522-weakrefmap.patch zyasoft, 2010-06-22.05:43:44
class_to_type.patch zyasoft, 2010-06-25.21:52:13
Test1522.java zyasoft, 2010-06-25.21:54:04
test1522.py zyasoft, 2010-06-25.21:54:35
Messages
msg5363 (view) Author: Roman Kurin (rkurin) Date: 2009-12-08.21:03:07
Executing the python script repeatedly that imports another python class
causing OOM: PermGen exception.

After investigating this further the class org.python.core.PyType has a
static "class_to_type" map that never gets cleaned up and holds
references to generated class files and that does not allow those
classes to be clean up from PermGen. 

class_to_type has thousands of references to PythonPOSIXHandler$<some
number>. I believe the problem happens when python script imports a
class defined in python. 

The code that causes the OOM after about 30 min with -XX:MaxPermSize=32m
is included. 

Java class with the main method to run: 

public class JythonProblemTest {
	public static void main(String[] args) throws Exception
	{
		while(true)
		{
			PythonInterpreter inter = null;
			try
			{
				inter = new PythonInterpreter(null, new PySystemState());
				inter.execfile("python_src/test.py");
			}
			finally
			{
				if (inter!= null) 
				{
					inter.cleanup();
				}
			}
		}
	}
}

---------------------------------------------------------------

Python script to be executed by Java main method: 

from os import path

class MyClass(Exception):
    '''
    classdocs
    '''
    def __init__(self):
        pass

if __name__ == '__main__' or __name__ == 'main':
    pass
msg5810 (view) Author: Matt Brinkley (matt_brinkley) Date: 2010-06-15.22:20:04
I looked into this a bit, since I'm also having lots of problems with permgen space, classloader leaks etc.

I found that this is related to python classes that implement java interfaces - the root cause in this example is the "import os" call - the os module has a python class called PythonPosixHandler that implements a java interface (org.python.posix.PosixHandler)

I found that each run of the while loop below causes the size of PyType.class_to_type to increase by one: an additional entry for PythonPOSIXHandler is added, with a monotonically increasing number appended (this is added in MakeProxies.makeProxy - you can see that there is a static int that is incremented and appended) ... it seems to me that this is the root of (one of the) problem(s): the hash map lookups in class_to_type always result in a cache miss because each time the proxy is generated it has a unique name.

class_to_type seems very problematic in dynamic classloader environments (where classloaders can come and go, e.g. tomcat servlet reloading or OSGi) - classes are added to this hash map but never cleared out, and the hash map is global across all interpreter instances, so there is no way to clean it out or otherwise prevent it from growing without bound. The code here is complex enough that I can't figure out a good way to deal with this (other than adding an explicit "clearCache()" method, which would be fine by me) but maybe someone can help explain what class_to_type is used for, why the proxy names have a monotonically increasing int appended to them, etc.
msg5811 (view) Author: Matt Brinkley (matt_brinkley) Date: 2010-06-15.22:33:02
Also note that this seems very similar to the issue I pointed out in bug http://bugs.jython.org/issue1327 which involved a python class implementing a java interface - when the classloader is reloaded (in Tomcat) and the python is re-evaluated, the base class is the java class from the *old* classloader .... I strongly suspect class_to_type is the culprit here as well.
msg5812 (view) Author: Adam Spiers (adam.spiers) Date: 2010-06-15.23:20:54
Hi Matt,

Amazing coincidence that you are looking at this at the same time as me!  Although I do see PythonPOSIXHandler duplication, here the main cause of the leak is repeated compilation of the same Jython code via BytecodeLoader, resulting in class_to_type having thousands of keys which are of the form org.python.proxies.foo$bar$<some int>.  I'm not yet sure why compilation is happening repeatedly, nor why Jython takes this monotonically increasing integer approach.

Anyway, here we reached the same conclusion as you, and have in fact hacked PyType, PySystemState, and PythonInterpreter, to track these PyTypes when they are added to this map, and then remove them during PythonInterpreter.cleanup().  This seems to work OK, as long as you remember to invoke that method after every usage of PythonInterpreter!
Maybe doing this from PythonInterpreter.finalize() would be even better.
msg5813 (view) Author: Adam Spiers (adam.spiers) Date: 2010-06-16.11:59:30
I think I was wrong about the repeated compilation.  According to my debugger, makeProxy() generates a new org.python.proxies.foo class every time the code is run, not every time it is compiled.  I don't know why this is necessary, but I think this should mean these proxies "belong" to a unique PythonInterpreter and that it's safe to remove these proxies from PyType.class_to_type during finalization of that PythonInterpreter.  I'll test that now.

I would like to echo Matt's request for an explanation of how this is all supposed to work, because this part of the Jython code is extremely complex.  Is there a Jython internals document available anywhere?
msg5829 (view) Author: Jim Baker (zyasoft) Date: 2010-06-22.05:43:44
I read this bug with interest because any such memory leak bug is critical, and thus blocks 2.5.2. When I first looked at it, I only noticed the mapping of Java classes to Python types in our expose (a relatively small set of classes); however, on second look, it's also of course used by for our general Java proxy support. Hence the concern.

It would seem to me the easiest thing to do is use a WeakHashMap to track this - we only care to keep this around in PyType#class_to_type when there are still objects out there using these classes (as referenced by PyType#underlying_class, or perhaps other references).

I have attached a very small patch that passes the regrtest. Essentially it just does this change in the initialization of the class_to_type map:

class_to_type = Generic.synchronizedWeakHashMap();

Please try this and see if it addresses these memory leaks.
msg5831 (view) Author: Matt Brinkley (matt_brinkley) Date: 2010-06-23.17:38:39
I tried this out and it didn't work. After applying the patch I created a test app that recreates the problem: it is a java program that creates a PythonInterpreter that runs a python script which contains a python class that implements a java interface. After running the the script I print out the size of class_to_type, do an explicit System.gc(), and print out the size again. This whole process it done in a loop, and the output indicates that the size of class_to_type increases by one on each iteration.

I looked at the source again and I think I know why this didn't work: The weak hash map only contains weak references to the *key*, not the value (it contains a normal strong reference to the value.) The javadocs for WeakHashMap indicate that if the value has a strong reference to the key then the weak hash map will never release that entry, because the (key has a strong reference through the value back to the gc root.) This is exactly the case here: the PyJavaType value contains a reference to the key in its member variable "javaProxy" (you can see this assignment in PyJavaType.java:203)
msg5832 (view) Author: Jim Baker (zyasoft) Date: 2010-06-23.20:31:48
I committed the patch in r7067
msg5833 (view) Author: Jim Baker (zyasoft) Date: 2010-06-23.20:41:32
Matt, just saw your comment. Thanks for the analysis, we will continue to look at this. You may also want to join the discussion on #jython at freenode
msg5834 (view) Author: Jim Baker (zyasoft) Date: 2010-06-24.05:31:08
Matt, would it be possible for you to submit your memory leak test here? It would be a useful as a start to a test that we could ultimately add to our regrtest suite
msg5835 (view) Author: Matt Brinkley (matt_brinkley) Date: 2010-06-24.23:34:34
Here is the test program I used. One thing to note is that I added an accessor method for PyType.class_to_type so I could print out its size.

-------------------------------------------------------------------------
import org.python.util.PythonInterpreter;
import org.python.core.PySystemState;
import org.python.core.PyType;

public class JythonTestMain
{
    public static void runScript() throws Exception
    {
        PySystemState sys = new PySystemState();
        PythonInterpreter pi = new PythonInterpreter(null, sys);
        pi.execfile("/path/to/test.py"));
        pi.cleanup();
    }

    public static void main(String[] args) throws Exception
    {
        //(have tried this for various sizes of 'n')
        for(int i=0; i<2;i++)
        {
            runScript();
            System.out.println("class_to_type size:" + PyType.getClassToType().size());
            System.gc();
            System.out.println("class_to_type size (post-gc):" + PyType.getClassToType().size());
        }
    }
}

--------------------------------------------------------------------
(Here is the contents of test.py ... the important feature it must have is that it must contain a Python class that implements a Java interface.)

from java.lang import Comparable

class Dog(Comparable):
    def compareTo(self, o):
        return 0
    def bark(self):
        print "woof woof"

print Dog().bark()
msg5838 (view) Author: Jim Baker (zyasoft) Date: 2010-06-25.21:52:13
I have attached a patch that fixes this problem, using guava (which includes Google collections). It makes both the key and the value in class_to_type be weak. Interestingly, as built it breaks some imports, not certain why. If guava is imported separately (not in extlibs), there are no issues, so it's probably just a packaging issue.
msg5839 (view) Author: Jim Baker (zyasoft) Date: 2010-06-25.21:54:04
A version of Matt's test, but uses reflection to get at class_to_type (can't do this in Jython because PyType is *in* class_to_type!)
msg5840 (view) Author: Jim Baker (zyasoft) Date: 2010-06-26.04:04:37
Committed a Python version of Matt's test in r7071, as part of our regrtest suite
msg5841 (view) Author: Jim Baker (zyasoft) Date: 2010-06-26.06:07:51
Should be fixed by r7072, but complex enough that others should verify this too.
History
Date User Action Args
2010-07-14 16:23:39zyasoftsetstatus: pending -> closed
2010-07-02 17:59:40amaksetnosy: + amak
2010-06-26 06:07:52zyasoftsetstatus: open -> pending
resolution: fixed
messages: + msg5841
2010-06-26 04:04:38zyasoftsetmessages: + msg5840
2010-06-25 21:54:35zyasoftsetfiles: + test1522.py
2010-06-25 21:54:04zyasoftsetfiles: + Test1522.java
messages: + msg5839
2010-06-25 21:52:15zyasoftsetfiles: + class_to_type.patch
messages: + msg5838
2010-06-24 23:34:36matt_brinkleysetmessages: + msg5835
2010-06-24 05:31:09zyasoftsetmessages: + msg5834
2010-06-23 20:41:32zyasoftsetmessages: + msg5833
2010-06-23 20:31:48zyasoftsetmessages: + msg5832
2010-06-23 17:38:41matt_brinkleysetmessages: + msg5831
2010-06-22 05:43:47zyasoftsetfiles: + issue1522-weakrefmap.patch
keywords: + patch
messages: + msg5829
nosy: + zyasoft
2010-06-16 11:59:32adam.spierssetmessages: + msg5813
2010-06-15 23:20:55adam.spierssetmessages: + msg5812
2010-06-15 22:33:03matt_brinkleysetmessages: + msg5811
2010-06-15 22:20:07matt_brinkleysetnosy: + matt_brinkley
messages: + msg5810
2010-05-13 16:39:34bpedmansetnosy: + bpedman
2010-05-12 22:22:54adam.spierssetnosy: + adam.spiers
2010-04-13 00:37:50pjenveysetassignee: pjenvey
nosy: + pjenvey
2009-12-08 21:03:09rkurincreate