Message6524

Author adam.spiers
Recipients adam.spiers
Date 2011-05-09.12:16:49
Message-id <1304943410.28.0.332676629694.issue1746@psf.upfronthosting.co.za>
Content
There is a PermGen memory leak caused by the combination of
codecs.java and the CPython Lib's encodings/__init__.py.  The code in
question is:

    # Register the search_function in the Python codec registry
    codecs.register(search_function)

which adds the search_function PyFunction to
org.python.core.codecs.searchPath, which is a static PyList.  Each of
these PyFunctions has a func_globals StringMap containing a key
'codecs', whose value is a PyModule whose __dict__ is a StringMap
containing a key 'sys' (because CPython Lib's codecs.py imports
'sys'), whose value is a PySystemState instance.  So for every Jython
thread which imports encodings, that thread's PySystemState and many
related objects will never be garbage collected.  After enough Jython
threads (only around 20 in our case), PermGen is exhausted, resulting
in an OutOfMemoryError.
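
To make the retention chain concrete, here is a rough model of it in plain
Python rather than Jython. Everything in it is a stand-in: FakeSystemState
plays the role of PySystemState, the module-level list search_path plays the
role of the static PyList, and make_search_function is a hypothetical helper
that builds a function whose globals hold a 'codecs' module which in turn
holds 'sys'. It only illustrates why one entry in a static list can pin a
whole per-thread object graph; it is not the Jython code.

```python
import gc
import types
import weakref


class FakeSystemState:
    """Hypothetical stand-in for a per-thread PySystemState."""


# Models the static PyList org.python.core.codecs.searchPath.
search_path = []


def _template(encoding):
    return None


def make_search_function(state):
    """Build a function whose globals reach `state` via codecs -> sys."""
    fake_codecs = types.ModuleType("codecs")
    fake_codecs.sys = state  # models codecs.py doing `import sys`
    func_globals = {"codecs": fake_codecs}
    return types.FunctionType(_template.__code__, func_globals, "search_function")


def demo():
    state = FakeSystemState()
    ref = weakref.ref(state)

    # Models codecs.register(search_function) in encodings/__init__.py.
    search_path.append(make_search_function(state))

    # The "thread" is finished with its state, but the static list is not.
    del state
    gc.collect()
    still_alive = ref() is not None

    # Only dropping the list entry lets the state be collected.
    search_path.clear()
    gc.collect()
    collected = ref() is None
    return still_alive, collected


if __name__ == "__main__":
    print(demo())
```

In this model the state object stays reachable for as long as the function
sits in the list, which mirrors the chain described above.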

Our workaround has been to modify codecs.java to check whether
search_function is already in the PyList and, if so, avoid adding it
again, but I'm not sure whether this is optimal.  Maybe a better
solution would be to ensure that it is removed from the PyList during
thread cleanup - but only if this could be done without using
finalizers, which according to Joshua Bloch and others should be
avoided if at all possible.
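
The two options can be sketched roughly as follows, again as a plain Python
model and not as the actual codecs.java patch. The names register,
unregister and search_path are hypothetical here. The sketch compares
entries by identity, which is an assumption: this message does not say how
the real workaround decides that a function is "already in the list". The
unregister function shows explicit removal that a thread's cleanup code
could call directly, so no finalizer is involved.

```python
# Models the static PyList that holds the registered search functions.
search_path = []


def register(search_function):
    """Add search_function unless this exact object is already registered.

    Returns True if it was added, False if it was already present.
    """
    if not callable(search_function):
        raise TypeError("argument must be callable")
    # Assumption: "already in the list" means the same object (identity).
    if any(existing is search_function for existing in search_path):
        return False
    search_path.append(search_function)
    return True


def unregister(search_function):
    """Remove search_function if present; meant to be called explicitly
    from cleanup code rather than from a finalizer.

    Returns True if an entry was removed, False otherwise.
    """
    for index, existing in enumerate(search_path):
        if existing is search_function:
            del search_path[index]
            return True
    return False
```

With this shape, registering the same function object twice leaves a single
entry, and a cleanup step that calls unregister drops the reference that
would otherwise keep the function and everything reachable from it alive.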
History
Date                 User         Action  Args
2011-05-09 12:16:50  adam.spiers  set     recipients: + adam.spiers
2011-05-09 12:16:50  adam.spiers  set     messageid: <1304943410.28.0.332676629694.issue1746@psf.upfronthosting.co.za>
2011-05-09 12:16:50  adam.spiers  link    issue1746 messages
2011-05-09 12:16:49  adam.spiers  create