Issue1786

classification
Title: sys-package-mgr does not close file handles properly
Type: crash Severity: normal
Components: Core Versions: 2.5.1
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: fwierzbicki Nosy List: fwierzbicki, jeff.allen, otmarhumbel, richardfearn, spider
Priority: normal Keywords:

Created on 2011-08-05.07:42:09 by spider, last changed 2011-12-31.18:35:45 by fwierzbicki.

Files
File name Uploaded Description Edit Remove
Issue1786 jeff.allen, 2011-08-13.18:48:07 Patch addressing Issue1786
Messages
msg6597 (view) Author: Matthew Webber (spider) Date: 2011-08-05.07:42:08
When our application (Java + Jython 2.5.1) started, we saw the following in the console log:

    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.gtk.linux.x86_64_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.gtk.solaris.sparc_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.gtk.solaris.x86_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.aix.ppc_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: can't write cache file for '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.aix.ppc_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.hpux.ia64_32_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: skipping bad jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.hpux.ia64_32_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.linux.x86_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: skipping bad jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.linux.x86_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: processing new jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.solaris.sparc_3.6.2.v3659b.jar'
    [junit] *sys-package-mgr*: skipping bad jar, '/scratch/jenkins_workspace/GDA_8.14.tests.RH5_64_bad-0/thirdparty/eclipse/plugins/org.eclipse.swt.motif.solaris.sparc_3.6.2.v3659b.jar'
 
The key messages here are "can't write cache file" followed by lots of "skipping bad jar". The application crashed shortly thereafter with
java.io.FileNotFoundException: <path>/InterfaceProvider.class (Too many open files)

If we disabled the Jython cache (we removed write permissions from the cache directory), then everything ran without problems.

We managed to reproduce the problem on our test system, and the problem is clearly _not_ with the specific jars that report "can't write cache file" or "skipping bad jar". We were running with the default "ulimit -n" of 1024, and if this was increased to 2048, the problem no longer occurred.

So, a couple of observations:
(1) The text "skipping bad jar" is not the correct description of the problem. I guess that this gets reported on any exception being thrown - this makes problem diagnosis difficult. It would be better if the error message gave more details about why it failed.

(2) I wonder if the *sys-package-mgr* code is cleaning up it's file handles properly. See the last comment in http://stackoverflow.com/questions/2272908/java-io-filenotfoundexception-too-many-open-files, and also http://coldfused.blogspot.com/2007/02/mystery-of-too-many-open-files.html.
msg6600 (view) Author: Jeff Allen (jeff.allen) Date: 2011-08-13.14:30:32
I have been investigating this with Process Explorer on Windows 7 and can see whathe problem is. I intend to offer a (simple and obvious) patch when I can get the mechanics right -- hg, roundup etc. all new to me. (JA)
msg6601 (view) Author: Jeff Allen (jeff.allen) Date: 2011-08-13.18:48:07
Analysis:

In org.python.core.packagecache.CachedJarsPackageManager.addJarToPackages, jar files being indexed on first use are not explicitly closed. The patch attached corrects that (in one line).

Jython is creating an index for each jar in its classpath. The file descriptors become unreferenced, and will be garbage collected, but not soon enough to avoid FD exhaustion in some systems. An explicit close is required.

I have developed this on v2.5.1, the product reported at fault. However, the same code is in multiple versions and the 2.6 tip. I am using Windows 7 x64, which seems never to run out of fds.

Jython already has open (and keeps open) seemingly the same jar files as a side effect of initialisation: I assume with good reason. A superfluous jars on the classpath will strain resources, but at least now we only have one copy open.

Test:

I have not been able to make an automated test but I have observed the behaviour with Process Explorer on Windows 7 x64 and JDKs 1.5.0_22 and 1.6.0_26. The 1.6 JVM uses significantly fewer handles (around 240 peak) compared with the 1.5 (around 420) in the patched code.

Testing on a system that actually demonstrates fd exhaustion would seem a wise precaution but is not open to me.

I have run the regression test (ant clean regrtest) on the modified product with the same results for the patched code as on a clean v2.5.1 check-out. (That is 306 passes, 3 skipped, 2 fails, 2 unexpected.) The 2 failures are test_cmd_line (hangs, needs killing 3 times) and test_socket.

Action:

Would a project member rights to commit please take this up? This is my first contribution to Jython. If there is any other ceremony needed, or a different process, please let me know.
msg6604 (view) Author: Jeff Allen (jeff.allen) Date: 2011-08-16.17:35:40
A possible work-around is to take more explicit control of the JARs Jython attempts to cache through properties. There's documentation here:
http://jythonpodcast.hostjava.net/jythonbook/en/1.0/ModulesPackages.html#how-jython-finds-the-jars-and-classes-to-scan
msg6701 (view) Author: Jeff Allen (jeff.allen) Date: 2011-11-04.08:38:27
I posted a patch for this. Would someone with commit rights please either commit it or explain what I still have to do?

Or you could invite me to be a committer, although I hear inexperienced hg users can make a bit of a mess.
msg6747 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2011-12-31.18:34:53
Fixed in 2.5 maintenance and default branch (targeting 2.6) Thanks Jeff Alan!
msg6748 (view) Author: Frank Wierzbicki (fwierzbicki) Date: 2011-12-31.18:35:45
I mean thanks Jeff Allen! I did get it right in the commit message :)
History
Date User Action Args
2011-12-31 18:35:45fwierzbickisetmessages: + msg6748
2011-12-31 18:34:54fwierzbickisetstatus: open -> closed
priority: normal
resolution: fixed
messages: + msg6747
2011-12-14 16:04:21fwierzbickisetassignee: fwierzbicki
2011-11-04 08:38:28jeff.allensetmessages: + msg6701
2011-09-04 17:11:29richardfearnsetnosy: + richardfearn
2011-08-16 17:35:40jeff.allensetnosy: + otmarhumbel, fwierzbicki
messages: + msg6604
2011-08-13 18:48:08jeff.allensetfiles: + Issue1786
messages: + msg6601
2011-08-13 14:30:33jeff.allensetnosy: + jeff.allen
messages: + msg6600
2011-08-05 07:42:09spidercreate