|Recipients||darjus, jamesmudd, jeff.allen, stefan.richthofer, zyasoft|
|Marked as misclassified||Yes|
I agree with Jeff's analysis. Clearly the necessary cycle of waiting to get a deadlock was caused by what #2536 identified, finally blocks for releasing internal locks not being called due to out of memory errors. (This scenario violated the contract that we assumed through the usage of ConcurrentHashMap.) Other than that issue, we have had no reports of other deadlocks in this specific code; and I also do not see any deadlock path in the use of class_to_type. We should also synchronize fromClassSkippingInners. This change restores correctness. Next, let's look at our caching, and its corresponding performance. Using weak keys/weak values in class_to_type means that when working with Java objects (calling methods, accessing attributes), Python code that does not maintain a strong reference to the Java class will cause the Jython runtime to constantly rebuild these entries in class_to_type. Here are the ways that such strong references would be established: 1. exposedTypes (so Jython's internals) https://hg.python.org/jython/file/1a14d360dc8d/src/org/python/core/PyType.java#l104 2. imports of Java classes into a Python namespace 3. references to `type(java_obj)` 4. subclassing of a Java class by a Python class References to Java *objects*, including objects from factory methods; indirect construction, such as `java.util.HashMap()`; and callbacks (Netty, re/json/strptime) would not introduce such references to Python types for Java *classes*. But calling methods, using attributes, or otherwise using getting the Python type on the Java object, would require this entry in class_to_type to be added, potentially to be potentially quickly discarded by generational GC. Computing these entries is relatively expensive, especially if a given class C has inner classes (and this is a general type graph expansion). See also the analysis in https://bugs.openjdk.java.net/browse/JDK-6389107 (scenario 3); our mapping in class_to_type is a common pattern seen in Java systems. What if instead of current one-level cache, we had a two-level cache? Level 1 uses weak key/weak value for the class/type entries (C, T); (C, T) entries that are expired from Level 1 are moved to Level 2; Level 2 uses an expiration time so a ClassLoader could be unloaded in say some reasonable amount of time. To implement: 1. Level 1 cache is a direct replacment of class_to_type with CacheBuilder weak keys/weak values/removal listener. Also we will not attempt to expose it as a Map going forward; adjust class_to_type usage accordingly. 2. Level 2 cache is weak keys/strong values/expires after write. We can tune this expiration as was done with the regex cache and the python.sre.cachespec option, https://github.com/jythontools/jython/blob/master/src/org/python/modules/sre/SRE_STATE.java#L1253). 3. For removalListener: (C, T) entries that are removed from Level 1 will use removalListener to get placed in Level 2 (safe because Level 1 is concurrent; and obviously hard refs would prevent it from being removed while under construction). 4. Try both caches when looking up the type T for class C. Because synchronized, there's no visible inconsistency; if no entry for C because it was removed, simply recompute T'. There is a small window of time that it could be in the process of being moved to Level 2, but this does not affect correctness; (C, T) will be eventually expired from Level 2 and (C, T') is a valid repllacement for it (because there can be no hard refs to T outside the level 2 cache itself). Lastly, https://docs.oracle.com/javase/7/docs/api/java/lang/ClassValue.html looks potentially useful as an alternative for publishing PyType objects into a cache. But we need to address two key questions: 1. Whether ClassValue#computeValue would work properly in the re-entrant case where the class graph from a given class C is explored, as is done for inners, possibly referring back to C. This is why we could not get the publication of the type in the map done after the init (as one would usually want to do!) in https://github.com/jythontools/jython/blob/master/src/org/python/core/PyType.java#L1515 (putIfAbsent); class C would be referenced, and it would attempt to build again (and stack overflow). No solution I tried could break this problem. 2. When to call ClassValue#removeValue ! We still have to keep track of the cache with respect to PyType.
|2017-08-05 19:34:58||zyasoft||set||messageid: <firstname.lastname@example.org>|
|2017-08-05 19:34:58||zyasoft||set||recipients: + zyasoft, jeff.allen, darjus, stefan.richthofer, jamesmudd|
|2017-08-05 19:34:58||zyasoft||link||issue2487 messages|