Issue2224

classification
Title: Attributes attached to PyObjects via weak hash maps break if PyObject is resurrected
Type: behaviour Severity: urgent
Components: Core Versions: Jython 2.7
Milestone:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: stefan.richthofer
Priority: Keywords:

Created on 2014-10-23.16:31:53 by stefan.richthofer, last changed 2015-02-11.17:09:33 by zyasoft.

Files
File name Uploaded Description Edit Remove
JyAttribute.java stefan.richthofer, 2014-10-23.16:31:52 Draft of the attribute support
test_Jyattr.py stefan.richthofer, 2014-10-23.22:21:48 unit tests demonstrating the issue for id and weakref
Messages
msg9164 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-23.16:31:51
There are several scenarios in Jython where one needs to attach some attribute to a given arbitrary PyObject. Typically, only few PyObjects will get this attribute attached, so adding a new field to PyObject for this attribute would not be worth the increased memory footprint of all PyObjects.
What is frequently done in Jython in such cases, is to maintain a HashMap that maps PyObjects to this attribute. Of course this shall not prevent gc from collecting the mapped PyObjects, so this is usually implemented as a weak hashMap.
However, this hashMap breaks if a PyObject is resurrected in its finalizer (Please no "Resurrection is evil anyway"-comments, I did not invent that Python allows this, but we have to deal with it!).

Examples are
- ids as returned by Py.id()
- List of WeakReferences to a PyObject (in weakref module)
- handles to native objects in JyNI
(I'm sure there are more...)

This is an inherent problem, as weak hashMaps and resurrection simply don't go together. And there is no easy work-around.

So I propose to refine this common pattern and instead add a general purpose (linked) list of attributes to PyObject.
Supported attributes get reserved id-fields, ordering them by priority and allowing a fail-fast lookup. I propose a linked list for this, because there will be only few attributes per object and typically, they will be accessed seldomly. So a linked list is a sufficient data-structure with minimal overhead.

In addition to the above mentioned examples, also FinalizeTriggers can be stored in this list, amortizing the memory of the additional field for all PyInstance objects and their new style equivalents (as the  finalizeTrigger-field becomes obsolete). If memory matters, one can also include javaProxy as always-on-top element. Then there is not even need for an additional field in PyObject as the javaProxy field can be refined to be the top attribute, which would hardly increase lookup time for this field.

Benefits:
- solves the breaking attributes problem in resurrection case
- would make weak hash maps obsolete in several places
  - saves memory and lookup-time
  - no more threads like the refReaper thread in the weakref module
- simplifies implementation of Finalizable PyObjects
- simplifies implementation wherever one needs to attach attributes to PyObjects


Note: In the attached draft-implementation, I included javaProxy as possible attribute, but this is of course not strictly necessary - lets decide this together.

Note2: I selected urgent severity since it is a major refinement to the internal API, so I recommend this to be done before 2.7.0 is released.
msg9165 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-23.22:21:48
I pinned an id-related example and a weakref example demonstrating the issue in a unit test. The appended test runs fine in CPython while both tests fail in Jython (cloned from trunk on 22.10.).
msg9166 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-23.22:27:40
Maybe I should clarify why I have high interest in safe and clean resurrection handling. This is needed for proper native gc emulation in JyNI. I won't go into details here, but I can't implement gc support in JyNI while these bugs persist.
msg9168 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-25.21:29:17
I implemented the described solution in

https://bitbucket.org/stewori/jynithon/commits/3e69cdc11ec7f94c506b1cfcffbbdaabd7bb60fe

and it works beautifully. However, some of the promised benefits don't applY :(


- no more threads like the refReaper thread in the weakref module
The thread is still necessary, because it triggers callbacks. When I mentioned its obsoleteness, I had in mind it would only do clean-up.

- would make weak hash maps obsolete in several places
The weak hash map in IdImpl is still needed as it assures the id depends on the javaProxy. Indeed, PyObjects with same javaProxys shall be considered equal (c.f. PyObject._is) and have equal ids.

I noticed that CPython (tested with 2.7.5) sometimes also changes ids on resurrection and also breaks weakref.getweakrefs(). Both is the case, if cyclic garbage was collected; in contrast id and getweakrefs are preserved, if garbage is cleared by ref-counts dropping to 0.

I would consider this a bug in CPython and still think it should be fixed the way I proposed.
msg9169 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-26.00:56:37
Okay, the mentioned behavior is no bug in CPython. Since its gc does not call finalizers of objects in ref-cycles, there can't happen resurrection. My test then accidently used the last resurrected object and took it for the one I expected in cyclic case.
With all this fixed, CPython behaves fine.

Further, my implemented solution appears to fail some regr tests. I think the issue is related to serialization support. I hope to solve it by allowing certain attributes to be transient. Soon more on this.
msg9174 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-28.03:40:47
regrtests are fixed now:
https://bitbucket.org/stewori/jynithon/commits/9405afa1c1c9896fb4e19883cf195c94dd61b8be

(The 26 currently failing tests also failed before the patch and appear not to be related) I noticed I forgot to add two files to hg in the former changeset. So it would not have been usable at all. This is also fixed now.

The current solution preserves the result of weakref.getweakrefs(...) in resurrection cases, even in a resurrection case where CPython does not. However I see no reason to clear this list artificially as long as the referent is actually alive.
So there is a test in test_resurrection_attr_preserve.py that CPython fails: test_weakref_after_self_resurrection

Keep in mind that this is about getting the referrers, not the referent. The referent is still not preserved and doing so would be much harder and potentially expensive.
msg9175 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-28.15:09:35
I realized that is is inconsistent to preserve the referrer's list, but not the referents. The list would then list references that are actually invalid. So I reverted GlobalRef back to original state for now:

https://bitbucket.org/stewori/jynithon/commits/d8c610a8b92dc74cc284f5e1921f411bc42f283d

However I have plans to support preserving the referents too. At least optionally by a gc flag since it will be probably a little expensive, as gc would actively have to check for resurrection and possibly restore weakrefs. For this future work, GlobalRef will need the attribute-variant again, so I did not remove the corresponding id in JyAttribute, but marked it as a future feature.
msg9176 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2014-10-28.15:54:30
BTW, the new behavior is tested in
Lib/test/test_resurrection_attr_preserve.py
msg9513 (view) Author: Stefan Richthofer (stefan.richthofer) Date: 2015-02-11.04:57:40
Was fixed with commit 3536cd7b2657.
Can be closed.
History
Date User Action Args
2015-02-11 17:09:33zyasoftsetresolution: fixed
2015-02-11 17:09:25zyasoftsetstatus: open -> closed
2015-02-11 04:57:41stefan.richthofersetmessages: + msg9513
2014-10-28 15:54:30stefan.richthofersetmessages: + msg9176
2014-10-28 15:09:35stefan.richthofersetmessages: + msg9175
2014-10-28 03:40:48stefan.richthofersetmessages: + msg9174
2014-10-26 00:56:38stefan.richthofersetmessages: + msg9169
2014-10-25 21:29:18stefan.richthofersetmessages: + msg9168
2014-10-23 22:27:40stefan.richthofersetmessages: + msg9166
2014-10-23 22:21:49stefan.richthofersetfiles: + test_Jyattr.py
messages: + msg9165
2014-10-23 16:31:53stefan.richthofercreate