Issue2517

classification
Title: Showers of stack dumps from netty finalization
Type: crash Severity: normal
Components: Versions: Jython 2.7
Milestone:
process
Status: open Resolution: accepted
Dependencies: Superseder:
Assigned To: jeff.allen Nosy List: amak, jeff.allen, zyasoft
Priority: high Keywords: test failure causes

Created on 2016-08-31.06:51:17 by jeff.allen, last changed 2017-03-18.15:07:24 by jeff.allen.

Messages
msg10924 (view) Author: Jeff Allen (jeff.allen) Date: 2016-08-31.06:51:15
Observed on Windows regrtest, I get a lot of these (below) after test_socket_jy. It's not actually a test failure, but seems to indicate we have something wrong.

    [exec] Aug 31, 2016 7:27:46 AM io.netty.util.concurrent.DefaultPromise safeExecute
     [exec] SEVERE: Failed to submit a listener notification task. Event loop shut down?
     [exec] java.util.concurrent.RejectedExecutionException: event executor terminated
     [exec]     at io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:840)
     [exec]     at io.netty.util.concurrent.SingleThreadEventExecutor.offerTask(SingleThreadEventExecutor.java:342)
     [exec]     at io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:335)
     [exec]     at io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:765)
     [exec]     at io.netty.util.concurrent.DefaultPromise.safeExecute(DefaultPromise.java:767)
     [exec]     at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:435)
     [exec]     at io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:170)
     [exec]     at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)
     [exec]     at io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)
     [exec]     at sun.reflect.GeneratedMethodAccessor332.invoke(Unknown Source)
     [exec]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     [exec]     at java.lang.reflect.Method.invoke(Method.java:606)
     [exec]     at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:186)
     [exec]     at org.python.core.PyReflectedFunction.__call__(PyReflectedFunction.java:204)
     [exec]     at org.python.core.PyObject.__call__(PyObject.java:497)
     [exec]     at org.python.core.PyObject.__call__(PyObject.java:501)
     [exec]     at org.python.core.PyMethod.__call__(PyMethod.java:141)
     [exec]     at _socket$py.close$89(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py:1092)
     [exec]     at _socket$py.call_function(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py)
     [exec]     at org.python.core.PyTableCode.call(PyTableCode.java:167)
     [exec]     at org.python.core.PyBaseCode.call(PyBaseCode.java:139)
     [exec]     at org.python.core.PyFunction.__call__(PyFunction.java:413)
     [exec]     at org.python.core.PyMethod.__call__(PyMethod.java:126)
     [exec]     at _socket$py.close$127(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py:1501)
     [exec]     at _socket$py.call_function(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py)
     [exec]     at org.python.core.PyTableCode.call(PyTableCode.java:167)
     [exec]     at org.python.core.PyBaseCode.call(PyBaseCode.java:139)
     [exec]     at org.python.core.PyFunction.__call__(PyFunction.java:413)
     [exec]     at org.python.core.PyMethod.__call__(PyMethod.java:126)
     [exec]     at _socket$py.__del__$129(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py:1514)
     [exec]     at _socket$py.call_function(C:\Users\Jeff\Documents\Eclipse\jython-trunk\dist\Lib\_socket.py)
     [exec]     at org.python.core.PyTableCode.call(PyTableCode.java:167)
     [exec]     at org.python.core.PyBaseCode.call(PyBaseCode.java:139)
     [exec]     at org.python.core.PyFunction.__call__(PyFunction.java:413)
     [exec]     at org.python.core.PyMethod.__call__(PyMethod.java:126)
     [exec]     at org.python.core.PyMethod.__call__(PyMethod.java:117)
     [exec]     at org.python.core.PyObjectDerived.__del_derived__(PyObjectDerived.java:24)
     [exec]     at org.python.core.finalization.FinalizeTrigger.runFinalizer(FinalizeTrigger.java:82)
     [exec]     at org.python.core.finalization.FinalizeTrigger.performFinalization(FinalizeTrigger.java:168)
     [exec]     at org.python.core.finalization.FinalizeTrigger.finalize(FinalizeTrigger.java:207)
     [exec]     at java.lang.System$2.invokeFinalize(System.java:1210)
     [exec]     at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98)
     [exec]     at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
     [exec]     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)
msg10926 (view) Author: Jim Baker (zyasoft) Date: 2016-09-02.18:25:18
This is a side effect of our upgrade to Netty 4.1 - before Netty dropped these notifications.

We just need to setup the appropriate listener to quiet. To be researched what that is, but we should get this done for the RC - random stuff to stderr has impact on usage.
msg10928 (view) Author: Jim Baker (zyasoft) Date: 2016-09-02.20:24:46
The relevant Netty change is https://github.com/netty/netty/issues/2166, which happened in 4.0.16; but most likely 4.1 made this more prominent. My recollection is that I don't believe it's possible to have Python socket closing semantics without attempting to close the pool as soon as possible - as it is, we wait up to 100ms. See the _realsocket._finish_closing method in Lib/socket.py for more on what is being done to close the socket.

Therefore we need to take advantage of the rejected logger, which is under the name io.netty.util.concurrent.DefaultPromise.rejectedExecution; see https://github.com/netty/netty/commit/2ba00cd358de2f4fbfef12b0be63605d49826f1a

Assuming SL4J + Logback is still being used, this advice on StackOverflow should be relevant: http://stackoverflow.com/a/11852808/423006. In particular we can put a suitably configured logback.xml in jython.jar or jython-dev.jar to remove this logging: http://logback.qos.ch/manual/configuration.html#auto_configuration
msg10941 (view) Author: Jim Baker (zyasoft) Date: 2016-09-06.05:21:57
I verified by looking at the source of Netty that logback is used. Now it's just a question of implementing the appropriate silencing.
msg11230 (view) Author: Jeff Allen (jeff.allen) Date: 2017-03-15.09:09:36
I'll take a look at this. Following Jim's advice (silence the messages) as a pragmatic answer, but also on the lookout for a root cause. It seems to be that we are giving work to a worker we already retired, but quite why I can't see.
msg11240 (view) Author: Jeff Allen (jeff.allen) Date: 2017-03-18.15:07:23
Not logback (which seems an excellent library BTW) but plain ol' java.util.logging. Netty adapts to what's available https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/internal/logging/InternalLoggerFactory.java#L19 .

I'm going to quiet the stack dumps I observe in test_socket only, treating them for now as a side effect of the way we keep making and tearing things up when testing. There are two loggers: one producing at SEVERE, the other at WARNING, which I'll set to OFF and SEVERE respectively.

I suspect there are real causes here we could fix, along with the hang I regularly experience, but I'd need to know more about Netty and sockets -- which I'd like to.
History
Date User Action Args
2017-03-18 15:07:24jeff.allensetmessages: + msg11240
2017-03-15 09:09:36jeff.allensetassignee: jeff.allen
messages: + msg11230
2017-02-27 04:45:33zyasoftsetpriority: urgent -> high
assignee: zyasoft -> (no value)
2016-09-06 05:21:58zyasoftsetmessages: + msg10941
2016-09-02 20:24:47zyasoftsetmessages: + msg10928
2016-09-02 18:25:18zyasoftsetpriority: normal -> urgent
assignee: zyasoft
resolution: accepted
messages: + msg10926
nosy: + zyasoft
2016-09-02 16:11:31amaksetnosy: + amak
2016-08-31 06:51:17jeff.allencreate