Issue1974
Created on 2012-10-02.17:09:22 by mishikal@yahoo.com, last changed 2013-02-19.23:13:30 by fwierzbicki.
| Messages | |||
|---|---|---|---|
| msg7469 (view) | Author: Quanah Gibson-Mount (mishikal@yahoo.com) | Date: 2012-10-02.17:09:21 | |
I recently upgraded from jython 2.5.2 to jython 2.5.3 standalone jars.
After this upgrade, we found that threading is broken in 2.5.3, when a
process tries to join an existing thread.
Logs from 2.5.2:
Oct 1 15:18:27 zre-ldap002 zmconfigd[28096]: Found 3 threads
Oct 1 15:18:27 zre-ldap002 zmconfigd[28096]: Active Thread listener found
Oct 1 15:18:27 zre-ldap002 zmconfigd[28096]: Active Thread Thread found
Oct 1 15:18:27 zre-ldap002 zmconfigd[28096]: Active Thread MainThread found
Oct 1 15:18:27 zre-ldap002 zmconfigd[28096]: Fetching All configs
Logs from 2.5.3:
Oct 1 15:04:06 zre-ldap002 zmconfigd[23938]: Found 3 threads
Oct 1 15:04:06 zre-ldap002 zmconfigd[23938]: Active Thread listener found
Oct 1 15:04:06 zre-ldap002 zmconfigd[23938]: Active Thread Thread-8329
found
Oct 1 15:04:06 zre-ldap002 zmconfigd[23938]: Attempting to join()
Thread-8329
Oct 1 15:04:11 zre-ldap002 zmconfigd[23938]: join() Thread-8329 FAILED
Oct 1 15:04:11 zre-ldap002 zmconfigd[23938]: Hung threads detected (3
total), exiting
This is infinitely repeatable.
Our loop has:
while True:
Log.logMsg (4, "Found %d threads" % threading.activeCount())
for th in threading.enumerate():
Log.logMsg(4, "Active Thread %s found" % th.getName())
if (th.getName() != "listener" and th.getName() != "MainThread" and th.getName() != "Thread" and th.getName != "SIGUSR2 handler"):
Log.logMsg(4, "Attempting to join() %s" % th.getName())
th.join(5);
if (th.isAlive()):
Log.logMsg(1, "join() %s FAILED" % th.getName())
Log.logMsg(1, "Hung threads detected (%d total), exiting" % threading.activeCount());
sys.exit(1)
t1 = time.clock()
try:
# read all the configs
myState.getAllConfigs(myConfig)
except Exception, e:
[Log.logMsg(1,t) for t in traceback.format_tb(sys.exc_info()[2])]
if myState.forced:
Log.logMsg(0, "Key lookup failed.")
Log.logMsg(1, "Sleeping...Key lookup failed (%s)" % (e,))
time.sleep(60)
continue
try:
# read zmconfigd config
myState.getMtaConfig(myConfig.configFile)
# watchdog restarts apps if they are not running
watchdog()
# check for config changes
myState.compareKeys()
Log.logMsg (5, "LOCK myState.lAction requested")
myState.lAction.acquire()
Log.logMsg (5, "LOCK myState.lAction acquired")
myState.compileActions()
myState.requestedconfig = {}
myState.doConfigRewrites()
myState.lAction.notifyAll()
myState.lAction.release()
Log.logMsg (5, "LOCK myState.lAction released")
# executes rewrites/postconf/restarts
if myConfig.restartconfig:
myState.doRestarts()
except Exception, e:
[Log.logMsg(1,t) for t in traceback.format_tb(sys.exc_info()[2])]
if myState.forced and myState.forced < 100:
Log.logMsg(0, "Configuration inconsistency detected (%s)" % (e,))
Log.logMsg(1, "Sleeping...Configuration inconsistency detected (%s)" % (e,));
time.sleep(60)
continue
if myState.forced:
break
# start the listener after we have the lock, or an early request can cause problems
# start the listener after the rewrites, so the start script doesn't return before they're complete
if myState.firstRun and not myState.forced:
requestListener()
Log.logMsg (5, "LOCK myState.lAction released")
myState.firstRun = False
lt = time.clock()-t1
Log.logMsg(4, "Loop completed in %.2f seconds" % (lt,));
Log.logMsg(4, "Sleeping for %d." % (myConfig.interval,));
# Jython won't wake up from time.sleep() when a signal is received and caught.
# Uncaught signals seem to cause the JVM to exit
myState.sleepTimer = myConfig.interval
Log.logMsg (5, "Sleeping for %d" % myState.sleepTimer)
while myState.sleepTimer > 0:
time.sleep(sleepinterval)
myState.sleepTimer -= sleepinterval
Log.logMsg (5, "Waking up")
dt = time.clock()-t0
Log.logMsg(5, "%s completed in %.2f seconds" % (myConfig.progname,dt));
sys.exit(0)
|
|||
| msg7620 (view) | Author: Frank Wierzbicki (fwierzbicki) | Date: 2013-02-05.18:46:28 | |
Hi, I can't currently reproduce this since your "Log" object and "myState" (and maybe others) are not defined in this script. If you could supply the missing parts I would have a better chance of reproducing your issue and looking into it. |
|||
| msg7625 (view) | Author: Quanah Gibson-Mount (mishikal@yahoo.com) | Date: 2013-02-05.23:11:06 | |
You can get the bits we import from: https://github.com/Zimbra-Community/zimbra-sources/tree/master/main/ZimbraServer/src/python/pylibs And a current copy of zmconfigd from: https://github.com/Zimbra-Community/zimbra-sources/blob/master/main/ZimbraServer/src/libexec/zmconfigd |
|||
| msg7665 (view) | Author: Frank Wierzbicki (fwierzbicki) | Date: 2013-02-11.19:57:36 | |
Hi Quanah, Debugging threading issues with large dependencies is a bit tough, is there any way that you could narrow the bug down to a self contained smallish code sample so we can narrow this down? |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2013-02-19 23:13:30 | fwierzbicki | set | resolution: remind versions: + Jython 2.5, - 2.5.3b2 |
| 2013-02-11 19:57:36 | fwierzbicki | set | priority: normal assignee: fwierzbicki messages: + msg7665 |
| 2013-02-05 23:11:06 | mishikal@yahoo.com | set | messages: + msg7625 |
| 2013-02-05 18:46:28 | fwierzbicki | set | nosy:
+ fwierzbicki messages: + msg7620 |
| 2012-10-02 17:09:22 | mishikal@yahoo.com | create | |