
ITS#3850 hang after deferring : binding



I'm having trouble setting up a test environment to reproduce this 
problem. With a test program that simply binds and unbinds in a tight 
loop, I run out of available sockets after about 60,000 connections. 
Once the sockets run out, all of the test programs exit. And of course, 
no problem manifests during those 60,000 bind/unbind attempts.
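
For anyone who wants to try the same kind of test, here is a minimal 
sketch of a bind/unbind loop. It is not the actual test program; it 
assumes an anonymous LDAPv3 simple Bind sent as raw BER bytes over a 
plain TCP socket, and the function name, host, port, and timeout are 
all hypothetical. Presumably the socket limit mentioned above will show 
up as a failed connect once local ports or descriptors are used up.

```python
import socket

# BER-encoded LDAPv3 anonymous simple Bind request (messageID 1).
BIND_REQUEST = bytes.fromhex("300c020101600702010304008000")
# BER-encoded Unbind request (messageID 2).
UNBIND_REQUEST = bytes.fromhex("30050201024200")


def bind_unbind_loop(host, port, attempts, timeout=5.0):
    """Bind and unbind `attempts` times; return how many Binds got a reply.

    Hypothetical helper for illustration only.
    """
    answered = 0
    for _ in range(attempts):
        try:
            conn = socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            # Running out of local ports or descriptors surfaces here.
            print("connect failed after", answered, "answered binds:", exc)
            break
        with conn:
            try:
                conn.sendall(BIND_REQUEST)
                reply = conn.recv(4096)
            except OSError:
                reply = b""  # timeout or reset: no Bind result arrived
            # Byte 5 of the reply is the protocolOp tag; 0x61 is BindResponse.
            if len(reply) > 5 and reply[5] == 0x61:
                answered += 1
            try:
                conn.sendall(UNBIND_REQUEST)
            except OSError:
                pass  # the server may already have dropped the connection
    return answered
```

A call such as bind_unbind_loop("localhost", 389, 60000) returns the 
number of Binds that were answered, so any shortfall against the 
attempt count points at Binds that never got a result.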

At any rate, the log you've provided shows that some number of Bind 
operations didn't send a result to the client. Unfortunately, I find 
this log suspect because you used syslog instead of an actual debug log 
from stderr, and syslog will drop records when it gets too busy. As a 
general rule, syslog output is useless for debugging; you have to use 
the output from "slapd -d" on stderr when tracing problems.
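
As a rough illustration of checking a log for Binds with no result, 
the sketch below pairs each BIND line with a RESULT line carrying the 
same conn and op numbers. The line shape ("conn=N op=M BIND ..." and 
"conn=N op=M RESULT ...") is an assumption about the stats-level 
output, the sample lines are invented for illustration, and the 
function name is hypothetical.

```python
import re

# Assumed line shape: "... conn=<n> op=<m> BIND ..." / "... RESULT ...".
OP_LINE = re.compile(r"conn=(\d+) op=(\d+) (BIND|RESULT)\b")


def unanswered_binds(lines):
    """Return sorted (conn, op) pairs that logged a BIND but no RESULT."""
    binds = set()
    results = set()
    for line in lines:
        match = OP_LINE.search(line)
        if not match:
            continue  # not an operation line we care about
        key = (int(match.group(1)), int(match.group(2)))
        if match.group(3) == "BIND":
            binds.add(key)
        else:
            results.add(key)
    return sorted(binds - results)


# Invented sample lines, for illustration only.
sample = [
    'conn=7 op=0 BIND dn="cn=test,dc=example,dc=com" method=128',
    'conn=7 op=0 RESULT tag=97 err=0 text=',
    'conn=8 op=0 BIND dn="cn=test,dc=example,dc=com" method=128',
]
print(unanswered_binds(sample))  # [(8, 0)]
```

A check like this only means something on a complete log. Run against 
syslog output, a missing RESULT may just be a dropped record, which is 
exactly why the stderr output is needed.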

It is of course possible that the Bind operations are waiting for 
something else to complete before sending their own result back, and so 
the particular connection really does have an outstanding Bind request 
still on it.

I also note that you're using back-ldbm, which has extremely poor 
locking behavior, and this may be a factor in the Binds failing to 
complete. You should try testing again using back-bdb or back-hdb. If 
you can reproduce the problem, you should attach to the slapd with gdb 
and get a stack trace of all the running threads, so we can see what 
slapd is doing when the hangs occur.
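
One way to collect those stack traces non-interactively is sketched 
below. It assumes gdb is on the PATH and that you have permission to 
attach to the process; the helper names are hypothetical. It uses gdb's 
-p, -batch, and -ex options to run "thread apply all bt" and then 
detach.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps every thread's stack."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_thread_stacks(pid):
    """Attach to `pid`, print all thread backtraces, and return the text.

    Hypothetical helper; the process is paused while gdb is attached.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # a failed attach still leaves gdb's messages to read
    )
    return result.stdout + result.stderr
```

Calling dump_thread_stacks() with the slapd process ID while the hang 
is in progress gives output that can be pasted into the ITS follow-up. 
A slapd built with debugging symbols makes the traces far more useful.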

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/