Re: Strange hang scenario, resumes after idletimeout, but plenty of FDs available



Kartik Subbarao wrote:
On 06/01/2011 02:02 PM, Howard Chu wrote:
Kartik Subbarao wrote:
On 06/01/2011 09:08 AM, Kartik Subbarao wrote:
Update: It's not a locks issue.

Another update -- after I reduced the idletimeout to 60 seconds, the
problem seems to have gone away. It would still be useful to know what
might be causing this problem and to be able to support higher levels of
idletimeout, but at least I have another workable option now.

As usual, when investigating a hang, you should attach to slapd with gdb
and get a snapshot of what all the threads are doing. Working around it
before you know what caused it isn't all that useful in the long run.

Attached is the gdb stack trace from the hang state. It looks like the
threads are stuck in pthread_cond_wait() from send_ldap_ber(). Are
there other relevant variables/structures to inspect for this scenario?

	-Kartik

Also get the output of netstat -nA inet. Threads waiting in send_ldap_ber() mean that their output buffers filled up because the clients didn't read the pending data.
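
To illustrate the general mechanism (a writer stalling because its peer never reads), here is a minimal Python sketch. It is not slapd code: the function name fill_send_buffer and its parameters are hypothetical, and it uses a local socket pair in place of a real LDAP client connection. The writer is set non-blocking so that a full buffer shows up as an error instead of hanging the demo; a writer that waits for the buffer to drain would simply sit there until the peer reads or the connection is closed.

```python
import socket


def fill_send_buffer(chunk_size=4096, max_bytes=64 * 1024 * 1024):
    """Write to a peer that never reads.

    Returns (bytes_queued, stalled). `stalled` is True when the kernel
    refused more data because the socket buffers were full.
    Hypothetical helper for illustration only.
    """
    writer, idle_reader = socket.socketpair()
    # Non-blocking: a full buffer raises BlockingIOError instead of hanging.
    writer.setblocking(False)
    chunk = b"x" * chunk_size
    queued = 0
    stalled = False
    try:
        # max_bytes is only a safety cap so the loop always terminates.
        while queued < max_bytes:
            try:
                queued += writer.send(chunk)
            except BlockingIOError:
                # The peer has read nothing, so there is no room left.
                stalled = True
                break
    finally:
        writer.close()
        idle_reader.close()
    return queued, stalled


if __name__ == "__main__":
    queued, stalled = fill_send_buffer()
    print(f"queued {queued} bytes; writer stalled: {stalled}")
```

The number of bytes accepted before the stall depends on the platform's socket buffer sizes, so no particular figure should be expected. On a live server, data queued like this is what netstat reports in the Send-Q column for each connection.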

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/