
Re: (ITS#3671) disconnecting a syncrepl consumer deadlocks the provider



On Wednesday 20 April 2005 18:17, Howard Chu wrote:
> rhafer@suse.de wrote:
> >Full_Name: Ralf Haferkamp
> >Version: HEAD
> >OS: Linux (Kernel 2.6)
> >URL: ftp://ftp.openldap.org/incoming/
> >Submission from: (NULL) (212.95.106.108)
> >
> >
> >While experimenting a bit with the syncprov overlay of current CVS
> > HEAD I ran across this issue. I configured the consumer as
> > "refreshAndPersist". While adding some entries to the provider I
> > pulled the network plug from the machine running the consumer slapd.
> > After a short while the provider is completely locked; it doesn't
> > answer any requests anymore.
>
> Hm, that's annoying. Can you get a gdb backtrace of this situation?
I haven't been able to get a useful backtrace yet. (It seems that either
my gdb or my setup is broken here :( ) But I'll try again.

> At a guess, some number of threads are tied up waiting for the
> syncprov modify lock, and syncprov is tied up waiting for TCP to get
> an ACK from one of its sent responses. 
This seems to be correct. I could at least see that syncprov is waiting
in the ldap_pvt_thread_cond_wait() in servers/slapd/result.c.

> The situation should free 
> itself up when the TCP timeout expires on the connection. 
Yes, I guess so.

> The only 
> way to avoid the lockup I suppose is to dup the target entry and
> release the modify lock on it before attempting to send the reply.
> Much more memory intensive, but it may be the only way out. And some
> number of threads will still wind up getting blocked trying to send
> their replies.
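
To make the dup-and-release idea concrete, here is a minimal sketch in
plain C with pthreads. It is not slapd code: demo_entry, demo_snapshot,
demo_send_snapshot() and demo_probe_send() are hypothetical names made up
for illustration, and the mutex only stands in for the modify lock. The
point is just that the lock is held for the copy and dropped before the
(possibly blocking) send.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX 128

/* Hypothetical stand-in for an entry guarded by a modify lock. */
typedef struct {
    pthread_mutex_t lock;
    char dn[DEMO_MAX];
    char value[DEMO_MAX];
} demo_entry;

/* Private copy that is sent without holding any lock. */
typedef struct {
    char dn[DEMO_MAX];
    char value[DEMO_MAX];
} demo_snapshot;

/* Send callback; in a real server this may block on the network. */
typedef int (*demo_send_fn)(const demo_snapshot *snap, void *ctx);

static void demo_entry_init(demo_entry *e, const char *dn, const char *value)
{
    pthread_mutex_init(&e->lock, NULL);
    snprintf(e->dn, sizeof e->dn, "%s", dn);
    snprintf(e->value, sizeof e->value, "%s", value);
}

/* Copy the entry under the lock, release the lock, then send the copy. */
static int demo_send_snapshot(demo_entry *e, demo_send_fn send, void *ctx)
{
    demo_snapshot snap;

    assert(e != NULL && send != NULL);
    pthread_mutex_lock(&e->lock);
    memcpy(snap.dn, e->dn, sizeof snap.dn);
    memcpy(snap.value, e->value, sizeof snap.value);
    pthread_mutex_unlock(&e->lock);

    /* A stalled consumer now ties up only this thread, not the lock. */
    return send(&snap, ctx);
}

/* Example callback: plays the role of a slow send during which a writer
 * modifies the entry. Returns 0 if the lock was free and the snapshot
 * still holds the old value, -1 otherwise. */
static int demo_probe_send(const demo_snapshot *snap, void *ctx)
{
    demo_entry *e = ctx;

    if (pthread_mutex_trylock(&e->lock) != 0)
        return -1; /* lock still held: writers would stall */
    snprintf(e->value, sizeof e->value, "%s", "modified-during-send");
    pthread_mutex_unlock(&e->lock);

    return strcmp(snap->value, e->value) != 0 ? 0 : -1;
}
```

This costs one extra copy per reply, which is the memory overhead
mentioned above, and the sending thread itself can still block until the
connection times out.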

-- 
Ralf