Re: (ITS#3671) disconnecting a syncrepl consumer deadlocks the provider
On Wednesday 20 April 2005 18:17, Howard Chu wrote:
> email@example.com wrote:
> >Full_Name: Ralf Haferkamp
> >Version: HEAD
> >OS: Linux (Kernel 2.6)
> >URL: ftp://ftp.openldap.org/incoming/
> >Submission from: (NULL) (18.104.22.168)
> >While experimenting a bit with the syncprov overlay of current CVS
> > HEAD I ran across this issue. I configured the consumer as
> > "refreshAndPersist". While adding some entries to the provider I
> > pulled the network plug from the machine running consumer slapd.
> > After a short while the provider is completely locked up; it doesn't
> > answer any requests anymore.
> Hm, that's annoying. Can you get a gdb backtrace of this situation?
I haven't been able to get a useful backtrace yet. (Seems that either my
gdb or my setup is broken here :( ) But I'll try that again.
> At a guess, some number of threads are tied up waiting for the
> syncprov modify lock, and syncprov is tied up waiting for TCP to get
> an ACK from one of its sent responses.
This seems to be correct. I could at least see that syncprov is waiting
in the ldap_pvt_thread_cond_wait() in servers/slapd/result.c.
> The situation should free
> itself up when the TCP timeout expires on the connection.
Yes, I guess so.
> The only
> way to avoid the lockup I suppose is to dup the target entry and
> release the modify lock on it before attempting to send the reply.
> Much more memory intensive, but it may be the only way out. And some
> number of threads will still wind up getting blocked trying to send
> their replies.
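
To make the dup-then-release idea concrete, here is a rough sketch in plain
pthreads C. It is not slapd code: demo_entry, demo_send_fn,
demo_send_with_copy and demo_probe_send are made-up names for illustration,
and a single string stands in for the whole entry. It only shows the
ordering (copy under the lock, unlock, then do the possibly blocking send).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an entry guarded by a modify lock. */
typedef struct {
    pthread_mutex_t modify_lock;
    const char *dn;
} demo_entry;

/* Hypothetical send callback; a real one may block on the socket. */
typedef int (*demo_send_fn)(const char *dn, void *ctx);

/* Copy the data under the lock, release the lock, then send the copy. */
int demo_send_with_copy(demo_entry *e, demo_send_fn send, void *ctx)
{
    char *copy;
    size_t len;
    int rc;

    assert(e != NULL && send != NULL);

    pthread_mutex_lock(&e->modify_lock);
    len = strlen(e->dn) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, e->dn, len);
    pthread_mutex_unlock(&e->modify_lock);

    if (copy == NULL)
        return -1;

    /* The lock is no longer held here, so writers are not stalled
     * even if this call blocks until the TCP timeout. */
    rc = send(copy, ctx);
    free(copy);
    return rc;
}

/* Example callback: returns 0 only if the modify lock was free
 * while the send was in progress. ctx must point at the entry. */
int demo_probe_send(const char *dn, void *ctx)
{
    demo_entry *e = ctx;

    (void)dn;
    if (pthread_mutex_trylock(&e->modify_lock) != 0)
        return 1;
    pthread_mutex_unlock(&e->modify_lock);
    return 0;
}
```

As noted above, this costs an extra copy per reply, and the sending thread
itself can still block; it just no longer holds up everyone waiting for the
lock.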