
Re: (ITS#4429) (back-ldap?) slapd deadlock



On Tue, 2006-03-28 at 22:03 +0000, richton@nbcs.rutgers.edu wrote:
> > What is the TLS-related configuration of back-ldap, and what is the
> > exact sequence of operations you make?
> 
> database ldap
> suffix "dc=rutgers,dc=edu"
> uri ldap://ldap.nbcs.rutgers.edu
> tls propagate
> conn-ttl 30
> 
> This is highly intermittent. My conn-ttl was 30 for quick reproduction and
> there's a watchdog set to 60sec doing SEARCHes, and it doesn't
> consistently fail. The log I have with shortest time to failure had err=52
> on conn=12; I've got other logs with conn=22, 24, and 51. So running it
> twice is unlikely to trigger; I'd sooner wonder about setting up a
> test036-style beating?

I've found two separate issues, both related to concurrency; this could
explain the intermittency of the abnormal behavior you've been
observing.

First, ldap_back_retry() was not sending any result to the client in
those cases where it couldn't retry because lc_refcnt > 1.  This is now
fixed.
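
To illustrate the first issue, here is a minimal sketch, not the actual
bind.c code.  The names retry_conn, retry_reply, retry_send_result and
retry_or_report are hypothetical stand-ins, and the choice of result
code 52 only mirrors the err=52 from the log quoted above; which result
the real fix sends is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real back-ldap structures. */
enum { RETRY_ERR_UNAVAILABLE = 52 };  /* LDAP result code 52 (unavailable) */

struct retry_conn {
    int refcnt;   /* number of operations currently using the connection */
};

struct retry_reply {
    bool sent;    /* has a result been sent to the client? */
    int  err;     /* result code that was sent */
};

static void retry_send_result(struct retry_reply *rs, int err)
{
    rs->err = err;
    rs->sent = true;
}

/*
 * Returns true if the caller may tear down and rebind the connection.
 * When the connection is shared (refcnt > 1) a retry is not possible,
 * so the client is told about the failure instead of being left
 * waiting for a result that never comes.
 */
static bool retry_or_report(struct retry_conn *lc, struct retry_reply *rs)
{
    assert(lc->refcnt > 0);

    if (lc->refcnt == 1) {
        /* Sole user: safe to retry (the rebind itself is omitted here). */
        return true;
    }

    /* Shared connection: cannot retry, but must still answer. */
    retry_send_result(rs, RETRY_ERR_UNAVAILABLE);
    return false;
}
```

The point of the sketch is only that every path out of the retry
routine either retries or sends a result; before the fix, the shared
case did neither.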

Second, even after fixing this we would get into a chicken-and-egg
condition because ldap_back_getconn() would look up an expired
connection and find out it had expired only __after__ increasing
lc_refcnt, and thus be unable to retry it.  I've reworked the code so
that as soon as ldap_back_getconn() finds out that a connection has
expired, it removes it from the avl tree before using it, so that
nobody can look it up any further, and the next request will create a
new one.
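
The reworked lookup can be sketched roughly as follows.  This is an
illustration under stated assumptions, not the code in bind.c: a small
fixed array stands in for the avl tree, and ttl_conn, ttl_table and
ttl_getconn are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define TTL_MAX_CONNS 8

/* Hypothetical stand-in for a cached connection. */
struct ttl_conn {
    int    key;       /* lookup key (stands in for the real comparison) */
    time_t created;   /* when the connection was established */
    int    refcnt;    /* operations currently using the connection */
    bool   in_table;  /* still reachable by lookups? */
};

/* A fixed array stands in for the avl tree. */
struct ttl_table {
    struct ttl_conn *slots[TTL_MAX_CONNS];
    time_t ttl;       /* like conn-ttl, in seconds; 0 disables expiry */
};

/*
 * Look up a connection.  If it has expired, unlink it from the table
 * BEFORE handing it out, so that no later lookup can find it and bump
 * its reference count.  The current caller still gets to use it.
 * Returns NULL when nothing is cached; the caller would then create
 * and insert a fresh connection (omitted).
 */
static struct ttl_conn *ttl_getconn(struct ttl_table *t, int key, time_t now)
{
    assert(t != NULL);

    for (size_t i = 0; i < TTL_MAX_CONNS; i++) {
        struct ttl_conn *lc = t->slots[i];

        if (lc == NULL || lc->key != key)
            continue;

        if (t->ttl > 0 && now - lc->created > t->ttl) {
            t->slots[i] = NULL;     /* nobody can look it up any further */
            lc->in_table = false;
        }

        lc->refcnt++;
        return lc;
    }

    return NULL;
}
```

With a ttl of 30, a connection created at time 100 is returned and left
in the table at time 110; at time 140 it is returned once more but
unlinked, and any lookup after that returns NULL so that a new
connection gets created.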

Please test the fix (it's all in back-ldap/bind.c).

p.




Ing. Pierangelo Masarati
Responsabile Open Solution
OpenLDAP Core Team

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309          
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------