
Re: (ITS#4390) slapd-ldap crashes with failed assertion



On Mon, 2006-02-06 at 21:22 +0000, ando@sys-net.it wrote:

> > One thing I've found very interesting and revealing is that right before
> > the crash there was an abandon, which could explain why lc_refcnt was
> > not zero.  I'll try to investigate how and when the abandon was issued,
> > to see if it can be reproduced.
> 
> The abandon appears to be issued by the client; it refers to msgid 12,
> which is not present in the logs so we have no idea of what it refers
> to.  One point is that it looks like the client ran multiple concurrent
> operations on the same connection; at some point it might have issued an
> abandon on an early operation (msgid=12 as opposed to msgid=135 of the
> latest) and shut down the connection abruptly, while other operations
> were still running.  Slapd correctly closes the connection, but other
> operations are still using the cached connection and thus lc_refcnt is
> not 0.  Slapd should likely defer connection shutdown until all
> operations in that connection either complete or acknowledge the
> abandon, and in any case they release the cached connection.
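
The deferral suggested in the quoted paragraph can be illustrated with a toy
reference-counted connection. This is only a sketch in Python, not slapd or
back-ldap code: the class CachedConnection and its methods are hypothetical
names, and refcnt merely plays the role of lc_refcnt. The idea shown is that a
close request is recorded immediately, but the actual teardown happens only
when the last operation releases the connection.

```python
import threading


class CachedConnection:
    """Toy stand-in for a proxy's cached connection (hypothetical, not slapd code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refcnt = 0               # plays the role of lc_refcnt
        self.close_requested = False  # set when the client goes away
        self.closed = False           # set when teardown actually happens

    def acquire(self):
        """An operation starts using the cached connection."""
        with self._lock:
            if self.close_requested:
                raise RuntimeError("connection is shutting down")
            self.refcnt += 1

    def release(self):
        """An operation completed, or acknowledged an abandon."""
        with self._lock:
            if self.refcnt <= 0:
                raise RuntimeError("release without matching acquire")
            self.refcnt -= 1
            self._maybe_close()

    def request_close(self):
        """The client shut down; close now only if nobody holds a reference."""
        with self._lock:
            self.close_requested = True
            self._maybe_close()

    def _maybe_close(self):
        # Caller holds the lock. Tear down only once the count reaches zero.
        if self.close_requested and self.refcnt == 0:
            self.closed = True


conn = CachedConnection()
conn.acquire()            # two operations are in flight
conn.acquire()
conn.request_close()      # client disappears: teardown is deferred
print(conn.closed)
conn.release()
conn.release()            # last reference dropped: teardown happens here
print(conn.closed)
```

With this arrangement the count is always zero at the point of teardown, which
is the condition the failed assertion was checking.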

... however, the proxy appears to behave correctly when a client issues
an unbind, whether or not it is preceded by an abandon, while multiple
operations on that connection are still pending; for example:

conn=6 fd=9 ACCEPT from IP=127.0.0.1:52041 (IP=0.0.0.0:9013)
conn=7 fd=10 ACCEPT from IP=127.0.0.1:52042 (IP=0.0.0.0:9013)
conn=6 op=0 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=6 op=1 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=6 op=2 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=6 op=3 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=7 op=0 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=7 op=1 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=7 op=2 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
conn=7 op=3 SRCH base="cn=Sleep,ou=Retcodes,o=example,c=us" scope=2 deref=0 filter="(objectClass=*)"
abandoned ld 0x96850c8 msgid 34
abandoned ld 0x96850c8 msgid 35
abandoned ld 0x96850c8 msgid 36
abandoned ld 0x96850c8 msgid 37
abandoned ld 0x96850c8 msgid 38
abandoned ld 0x96850c8 msgid 39
abandoned ld 0x96850c8 msgid 40
abandoned ld 0x96850c8 msgid 41
conn=7 op=4 UNBIND
conn=6 op=4 ABANDON msg=1
conn=6 op=5 UNBIND
conn=7 fd=10 closed
conn=6 fd=9 closed

The remote server uses an instance of the retcode overlay, so that each
of the searches sleeps for 2 seconds before returning a result.  This
ensures that all the (asynchronous) operations are still pending when
the unbind occurs.
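
The timing of the test described above can be modelled with a small asyncio
sketch. It uses no LDAP library and does not talk to slapd: slow_search and
run_connection are hypothetical names, the sleep stands in for the delay
introduced by the retcode overlay, and cancelling a task stands in for
abandoning a pending operation at unbind time.

```python
import asyncio


async def slow_search(delay):
    # Stand-in for a search that the remote server delays before answering.
    await asyncio.sleep(delay)
    return "result"


async def run_connection(n_ops, delay, unbind_after):
    # Start n_ops searches without waiting for any of them (asynchronous ops).
    tasks = [asyncio.create_task(slow_search(delay)) for _ in range(n_ops)]

    # The client waits a little, then "unbinds".
    await asyncio.sleep(unbind_after)

    # At unbind, every operation that has not finished yet is abandoned.
    pending = [t for t in tasks if not t.done()]
    for t in pending:
        t.cancel()

    # Collect outcomes; cancelled tasks show up as CancelledError instances.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    abandoned = sum(isinstance(r, asyncio.CancelledError) for r in results)
    return len(pending), abandoned


# Four searches that each take 2 seconds, unbind after 0.1 seconds:
# all four are still pending at unbind time and all four are abandoned.
pending, abandoned = asyncio.run(
    run_connection(n_ops=4, delay=2.0, unbind_after=0.1)
)
print(pending, abandoned)
```

Because the delay is much longer than the time before the unbind, no search
can complete first, which is the property the 2-second sleep provides in the
real test.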

p.




Ing. Pierangelo Masarati
Responsabile Open Solution
OpenLDAP Core Team

SysNet s.n.c.
Via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
------------------------------------------
Office:   +39.02.23998309          
Mobile:   +39.333.4963172
Email:    pierangelo.masarati@sys-net.it
------------------------------------------