Re: (ITS#5354) slapd repeatedly hangs and stops reponding




Russ Allbery wrote:
> Oren Laadan <orenl@cs.columbia.edu> writes:
>> Russ Allbery wrote:
>>> orenl@cs.columbia.edu writes:
> 
>>>> Our configuration uses both BDB and META back-ends; as it turns out
>>>> the standard debian package of LDAP fails to run with the META
>>>> back-end configured (see also complaints here:
>>>> http://www.openldap.org/lists/openldap-bugs/200705/msg00011.html and
>>>> here: http://arkiv.netbsd.se/?ml=OpenLDAP-bugs&a=2007-02&t=3076794).
> 
>>> Yeah, we're still trying to figure out the best way to fix that.
>>> libtool has some serious problems with how it handles namespaces when
>>> dynamically loading modules, and fixing one set of problems creates a
>>> separate set of problems.  (This particular problem is Debian-specific
>>> and not a problem with the upstream OpenLDAP.)
>> :(
>>
>> Well, the good news is that it isn't a show-stopper, and the simple
>> workaround is to compile the code from scratch.
> 
> Specifically, the workaround is to compile the code with the upstream
> libtool instead of with Debian's libtool, since upstream libtool imports
> all module symbols into the global namespace.  Debian's libtool has been
> modified to not do this because it causes all sorts of other problems in
> the general case, but not doing this breaks the meta backend because it
> wants to reference symbols from the bdb backend.
> 
> Steve had an ugly hack to work around this by linking the meta backend
> against the bdb backend.  Doing that isn't the ugly part -- that's
> actually formally correct and really is what libtool should be doing in
> the first place.  The ugly part is that libtool *really* doesn't like
> linking against something that doesn't start with lib*.
> 
>> What's really bothering me, is that the system repeatedly and frequently
>> hangs with my configuration ... (my users aren't happy with me
>> recently).  Do you have any idea how to investigate that ?
> 
> Not beyond the standard advice to investigate what the server is doing
> during a hang using gdb attach or strace.

Done that. I couldn't find anything interesting: some threads were
waiting in some POSIX-threads related place (most of them were in
a thread join, if I recall correctly).

As I'm not familiar with the LDAP implementation, it was hard to just
dive in without prior knowledge.

Note that the problem becomes more acute when I set "idle_timeout"
to 30 seconds; the system is much more tolerant when it isn't set.

> 
> We have yet to have a chance to do intensive testing of 2.4.7.  (Debian
> testing really is a testing distribution that doesn't have production
> stability.)
> 

I observed the exact same behavior on 2.3.9 (latest stable), which I
compiled from scratch; this is actually the reason I moved to 2.4.7
(I hoped it would be better).

Is there someone who can work with me on getting to the root cause
and, hopefully, getting it fixed?

Oren.