
slapd not cluster aware


I'm running slapd (OpenLDAP 2.0.23) on a Tru64 system with two clusters.
Generally it works fine, but if a cluster with a running slapd shuts down,
the daemon doesn't restart on the other cluster as it should; it only
starts again after quite a long timeout.

We found that sockets in state "TIME_WAIT" are left over from the threads
of the stopped slapd instance. Until the host closes these sockets or they
reach their timeout, slapd cannot restart.

But why are these sockets still open if slapd is killed in the _right manner_?
How can I make sure that slapd kills all connections when it shuts down?
Any ideas?


PS. I asked a similar question on this list some time ago, and the answer
from Thomas Nau led me to find where the problem lies, but I still don't
know how to fix it. I already tried setting "SO_REUSEADDR=1", but this
didn't work. So I hope somebody can give me some advice.

Here is what Thomas wrote:
>seems that the socket is in TIME_WAIT state which doesn't allow you to use
>the IP/port pair for a while (details in Stevens TCP/IP Vol 1). If you
>check the sources (line 473, openldap-2.0.23/servers/slapd/daemon.c) you
>will see
>            /* enable address reuse */
>            tmp = 1;
>            rc = setsockopt( l.sl_sd, SOL_SOCKET, SO_REUSEADDR,
>                     (char *) &tmp, sizeof(tmp) );
>            ...
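
For reference, here is a minimal standalone sketch of the pattern the quoted code follows: the option is set on the listening socket before bind() is called, which is when it takes effect for the address check. This is not slapd code. The helper name open_reusable_listener and the backlog of 5 are assumptions made up for illustration, and the sketch only shows the call order; it does not show what slapd does after the elided "..." above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of slapd): open a TCP listener on the
 * given port with SO_REUSEADDR enabled.
 * Returns the listening socket, or -1 on error. */
int open_reusable_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd < 0)
        return -1;

    /* Enable address reuse. This has to happen before bind(). */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR,
                   (char *) &on, sizeof(on)) < 0) {
        close(sd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Bind and start listening; the backlog of 5 is an arbitrary choice. */
    if (bind(sd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(sd, 5) < 0) {
        close(sd);
        return -1;
    }

    return sd;
}
```

A caller would use it as sd = open_reusable_listener(389) and check for -1. Calling getsockopt() on the returned socket with SO_REUSEADDR is one way to confirm that the option is really set on the socket the daemon listens on.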