
(ITS#4493) server hangs because connections are not closed



Full_Name: Andreas Mueller
Version: 2.3.20
OS: Solaris
URL: 
Submission from: (NULL) (193.134.170.35)


We are using slapd as a satellite LDAP server to offload some of the requests
from the main corporate directory. Furthermore, other slapd instances are
running on every Solaris machine as a local cache for LDAP data. These slapd
client instances use syncrepl to update themselves from the central slapd.

After a few days, the central slapd becomes unresponsive. It turns out
that it has used up all of its file descriptors, and most of them are
sockets in state CLOSE_WAIT. Because of the per-process open file limit,
slapd cannot accept(2) any new connections, so the listen queue quickly
fills up and the server starts to refuse connections.
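
For anyone trying to confirm the same symptom, the following is a minimal sketch (not part of slapd) that tallies TCP states from saved `netstat -an` output. The function name `count_tcp_states` and the set of state names are assumptions made for illustration. The script simply looks for a known state keyword on each line, so it does not depend on the exact column layout of a particular netstat.

```python
from collections import Counter

# State keywords as commonly printed by netstat; this list is an assumption
# and may need adjusting for a given platform.
TCP_STATES = {
    "ESTABLISHED", "CLOSE_WAIT", "TIME_WAIT", "FIN_WAIT_1", "FIN_WAIT_2",
    "LAST_ACK", "LISTEN", "SYN_SENT", "SYN_RCVD", "SYN_RECEIVED",
    "CLOSING", "IDLE", "BOUND",
}


def count_tcp_states(netstat_text):
    """Count how many lines of netstat output mention each TCP state."""
    counts = Counter()
    for line in netstat_text.splitlines():
        for token in line.split():
            if token in TCP_STATES:
                counts[token] += 1
                break  # at most one state per line
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): netstat -an | python count_states.py
    for state, n in count_tcp_states(sys.stdin.read()).most_common():
        print(state, n)
```

A CLOSE_WAIT count that keeps growing and approaches the per-process descriptor limit would match the behaviour described above.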

Since CLOSE_WAIT means that the client has closed its side of the connection,
the next action we expect is that slapd closes its side as well. This
obviously does not happen. We then restart the server (stop/start), which
of course fixes the problem, at least for a few days.
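
To illustrate the behaviour we expect, here is a minimal sketch in Python. It is not slapd code (slapd is written in C), and the helper name `drain_and_close` is hypothetical. It shows the general rule: once a read returns zero bytes, the peer has closed its side, and the descriptor is only released when the local side calls close as well. A server that skips that last step leaves the socket in CLOSE_WAIT.

```python
import socket


def drain_and_close(conn):
    """Read from conn until the peer signals EOF, then close our side."""
    received = bytearray()
    try:
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # Empty read: the peer has closed its side of the connection.
                break
            received.extend(chunk)
    finally:
        # Without this close, the descriptor stays allocated; for a TCP
        # socket the connection would remain in CLOSE_WAIT.
        conn.close()
    return bytes(received)


if __name__ == "__main__":
    # A connected socket pair stands in for a client/server connection.
    client, server = socket.socketpair()
    client.sendall(b"ping")
    client.close()
    print(drain_and_close(server))
```

The socket pair used here is a local stand-in, so no CLOSE_WAIT state is actually visible in netstat; the end-of-file handling is the same as for a TCP connection.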

We have this problem in 2.3.12, 2.3.13 and 2.3.20. In fact, we upgraded
in the hope that a newer version would fix the problem. Unfortunately,
we cannot tell whether the connections that are not properly closed
belong to the syncrepl clients or to the other clients.