We've noticed hard failures on both our Linux and Mac workstations
when an LDAP server fails in a way which causes it to stop responding
but leave a connection open (e.g. lock contention, disk failure). This
usually ends up requiring the system to be rebooted because a key
system process will probably have made a call which is waiting on a
read() which might take days to fail.
I've created a patch which simply calls setsockopt() to set both
SO_SNDTIMEO and SO_RCVTIMEO on the connection's socket when
LDAP_OPT_NETWORK_TIMEOUT has been set. This appears
to produce the desired result on Linux (both with pam_ldap and the
ldap utilities) and OS X (within the DirectoryService plugin).
Is there a drawback to this approach which I've missed? It appears
that the issue has come up in the past but there's no solution that I
can see (certainly nothing else uses socket-level timeouts). I'd like
to find a solution for this as it's by far the biggest source of Linux
downtime in our environment.