Re: 2.3.39 syncrepl lost connection
The syncrepl config on the replicas is for refreshAndPersist and does a retry
every 30 seconds -- so, if the replica knew the connection had dropped, it
should have restarted it.
I had this around 2.3.27. It should be OK now; there was a patch to libldap.
A 2.3.39 replica should know the connection was lost, via the underlying
OS, because it requests SO_KEEPALIVE. I assume these are
(pseudo-?)dedicated servers, given the size of your OpenLDAP installation.
As such, you may want to investigate your kernel tunable parameters to
make keepalives more aggressive.
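For anyone unfamiliar with what requesting SO_KEEPALIVE means at the socket level, here is a minimal Python sketch. It is not OpenLDAP's code; the helper name enable_keepalive is made up for illustration, and the per-socket timer options (TCP_KEEPIDLE and friends) are Linux names that may be missing elsewhere, so the sketch skips any that are absent. Without those per-socket overrides, the kernel-wide tunables decide when probes go out.

```python
import socket


def enable_keepalive(sock, idle=10, interval=5, probes=5):
    """Turn on TCP keepalive for one socket (hypothetical helper)."""
    # Ask the kernel to probe an idle connection instead of trusting it forever.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Optional per-socket timers; skipped on platforms that lack the constant.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A nonzero value means keepalive is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```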
You mention RHEL. If you have a test environment, using iptables to
silently drop traffic on ports 389/636 is a really good way to make sure
the replica notices the dead connection and reconnects in a sane fashion.
Around here, we don't pay by the byte, so we do:
net.ipv4.tcp_keepalive_time = 10
net.ipv4.tcp_keepalive_intvl = 5
net.ipv4.tcp_keepalive_probes = 5
to keep syncrepl in sync. With the CentOS defaults (tcp_keepalive_time is
7200 seconds), we saw 2+ hours before the kernel got a clue.
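As a rough check on those numbers, the time for the kernel to declare an idle, silently dropped connection dead is approximately the idle time plus one probe interval per unanswered probe. The sketch below assumes the stock Linux defaults of 7200/75/9 (I am assuming CentOS ships them unchanged); the function name is made up for illustration.

```python
def keepalive_detection_seconds(time_s, intvl_s, probes):
    """Approximate seconds until an idle dead peer is detected."""
    # Idle period before the first probe, then one interval per unanswered probe.
    return time_s + intvl_s * probes


# Settings from this post: 10 s idle, 5 s between probes, 5 probes.
tuned = keepalive_detection_seconds(10, 5, 5)

# Assumed stock Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
defaults = keepalive_detection_seconds(7200, 75, 9)

print(tuned)     # 35
print(defaults)  # 7875
```

So the tuned values notice a dead link in roughly 35 seconds, versus about 2 hours 11 minutes with the assumed defaults, which lines up with the 2+ hours mentioned above.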