
Replicants get out of sync, how to track down problem



I've got OpenLDAP-2.1.22 running on six machines: one is the master,
which replicates to the other five. They're all on sparc/Solaris.
(The five machines query their own read-only replicas for mail
delivery, freeing the master to take updates.)

Twice in the past 8 months or so, one of the replicas has gotten out
of sync with the master; the other replicas were updated fine.  This
may have been due to a large number of updates going into the master
at one time, but the replicas shouldn't get out of sync from this.
The slurpd.status file had something like this:

  192.168.228.18:389:1098302108:0
  192.168.228.20:389:1098302108:1
  192.168.228.21:389:1098302108:0
  192.168.228.23:389:1098302108:0
  192.168.228.24:389:1098302108:0

with the failing replica having a non-zero final field. I need to find
out why this occurred but don't know where to begin searching.  Can
anyone provide pointers?
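
For spotting this condition early, here is a minimal Python sketch that
scans status lines shaped like the ones above and reports any host whose
final field is non-zero. It assumes only the four colon-separated fields
shown in the sample; the names parse_status and flag_nonzero, and the
tuple labels (host, port, stamp, last), are made up for illustration and
are not taken from slurpd itself.

```python
def parse_status(text):
    """Split each non-empty line into (host, port, stamp, last).

    Assumes four colon-separated fields, as in the sample above.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # rsplit from the right so only the last three colons are used
        host, port, stamp, last = line.rsplit(":", 3)
        entries.append((host, int(port), int(stamp), int(last)))
    return entries


def flag_nonzero(entries):
    """Return the hosts whose final field is not zero."""
    return [host for host, _port, _stamp, last in entries if last != 0]


sample = """\
192.168.228.18:389:1098302108:0
192.168.228.20:389:1098302108:1
192.168.228.21:389:1098302108:0
192.168.228.23:389:1098302108:0
192.168.228.24:389:1098302108:0
"""

print(flag_nonzero(parse_status(sample)))  # prints ['192.168.228.20']
```

Run from cron against the real file, something like this would at least
show when a replica starts to differ from the others, though it says
nothing about why.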

Thanks.