
RE: multi-master syncrepl issue



Hi All,

In addition to the questions below (which I'd still like to see an answer to), I have a further related question.

I now have 4 machines set up for multi-master replication of this directory, and the contextCSNs were all in sync last week.
One of the machines was taken down for maintenance for a couple of days and has only just come back up, so its contextCSN is way behind
(20120818093702.01462Z#000000#001#000000 compared to 20120821130410.679339Z#000000#001#000000 as of now). 
This machine has now been up and running for a few hours, but there's no sign of replication catching up.
Should LDAP replication sort itself out automatically and bring this machine's directory back in line, or do I need to do something manually to force it?
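
To put a number on the gap between the two contextCSN values quoted above, here is a minimal sketch in Python. It is illustrative only: the names parse_csn and csn_lag are made up for this example, and it assumes the usual timestamp#count#sid#mod layout of a CSN, with the last three fields in hexadecimal.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: datetime  # time of the change (UTC)
    count: int           # second field of the CSN
    sid: int             # server ID of the master that made the change
    mod: int             # fourth field of the CSN


def parse_csn(value: str) -> CSN:
    """Split a contextCSN string of the form timestamp#count#sid#mod."""
    stamp, count, sid, mod = value.strip().split("#")
    # %f accepts one to six fractional digits, so a short fraction still parses.
    timestamp = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    # The three trailing fields are treated as hexadecimal numbers.
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def csn_lag(behind: str, ahead: str) -> timedelta:
    """Return how far 'behind' trails 'ahead' for the same server ID."""
    old, new = parse_csn(behind), parse_csn(ahead)
    if old.sid != new.sid:
        raise ValueError("CSNs carry different server IDs and are not comparable")
    return new.timestamp - old.timestamp


if __name__ == "__main__":
    stale = "20120818093702.01462Z#000000#001#000000"
    current = "20120821130410.679339Z#000000#001#000000"
    print("lag for server ID 1:", csn_lag(stale, current))
```

For the two values above this reports a lag of a little over three days, which matches the time the machine was down.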

Chris

From: ctcard@hotmail.com
To: openldap-technical@openldap.org
Subject: multi-master syncrepl issue
Date: Mon, 6 Aug 2012 11:53:53 +0000

Hi All,

I have a multi-master OpenLDAP setup with 2 machines replicating a directory containing about 3.5 million entries.

I'm running OpenLDAP 2.4.31 on CentOS 6, and the directory is using the BDB backend.

Although the 2 machines are configured for multi-master syncrepl replication, in practice data is only written to one of the machines (I'll call it the master), 
and the second machine (which I'll call the slave) only gets data written by openldap replication.

Currently the contextCSN of the directory is the same on both machines, which (as I understand it) should mean that the directories are in sync. However, I have written a program
to compare the contents of both directories, and it finds 16 entries in the master directory that are not in the slave directory. I have double-checked this
using ldapsearch on both directories.
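
The comparison is along the lines of the following sketch. This is a simplified illustration in Python rather than the actual program: it assumes the DNs of each server have already been dumped to LDIF (for example with ldapsearch -LLL and the attribute list 1.1, which returns DNs only), the function names read_dns and missing_from are hypothetical, and lower-casing the DNs is a crude normalisation that would be wrong for case-sensitive naming attributes.

```python
import base64


def read_dns(ldif_text: str) -> set:
    """Return the set of lower-cased DNs found in a piece of LDIF text."""
    # Unfold first: an LDIF continuation line starts with a single space.
    unfolded = ldif_text.replace("\r\n", "\n").replace("\n ", "")
    dns = set()
    for line in unfolded.split("\n"):
        if line.startswith("dn::"):
            # A double colon means the value is base64-encoded.
            dn = base64.b64decode(line[4:].strip()).decode("utf-8")
        elif line.startswith("dn:"):
            dn = line[3:]
        else:
            continue
        dns.add(dn.strip().lower())
    return dns


def missing_from(master_ldif: str, slave_ldif: str) -> list:
    """List the DNs present in the master dump but absent from the slave dump."""
    return sorted(read_dns(master_ldif) - read_dns(slave_ldif))


if __name__ == "__main__":
    # Tiny made-up dumps standing in for the real ldapsearch output.
    master = "dn: uid=a,dc=example,dc=com\n\ndn: uid=b,dc=example,\n dc=com\n\n"
    slave = "dn: uid=a,dc=example,dc=com\n\n"
    for dn in missing_from(master, slave):
        print("missing on slave:", dn)
```

With 3.5 million entries the two DN sets still fit comfortably in memory, so a plain set difference is enough and no sorting or streaming merge is needed.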

I can't see any error messages in the OpenLDAP log, and there doesn't appear to be any pattern connecting the entries which are missing from the slave. Most of the missing entries were
already in the master directory before I created the slave machine and configured replication, and have not changed since.

The syncrepl config looks like this:

dn: olcDatabase={1}bdb,cn=config
olcSyncrepl: {0}rid=101 provider="ldap://<master>:389" binddn="<binddn>" bindmethod=simple credentials=<bindpw> searchbase="<prefix>" type=refreshAndPersist retry="5 5 300 5" timeout=1
olcSyncrepl: {1}rid=110 provider="ldap://<slave>:389" binddn="<binddn>" bindmethod=simple credentials=<bindpw> searchbase="<prefix>" type=refreshAndPersist retry="5 5 300 5" timeout=1

Are there any known issues with OpenLDAP replication which could result in missing data?

How can I force these missing entries to appear in the slave without rebuilding the whole of the slave directory and without changing the data in the master directory?

Chris