
Re: syncrepl only working in one direction


On 24/09/2010 07:31, Alister Forbes wrote:
Hi Jonathan,

On 23 Sep 2010, at 15:24, Jonathan CLARKE wrote:

Hello Alister,

On 23/09/2010 12:04, Alister Forbes wrote:

I have two identical servers (RHEL-based VMs, server1 and server3)
running OpenLDAP 2.4.23,

built with these options:

--with-tls --prefix=/etc/operator/openldap --enable-syncprov
--enable-syslog --enable-crypt -

I have the strangest problem, and am desperate for any insight you
might provide.

If I make a change on server3, everything is fine and the change is
replicated to server1. If I make a change on server1, then server1
changes, but no change is seen on server3.

Looking at the logs on server1, and using tcpdump to sniff the
connection, I can see that when a change is made on server1, it
doesn't even attempt to contact server3.

As far as I can tell the configs are identical, and I have no clue
what's causing this. Any hint at all would be gratefully accepted.
Configs from both machines are attached: server1 and server3 (output
of ldapsearch on cn=config). Also attached are logs (olcLogLevel is
Sync) of the results when I change a value (olcLogLevel) on the two
servers (change-on-server1 and change-on-server3).
I note several things:

The retry value of your syncrepl statements is set so that only a limited number of retries will occur. It is possible that (during some downtime) slapd has used up all these retries, and given up on a particular syncrepl consumer. A restart of slapd should solve this.
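To illustrate the point about retries, here is a minimal Python sketch of how a syncrepl retry value is read: pairs of <interval in seconds> <number of retries>, where "+" as the count means retry indefinitely. The function names and the sample values are hypothetical and are not taken from the attached configs.

```python
def parse_retry(spec):
    """Parse a retry value such as "60 10 300 3" into (interval, count) pairs.

    A count of "+" is returned as None, meaning "retry indefinitely".
    """
    tokens = spec.split()
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError("retry must be pairs of <interval> <count>")
    pairs = []
    for interval, count in zip(tokens[0::2], tokens[1::2]):
        pairs.append((int(interval), None if count == "+" else int(count)))
    return pairs


def retry_budget_seconds(spec):
    """Return the total seconds of retrying before the consumer gives up,
    or None if the consumer never gives up."""
    total = 0
    for interval, count in parse_retry(spec):
        if count is None:
            return None
        total += interval * count
    return total


# Hypothetical values, for illustration only.
print(retry_budget_seconds("60 10 300 3"))  # 1500: gives up after 25 minutes
print(retry_budget_seconds("60 10 300 +"))  # None: keeps retrying every 300 s
```

A retry value whose last count is "+" avoids the "gave up during downtime" situation altogether.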

Looking at the logs, it seems that server3 at least is confused as to who is who, since it is sending out the change to both server1 and itself (and then dismissing it with "CSN too old, ignoring").

However, since one of your changes is to change the log level to "stats", therefore excluding "sync", it's unclear how trustworthy these logs are...

I suggest starting over: restart both instances of slapd with -c rid=001 -c rid=003, to reset the replication status, and take it from there.

Hope this helps,
Thanks very much for this. I should have been clearer in my original mail: although I did make changes to olcLogLevel in the ldapmodify commands, at the beginning of each command olcLogLevel was always set to Sync.

I did restart with the -c options, but I'm still seeing exactly the same behaviour.

Looking at my configs again, I still see only one contextCSN on server3, and two on server1.

Any suggestions?
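As an aside on the contextCSN observation: in a multi-master setup there is normally one contextCSN value per server ID that has originated changes, and a CSN has the form <timestamp>#<count>#<sid>#<mod>, with the last three fields in hexadecimal. A rough Python sketch for comparing which server IDs each side knows about might look like this. The CSN values and the server IDs 1 and 3 below are made up for illustration and are not taken from the attached configs.

```python
from collections import namedtuple

CSN = namedtuple("CSN", "timestamp count sid mod")


def parse_csn(value):
    """Split a CSN string into its four '#'-separated fields."""
    parts = value.split("#")
    if len(parts) != 4:
        raise ValueError("unexpected CSN format: %r" % value)
    timestamp, count, sid, mod = parts
    # count, sid and mod are hexadecimal fields.
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def sids_seen(context_csns):
    """Return the sorted server IDs present in a list of contextCSN values."""
    return sorted({parse_csn(v).sid for v in context_csns})


# Hypothetical contextCSN values, for illustration only.
server1 = [
    "20100924053100.000000Z#000000#001#000000",
    "20100924053000.000000Z#000000#003#000000",
]
server3 = [
    "20100924053000.000000Z#000000#003#000000",
]

# Server IDs whose changes server1 has recorded but server3 has not.
missing_on_server3 = sorted(set(sids_seen(server1)) - set(sids_seen(server3)))
print(missing_on_server3)  # [1]
```

With values like these, server3 has never recorded a change originating from server ID 1, which would match replication working in only one direction.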

In this case, I suspect something is wrong with your DNS/IP setup. slapd identifies itself against the values in olcServerID by checking the host's FQDN (see the output of hostname --fqdn) and the hostnames in the -h option passed on startup.

Make sure your /etc/hosts contains sensible values for all names and IPs involved, and that you're running both slapds with something like -h ldap://serverN.example.com/.
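To make the idea concrete, here is a deliberately simplified Python sketch of that kind of matching: each olcServerID value is an "<id> <url>" pair, and the server picks the ID whose URL corresponds to one of its own listener URLs, with the local FQDN standing in for a listener that has no host. This is not slapd's actual code, which handles more cases; the function names, hostnames and decimal IDs are assumptions for illustration.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def normalize(url):
    """Reduce an LDAP URL to a comparable (scheme, host, port) tuple."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, host, port)


def pick_server_id(server_ids, listener_urls, fqdn):
    """Return the ID whose URL matches one of our listeners, else None.

    server_ids: strings like "1 ldap://server1.example.com" (decimal IDs assumed).
    listener_urls: the URLs passed to slapd with -h.
    fqdn: the local hostname, e.g. the output of `hostname --fqdn`.
    """
    listeners = set()
    for url in listener_urls:
        scheme, host, port = normalize(url)
        # Simplification: a listener without a host counts as the local FQDN.
        listeners.add((scheme, host or fqdn.lower(), port))
    for entry in server_ids:
        sid, url = entry.split(None, 1)
        if normalize(url) in listeners:
            return int(sid)
    return None


# Hypothetical values, for illustration only.
ids = ["1 ldap://server1.example.com", "3 ldap://server3.example.com"]

# Explicit listener URL that matches an olcServerID entry.
print(pick_server_id(ids, ["ldap://server3.example.com/"], "server3.example.com"))  # 3

# Host-less listener on a machine whose FQDN is misconfigured: nothing matches.
print(pick_server_id(ids, ["ldap:///"], "localhost.localdomain"))  # None
```

The second call shows how a wrong FQDN or an unspecific -h value could leave a server unable to tell which olcServerID entry refers to itself.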

If this still fails, maybe post a log excerpt from slapd startup with log levels config and sync?


Jonathan CLARKE
44 rue Cauchy, 94110 Arcueil, France
Telephone:  +33 (0)1 83 62 26 96
Web:        http://www.normation.com/