
Re: syncrepl only working in one direction



Hi Jonathan,

On 23 Sep 2010, at 15:24, Jonathan CLARKE wrote:

> Hello Alister,
> 
> On 23/09/2010 12:04, Alister Forbes wrote:
>> All,
>> 
>> I have two identical servers (RHEL-based VMs, server1 and server3)
>> running OpenLDAP 2.4.23.
>> 
>> Both were built with these options:
>> 
>> --with-tls --prefix=/etc/operator/openldap --enable-syncprov
>> --enable-syslog --enable-crypt -
>> 
>> I have the strangest problem, and am desperate for any insight you
>> might provide.
>> 
>> If I make a change on server3, then everything is fine, and the
>> change is replicated to server1. If I make a change on server1, then
>> server1 changes, but no changes are seen on server3.
>> 
>> Looking at the logs on server1, and using tcpdump to sniff the
>> connection, I can see that when a change is made on server1, it
>> doesn't even attempt to contact server3.
>> 
>> As far as I can tell the configs are identical, and I have no clue
>> what's causing this. Any hint at all would be gratefully accepted.
>> Configs from both machines are attached as server1 and server3 (output
>> of ldapsearch on cn=config). Also attached are logs (olcLogLevel is
>> Sync) of the results when I change a value (olcLogLevel) on the two
>> servers (change-on-server1 and change-on-server3).
> 
> I note several things:
> 
> The retry value of your syncrepl statements is set so that only a limited number of retries will occur. It is possible that (during some downtime) slapd has used up all these retries, and given up on a particular syncrepl consumer. A restart of slapd should solve this.
> 
> Looking at the logs, it seems that server3 at least is confused as to who is who, since it is sending out the change to both server1 and itself (and then dismissing it with "CSN too old, ignoring").
> 
> However, since one of your changes sets the log level to "stats", which excludes "sync", it's unclear how trustworthy these logs are...
> 
> I suggest starting over: restart both instances of slapd with -c rid=001 -c rid=003, to reset the replication status, and take it from there.
> 
> Hope this helps,
> Jonathan
> -- 

Thanks very much for this. I should have been clearer in my original mail: although the ldapmodify commands did change olcLogLevel, it was always set to Sync at the beginning of each command.

I did restart with the -c options, but I'm still seeing exactly the same behaviour.
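
For reference, the retry value Jonathan mentions is a list of <interval> <count> pairs, where a count of + means "retry indefinitely". Below is a minimal Python sketch of how long a consumer keeps trying before it gives up. The value "60 10 300 3" is a made-up example, not one taken from the attached configs, and the function names are purely illustrative.

```python
def parse_retry(spec):
    """Parse a syncrepl retry value into (interval_seconds, count) pairs.

    A count of None stands for '+', i.e. retry indefinitely.
    """
    fields = spec.split()
    if not fields or len(fields) % 2 != 0:
        raise ValueError("retry must be a list of <interval> <count> pairs")
    pairs = []
    for interval, count in zip(fields[0::2], fields[1::2]):
        pairs.append((int(interval), None if count == "+" else int(count)))
    return pairs


def total_retry_window(spec):
    """Return the seconds spent retrying before giving up, or None if never."""
    total = 0
    for interval, count in parse_retry(spec):
        if count is None:
            return None  # '+' means the consumer never stops retrying
        total += interval * count
    return total


# Hypothetical values, for illustration only.
print(total_retry_window("60 10 300 3"))  # 1500
print(total_retry_window("60 10 300 +"))  # None
```

With a finite final count, an outage longer than that window would leave the consumer idle until slapd is restarted, which is the scenario described above.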

Looking at my configs again, I still see only one contextCSN value on server3, and two on server1.
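
To make that comparison concrete: each contextCSN value has the form timestamp#count#SID#mod, with the last three fields in hex, so the third field says which server ID the change came from. Here is a rough Python sketch that lists the server IDs one server has recorded and the other has not. The CSN values are invented for illustration, and I am assuming the serverIDs are 1 and 3, which may not match the real setup.

```python
from collections import namedtuple

CSN = namedtuple("CSN", "timestamp count sid mod")


def parse_csn(value):
    """Split a contextCSN string into its four '#'-separated fields."""
    parts = value.split("#")
    if len(parts) != 4:
        raise ValueError("expected timestamp#count#sid#mod, got %r" % value)
    timestamp, count, sid, mod = parts
    # count, sid and mod are hexadecimal in the CSN string format.
    return CSN(timestamp, int(count, 16), int(sid, 16), int(mod, 16))


def sids_seen(context_csns):
    """Return the sorted server IDs present in a list of contextCSN values."""
    return sorted({parse_csn(v).sid for v in context_csns})


def missing_sids(local, remote):
    """Server IDs that appear in remote's contextCSNs but not in local's."""
    return sorted(set(sids_seen(remote)) - set(sids_seen(local)))


# Invented values that mimic the symptom: two CSNs on server1, one on server3.
server1 = [
    "20100923100000.000000Z#000000#001#000000",
    "20100923110000.000000Z#000000#003#000000",
]
server3 = [
    "20100923110000.000000Z#000000#003#000000",
]

print(sids_seen(server1))              # [1, 3]
print(sids_seen(server3))              # [3]
print(missing_sids(server3, server1))  # [1]
```

If the real values look like this, server3 has never recorded a change originating from server1's ID, which would match what I'm seeing.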

Any suggestions?
Alister

> ==========================================
> Jonathan CLARKE
> ------------------------------------------
> Normation
> 44 rue Cauchy, 94110 Arcueil, France
> ------------------------------------------
> Telephone:  +33 (0)1 83 62 26 96
> ------------------------------------------
> Web:        http://www.normation.com/
> ==========================================
> 

--
Alister Forbes      Work:   +32 2 704 5762    Internal: 322 5762
a@cisco.com    TACSUNS             _.|._.|._ Cisco Systems

Please avoid sending me Word or PowerPoint attachments. See -
http://www.gnu.org/philosophy/no-word-attachments.html