
RE: Replication not working



I think using delta-syncrepl is good advice. I've tried it out and don't see any of the problems I've seen with plain syncrepl.
However, I am having problems upgrading my production system from plain syncrepl to delta-syncrepl. We can't afford downtime on the production system, so I've been testing the upgrade in a dev environment, and so far I haven't managed a clean upgrade.
The production system has 4 LDAP servers set up for MMR syncrepl, though in reality only one of the servers is actively used. The others are currently only for failover in case the active server fails (this may well change in the future).
The upgrade has to create a cn=accesslog db and syncprov overlays, and modify the syncrepl attributes of the main db to use the accesslog (cn=config still uses plain syncrepl).
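For illustration, the modified syncrepl values for a four-server MMR setup could be generated along these lines. This is only a sketch under stated assumptions: the host names, bind DN, password, search base and rid numbering are placeholders and not taken from the production system. The logbase, logfilter and syncdata=accesslog parameters are the ones the OpenLDAP Admin Guide uses in its delta-syncrepl example.

```python
def delta_syncrepl_value(rid, provider, searchbase, binddn, credentials):
    """Build one syncrepl value that reads changes from cn=accesslog."""
    parts = [
        f"rid={rid:03d}",
        f"provider={provider}",
        "bindmethod=simple",
        f'binddn="{binddn}"',
        f"credentials={credentials}",
        f'searchbase="{searchbase}"',
        # The next two parameters point the consumer at the access log.
        'logbase="cn=accesslog"',
        'logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"',
        "type=refreshAndPersist",
        'retry="60 +"',
        # syncdata=accesslog is what switches the consumer to delta-syncrepl.
        "syncdata=accesslog",
    ]
    return " ".join(parts)


# Hypothetical providers: one value per server in the MMR group.
providers = [f"ldap://ldap{i}.example.com" for i in range(1, 5)]
values = [
    delta_syncrepl_value(
        rid=i,
        provider=p,
        searchbase="dc=example,dc=com",              # placeholder suffix
        binddn="cn=replicator,dc=example,dc=com",    # placeholder bind DN
        credentials="secret",                        # placeholder password
    )
    for i, p in enumerate(providers, start=1)
]
```

Each string in values would become one value of the syncrepl attribute on the main db; the existing plain-syncrepl values for cn=config stay as they are.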
The first problem I get is that when the creation of the cn=accesslog db and syncprov overlays is replicated, the changes are applied out of order: syncrepl tries to create the syncprov overlay before the accesslog db to which it refers has been created, and this breaks replication of the changes.
I can work round this by creating the accesslog db, waiting for that to replicate and then creating the syncprov overlay, but this is still annoying and complicates the upgrade process unnecessarily.
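The staged workaround above can be sketched as a small driver that applies parent entries before their children and waits for each one to replicate before moving on. Everything here is an assumption for illustration: the entry DNs (including the {2}mdb database index and backend), and the apply_entry and is_replicated callbacks, which stand in for whatever ldapadd and ldapsearch steps are actually used.

```python
import time


def dn_depth(dn):
    """Count RDN components; assumes the DN contains no escaped commas."""
    return len(dn.split(","))


def parents_first(dns):
    """Order entries so a database entry precedes the overlays beneath it."""
    return sorted(dns, key=dn_depth)


def wait_until(check, timeout=60.0, interval=1.0):
    """Poll check() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def staged_apply(dns, apply_entry, is_replicated, timeout=60.0):
    """Apply entries one at a time, waiting for each to reach the peers.

    apply_entry(dn) and is_replicated(dn) are hypothetical callbacks
    supplied by the caller.
    """
    for dn in parents_first(dns):
        apply_entry(dn)
        if not wait_until(lambda: is_replicated(dn), timeout=timeout):
            raise TimeoutError(f"{dn} did not replicate in time")


# Hypothetical cn=config entries, deliberately listed in the wrong order.
entries = [
    "olcOverlay=syncprov,olcDatabase={2}mdb,cn=config",
    "olcDatabase={2}mdb,cn=config",
]
ordered = parents_first(entries)
```

With real callbacks, staged_apply(entries, apply_entry, is_replicated) would create the accesslog db first and only add the syncprov overlay once every peer reports the db entry.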
I see the next problem when modifying the syncrepl attributes to refer to accesslog, and so far I can't work round it consistently and confidently enough to try it out in production.
Because it is an MMR setup, there is one syncrepl attribute for each server. I can modify one of these attributes, and it replicates with no problem. But as soon as I change the next attribute, one of the servers starts continually logging an error message to the slapd log, indicating that another server requires a refresh. The only way I have been able to cure this is by deleting the main db and the accesslog db and letting replication regenerate them. But this doesn't always seem to work, and in any case it is not really practical in the production environment, where the main db has 5.5 million DNs and takes up nearly 20 GB.
This second problem doesn't happen consistently, but because I don't understand why it happens or how to fix it consistently, I can't go ahead with the production upgrade to delta-syncrepl, which is very frustrating.
We are currently running OpenLDAP 2.4.31, but I do plan to see if 2.4.33 or RE24 behaves better. However, looking at the OpenLDAP sources, I haven't spotted any fixes which look likely to help.
Any ideas?

Chris

> Date: Tue, 15 Jan 2013 09:13:02 -0800
> From: quanah@zimbra.com
> To: beni.anil@gmail.com
> Subject: Re: Replication not working
> CC: openldap-technical@openldap.org
>
> --On Tuesday, January 15, 2013 7:56 PM +0530 anil beniwal
> <beni.anil@gmail.com> wrote:
>
>
> > Even when i tried with blank db
> >
> > it initally started and then stopped.
> >
> > i got errors like
> >
> >
> > dn_callback : entries have identical CSN
> >
> >  syncrepl_entry: rid=111 entry unchanged, ignored
>
> If you continue to ignore my advice to use delta-syncrepl instead of
> standard syncrepl, then you can expect to continue to have problems. Also,
> since you are using MDB, grab the latest OpenLDAP code from RE24.
>
> --Quanah
>
>
> --
>
> Quanah Gibson-Mount
> Sr. Member of Technical Staff
> Zimbra, Inc
> A Division of VMware, Inc.
> --------------------
> Zimbra :: the leader in open source messaging and collaboration
>