
Relaxed mode for delta-sync MMR

To: openldap-devel@openldap.org
Subject: Relaxed mode for delta-sync MMR
From: Quanah Gibson-Mount <quanah@zimbra.com>
Date: Wed, 10 Jun 2015 13:04:12 -0700

After deploying delta-sync MMR at several customer sites, I have found that the general handling of conflict resolution in MMR mode is significantly suboptimal, and routinely causes the MMR nodes to get further out of sync, worsening things significantly (mainly due to ITS#8125).


The main issues I see are the following:

a) Two masters get different change requests at approximately the same time to add a value X to an attribute.

b) Two masters get different change requests at approximately the same time to delete a value X from an attribute.

In these two specific cases, in relaxed mode, rather than falling back and re-syncing the entire database, I think the conflict should be discarded (skipped), and logged as such. I.e., there is no actual discrepancy in the object: it still has X present in the add case, and X gone in the delete case.
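
As a rough illustration of the relaxed handling proposed above, here is a minimal Python sketch. It is not slapd code and does not use any OpenLDAP API: the `apply_relaxed` function, the `Outcome` enum, and the dict-of-sets entry model are all hypothetical, chosen only to show the skip-and-log decision for cases (a) and (b).

```python
from enum import Enum


class Outcome(Enum):
    APPLIED = "applied"
    SKIPPED = "skipped"  # conflict discarded and logged; no resync triggered


def apply_relaxed(entry, op, attr, value, log=print):
    """Apply a replicated single-value add or delete to `entry`.

    `entry` is a dict mapping attribute names to sets of values.
    Hypothetical helper for illustration only, not an OpenLDAP API.
    """
    values = entry.get(attr, set())

    if op == "add":
        if value in values:
            # Case (a): the other master already added X, so the entry
            # already matches the intended end state.
            log(f"relaxed: skipped add of {attr}={value!r} (already present)")
            return Outcome.SKIPPED
        entry.setdefault(attr, set()).add(value)
        return Outcome.APPLIED

    if op == "delete":
        if value not in values:
            # Case (b): the other master already deleted X.
            log(f"relaxed: skipped delete of {attr}={value!r} (already absent)")
            return Outcome.SKIPPED
        values.remove(value)
        if not values:
            # Drop the attribute once its last value is gone.
            del entry[attr]
        return Outcome.APPLIED

    raise ValueError(f"unsupported op: {op!r}")
```

In both skipped cases the entry already has the state the incoming change asked for, which is why nothing beyond a log line is needed.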

At best, if we're going to do fallback, then we should only look at resyncing the specific entry. The overall behavior I'm seeing from OpenLDAP is that the masters get into an endless cycle of re-sync, and the more they do so, the more out of sync they become, leading to a point at which you have to stop all masters, export all their DBs, sort them, find the entries missing between all sets of masters, and build a brand new DB with which to reload them, until they get massively out of sync again. I.e., the current strategy of resync is doing no favors to anyone. It may work OK on very small DBs, where a resync only takes seconds, but on larger DBs, where such syncs take 30+ minutes to hours, it is not a useful methodology.
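
The "find missing entries between all sets of masters" step of that manual recovery boils down to a set comparison. A minimal sketch follows, assuming each master's export has already been reduced to a set of normalized DNs. The `missing_entries` function and its input shape are hypothetical, and it only detects entries that are absent, not entries whose contents differ.

```python
def missing_entries(exports):
    """Report, per master, the DNs present on some other master but absent here.

    `exports` maps a master's name to the set of DNs found in its export.
    Hypothetical helper for illustration only.
    """
    all_dns = set().union(*exports.values())
    return {name: sorted(all_dns - dns) for name, dns in exports.items()}
```

For example, given `{"m1": {"uid=a,dc=x", "uid=b,dc=x"}, "m2": {"uid=a,dc=x"}}`, the result reports `uid=b,dc=x` as missing on `m2` and nothing missing on `m1`.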


--Quanah

--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra ::  the leader in open source messaging and collaboration
