
Re: Mirror Mode, MMR and replicas

Quanah Gibson-Mount wrote:
A number of our clients have requested "fail-over"/redundancy capabilities
for the LDAP master, and as I'm currently working on moving our product to
use OpenLDAP 2.4, this becomes a distinct possibility.  However, I have
some questions about the viability/reliability/effectiveness of using
multiple masters combined with replicas.  I don't see these answered in the
Admin Guide.

You mean, setting up regular read-only replicas slaved to the masters?

I'll start with replication under MMR.

As I understand it, the replicas can only point at a single master.  So, if
I have a 2 master MMR setup, I assume I would want to point half my
replicas at master A and the other half at master B for their updates.
This leads to a problem in my mind, in that if master A goes down, then
half of my replica pool is now going to remain completely out of sync with
the remaining master until master A is recovered.  Throwing a Load balancer
in front of the two masters, and pointing the replicas at that instead, is
not a viable option because the two masters may be getting updates in a
different sequence, so if a replica disconnects from the LB and then
reconnects, the updates it could get fed from whatever master the LB is
pointing at could lead to inconsistencies.

What inconsistencies? Each master's changes are stamped with its own sid. Any consumer is going to know about the contextCSNs of each master it talks to.
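
To make the point about per-master stamps concrete, here is a rough Python sketch. It assumes the CSN text layout `timestamp#count#sid#mod`; the names `parse_csn`, `ConsumerState` and `replay` and the sample CSN values are invented for illustration and are not slapd's actual code. It only shows that a consumer tracking the newest CSN per SID can be fed the same changes in a different order by a second master without applying any of them twice.

```python
from dataclasses import dataclass, field


def parse_csn(csn: str):
    """Split a CSN of the assumed form 'time#count#sid#mod' into (sid, sort_key)."""
    time, count, sid, mod = csn.split("#")
    # Fields are fixed-width, so comparing them as strings preserves order.
    return int(sid, 16), (time, count, mod)


@dataclass
class ConsumerState:
    # Hypothetical model of a consumer: newest CSN seen, keyed by server ID.
    context: dict = field(default_factory=dict)

    def is_new(self, csn: str) -> bool:
        sid, key = parse_csn(csn)
        seen = self.context.get(sid)
        return seen is None or key > seen

    def record(self, csn: str) -> None:
        sid, key = parse_csn(csn)
        if self.is_new(csn):
            self.context[sid] = key


def replay(state: ConsumerState, changes):
    """Apply only the changes this consumer has not already seen."""
    applied = []
    for csn in changes:
        if state.is_new(csn):
            state.record(csn)
            applied.append(csn)
    return applied


# Made-up sample stamps: two writes on master 001, one on master 002.
A1 = "20080901120000.000000Z#000000#001#000000"
B1 = "20080901120001.000000Z#000000#002#000000"
A2 = "20080901120002.000000Z#000000#001#000000"
```

With these values, a consumer that took `A1` from the first master and then reconnects to the second, which offers `B1, A1, A2`, applies `B1` and `A2` and skips the `A1` it already has.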

Neither of these seems like a
good option.  I don't see a good solution here to resolve this issue,
either, unless the replica could somehow know which master it had been
talking to,

The replica always knows which master it's talking to...

and drop into refresh mode if it found itself talking to a new master.

Drop into refresh mode? Obviously in persist mode the consumer keeps a connection open to a specific master; a load balancer can't move an open connection. So if a particular master disappears, all of its clients are going to lose their connections, and any consumers set up to retry are going to have to initiate new sessions. And every new replication session starts with a refresh phase. So this recovery is already automatic; it always has been.
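
The sequence described above can be modelled in a few lines of Python. This is only a toy simulation under stated assumptions: `Master`, `LoadBalancer` and `Consumer` are hypothetical classes, the load balancer simply hands out the first master that is up, and "refresh" and "persist" are recorded as labels rather than performed.

```python
class Master:
    def __init__(self, name):
        self.name = name
        self.up = True


class LoadBalancer:
    """Hypothetical LB: hands out the first master that is currently up."""

    def __init__(self, masters):
        self.masters = masters

    def connect(self):
        for master in self.masters:
            if master.up:
                return master
        raise ConnectionError("no master available")


class Consumer:
    def __init__(self, lb):
        self.lb = lb
        self.master = None
        self.log = []  # (master name, phase) in the order they happened

    def start_session(self):
        # Every new session begins with a refresh phase, then persists.
        self.master = self.lb.connect()
        self.log.append((self.master.name, "refresh"))
        self.log.append((self.master.name, "persist"))

    def poll(self):
        # A lost connection is noticed here and answered with a new session.
        if self.master is None or not self.master.up:
            self.start_session()


a, b = Master("A"), Master("B")
consumer = Consumer(LoadBalancer([a, b]))
consumer.poll()   # first session, against A
a.up = False      # A disappears; the open connection is gone
consumer.poll()   # retry: new session against B, refresh first
```

After the second `poll()`, `consumer.log` shows a refresh against B before the consumer goes back to persist, with no special handling for the change of master.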

I'm also not clear on what happens if your replicas are
delta-syncrepl based, rather than normal syncrepl, in the LB setup.

Not possible. Current delta-sync requires all updates to be logged in order; in an MMR setup you can't guarantee order so *nobody* can use delta-sync in this scenario.
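
As a rough illustration of why ordering matters for a log of deltas, consider this Python sketch. The function `apply_log`, the tuple format for a logged change, and the sample addresses are all invented for the example; this is not the accesslog schema. Two masters each accept one write to the same attribute and log the two writes in the order they saw them.

```python
def apply_log(entry, log):
    """Replay a list of (op, attr, value) deltas onto a copy of entry."""
    result = {attr: list(values) for attr, values in entry.items()}
    for op, attr, value in log:
        if op == "replace":
            result[attr] = [value]
        elif op == "add":
            result.setdefault(attr, []).append(value)
        elif op == "delete":
            result[attr].remove(value)  # raises if the value is absent
    return result


entry = {"mail": ["old@example.com"]}
write_on_a = ("replace", "mail", "a@example.com")
write_on_b = ("replace", "mail", "b@example.com")

# Each master logs its own write first, then the one replicated from its peer.
log_on_a = [write_on_a, write_on_b]
log_on_b = [write_on_b, write_on_a]

from_a = apply_log(entry, log_on_a)
from_b = apply_log(entry, log_on_b)
```

A consumer that blindly replays `log_on_a` ends with a different `mail` value than one that replays `log_on_b`, which is the kind of divergence the in-order requirement is there to prevent.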

For Mirror Mode, I would assume you could point the replicas at the LB
fronting the two masters, since only one master is ever receiving changes.
I also assume delta-syncrepl would be a completely valid option for
replication to the replicas, again because only one master is getting the
updates, so all updates would be logged in the same sequence on both
servers.  However, I don't know if this is correct or not, or if there are
limitations here I haven't considered.  When I was first pondering this on
the #openldap-devel channel in IRC, Matt Backes made a comment about
delta-syncrepl not working with Mirror Mode.

For MirrorMode, delta-sync should work since there is only ever one source of changes, and they will be logged in order. There is a window of vulnerability where a server crashes after committing changes to its accesslog, before it replicates them to the mirror. Those changes will be temporarily lost, and create a gap in the mirror's log. When the original server comes back up, the mirror will receive those lost changes, but the strict ordering of its log will be broken. In this case, though, the delta-sync consumer will be fine: if the lost changes caused no conflicts, they will simply be committed. If they do cause a conflict, the consumer will just fall back to refresh mode and the conflicts will be erased.
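
The "commit if clean, otherwise fall back to refresh" behaviour can be sketched in Python as follows. This is a simplified model under my own assumptions, not how slapd implements it: databases are plain dicts keyed by DN, `Conflict`, `apply_delta` and `sync` are hypothetical names, and a "refresh" is modelled as copying the provider's current content wholesale.

```python
class Conflict(Exception):
    """Raised when a logged change cannot be applied to the local data."""


def apply_delta(db, change):
    op, dn, attrs = change
    if op == "add":
        if dn in db:
            raise Conflict(dn)
        db[dn] = dict(attrs)
    elif op == "modify":
        if dn not in db:
            raise Conflict(dn)
        db[dn].update(attrs)
    elif op == "delete":
        if dn not in db:
            raise Conflict(dn)
        del db[dn]


def sync(consumer_db, log, provider_db):
    """Replay the log; on the first conflict, resync from the provider instead."""
    for change in log:
        try:
            apply_delta(consumer_db, change)
        except Conflict:
            # Fall back: discard local state and take the provider's content.
            consumer_db.clear()
            consumer_db.update({dn: dict(a) for dn, a in provider_db.items()})
            return "refresh"
    return "delta"


provider = {"cn=a": {"sn": "x"}}
consumer = {}
first = sync(consumer, [("add", "cn=a", {"sn": "x"})], provider)   # clean replay
second = sync(consumer, [("add", "cn=a", {"sn": "y"})], provider)  # late, conflicting
```

Here the first log replays cleanly as deltas, while the late change collides with an entry that already exists, so the consumer ends up with the provider's content via the refresh path.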

So, basically, I'm at a loss, if my understanding of things is correct, on how
I provide a consistent replicated environment for my customers, while also
providing master/master failover.

This appears to have been a -software question, not a -devel question. Perhaps you should summarize back to the -software list and end this thread here.
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/