
RE: Replication questions (was: Mirror Mode)



 
>> I suppose the *real* solution is to use the multi-mastering capability
>> in 2.4 to keep it in sync, but use it as if it's mirror mode (i.e. all
>> writes to a single master, with the second as a hot standby), with the
>> MM conflict resolution kicking in if needed because someone wrote to the
>> hot standby when they shouldn't have.
>
> That's our preferred/recommended usage. As I read somewhere else recently,
> "the best solution is not to have problems." Conflict resolution is messy;
> it's best to avoid it...

Agreed - having a single "active" master and a "hot"/active-but-unused
standby master solves most HA issues without introducing the conflicts a
full active-active multimaster setup creates.  But if that standby is
there and accepts writes, it's inevitable that someone will someday
write to it out of ignorance, and *may* write a conflicting change, so I
see conflict resolution as a last-ditch fallback for that situation (and
nothing more) to prevent corruption or breakage of replication. (Plus, I
like to close up, or at least be fully aware of, all the edge cases that
exist, so I know how best to avoid them :) ).

You said at one point that OpenLDAP (2.4.6?) currently does entry-level
conflict resolution, and does not do attribute-level conflict resolution
yet - i.e. if an entry is updated on two separate servers with different
updates, conflicting or not, the most recently changed version of the
*entry* wins.  If I change the cn on one master, and after that (but
before replication has occurred) I change the userPassword on another
master, then once the sync-up occurs I won't see the entry with both the
cn and the password changed on all servers; I'll see the entry as it is
on the master most recently changed (i.e. in my example, I'll see a
changed password, but the cn will revert). Is there a roadmap/timeline
for doing attribute-level conflict resolution?
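
To make the scenario concrete, here's a rough sketch (hypothetical entry
and values) of the two changes I have in mind, each applied to a
different master before they've synced:

    # Applied on master A first:
    dn: uid=jdoe,ou=people,dc=example,dc=com
    changetype: modify
    replace: cn
    cn: John Q. Doe
    -

    # Applied on master B a moment later, before replication catches up:
    dn: uid=jdoe,ou=people,dc=example,dc=com
    changetype: modify
    replace: userPassword
    userPassword: {SSHA}newpasswordhash
    -

If I understand the entry-level behaviour correctly, master B's whole
copy of the entry (the newer entryCSN) wins everywhere, so the cn change
from master A is silently lost rather than merged.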

Also, I was looking at the admin guide and the syncprov man pages for how
to set up replication.  The N-way multi-mastering details are kinda
sparse :).  Is there any documentation elsewhere on setting this up?
Or... is the setup exactly the same as setting up Mirror-mode (per
2.3.x), with the 2.4.x code just automatically doing conflict resolution
(i.e. was mirror-mode a 2.3 feature, with multimaster transparently
replacing it in 2.4 by adding conflict resolution to mirror-mode, using
the same setup)?
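
For what it's worth, here's roughly the per-master config I've pieced
together from the admin guide and slapd.conf(5) - hostnames, DNs, and
credentials are placeholders, and I'm not certain it's complete:

    # Global section: each master gets a distinct serverID
    serverID        1

    database        bdb
    suffix          "dc=example,dc=com"
    rootdn          "cn=manager,dc=example,dc=com"

    # Act as a sync provider so the other master (and the slaves)
    # can pull from this one
    overlay         syncprov
    syncprov-checkpoint 100 10
    syncprov-sessionlog 100

    # Pull changes from the other master
    syncrepl        rid=001
                    provider=ldap://master2.example.com
                    type=refreshAndPersist
                    searchbase="dc=example,dc=com"
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=com"
                    credentials=secret
                    retry="5 10 300 +"

    mirrormode      on

...with serverID 2 and provider=ldap://master1.example.com on the other
box.  Is that all there is to it, or does N-way need something more?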

Is it possible for a consumer to replicate from multiple masters?  I'm
thinking along the lines of a master server at each of two locations (for
HA/DR purposes), plus multiple read-only slave consumers at each
location.  My first thought is that those slave servers point to the
local master, but if that master goes down, the slaves under it stop
getting updates.  My second thought is to have a load balancer at each
site that directs all traffic hitting a "master ldap" VIP to the primary
master if it's up, or to the secondary master if the primary is
unavailable.  But... (I'm still absorbing syncrepl and RFC 4533) will all
the contextCSNs and cookies and so forth match up well enough to allow
this kind of failover for *syncrepl*?  Is it possible, and what's the
best way to set this up, so that I have multiple masters for DR purposes
and the failure of any single master doesn't cause some subset of my
read-only slave consumers to stop getting updates?
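
To be explicit about that second thought, each read-only slave would
carry something like the following (placeholder names; the VIP fronts
both masters, and whether the cookie/contextCSN bookkeeping survives
being bounced between them is exactly the part I'm unsure about):

    # On each read-only slave consumer
    syncrepl        rid=010
                    provider=ldap://ldap-master-vip.example.com
                    type=refreshAndPersist
                    searchbase="dc=example,dc=com"
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=com"
                    credentials=secret
                    retry="5 10 300 +"

    # Refer writes back through the master VIP
    updateref       ldap://ldap-master-vip.example.com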

Syncrepl (in refreshAndPersist mode), as I understand it, generally has
the slave consumer contacting the master server, retrieving the changes
made since the last time it was running (refresh), and then leaving a
persistent search running that receives changed entries from the master
as they happen (persist), so replication is near real-time.  If the
master server crashes and is restarted, or the connection is
broken/dropped (common if a load balancer is in between), how well does
the consumer detect this and reconnect, or do the consumers tend to need
a restart after this occurs?  (This is a broken/dropped connection, *not*
one cleanly closed by a clean master shutdown or an idle timeout, and
many apps have trouble detecting this - the client still thinks it has a
valid TCP connection, but nothing is coming over it, so it never gets new
updates.  Does the consumer send keepalive packets or anything else that
would cause it to realize the connection has died and to reconnect?)
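
I did spot the retry= parameter on the syncrepl directive, which looks
like it governs how the consumer re-attempts the provider connection
after it drops, though I don't know whether it helps with the
half-dead-but-still-open TCP case.  Something like:

    # Retry every 5 seconds 10 times, then every 300 seconds forever ("+")
    syncrepl        rid=001
                    provider=ldap://master1.example.com
                    type=refreshAndPersist
                    searchbase="dc=example,dc=com"
                    retry="5 10 300 +"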

When initializing a consumer from an LDIF backup of the master, should
this be a slapcat export, so that it includes everything needed to
support syncrepl (such as contextCSN, entryUUIDs, etc.)?
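
In case it matters, this is what I was planning (paths and suffix are
placeholders):

    # On the master: slapcat dumps the operational attributes too
    # (entryUUID, entryCSN, contextCSN), which a plain ldapsearch export would not
    slapcat -f /etc/openldap/slapd.conf -b "dc=example,dc=com" -l master.ldif

    # On the new consumer, with slapd stopped:
    slapadd -f /etc/openldap/slapd.conf -l master.ldif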

Thanks,
 - Jeff