Re: Multimaster further work



Derek Simkowiak wrote:
> The current practice of depending on a single master writer (with
> basic failover) does not scale well enough, in my opinion,

Well, it's "just a master", so scalability doesn't matter that much. You
can leave it alone with the write requests, and if you power it down,
your slaves will have a problem too anyway. :o)

> and furthermore, in mission-critical applications this practice
> provides a single point of failure for the length of time it takes to
> fail over to a backup.

Well, yes, but what fails are just the updates, which will be delayed
for some seconds, up to a minute or so. Service (the "read requests")
isn't necessarily affected directly by that.

> "MAX_LAG_TIME" is defined to be the maximum amount of time that it can
> take for any one master to talk to another master

> But the point is, if we know that MAX_LAG_TIME has passed since the
> last write for a particular DN, then we know that there is no
> conflicting write, because enough time has passed for us to know (for
> sure) that no other conflicting write has occurred.

No. The absolute minimum would be twice MAX_LAG_TIME (a round trip).
Let's say WAIT_TIME is something like twice MAX_LAG_TIME.
Imagine there are two servers in multimaster mode. Server A receives an
update, replicates it to server B and waits WAIT_TIME. After nearly
MAX_LAG_TIME, server B gets an update, replicates it to server A and
also waits WAIT_TIME. Just before server A commits the change to its
backend, it will receive the change from server B. Bingo.
But:
If we don't adjust the wait on server A, server B may get a third update
within its wait time, which server A will receive _after_ it has
committed one of the first two to the backend. That's an unpredictable
situation, and server B is likely (2:3) to behave differently from
server A.
I think we may avoid this by delaying genuine updates until we have
committed the replicated (and conflicting) ones, but I'm not quite sure
about that. Still, it sounds good to keep new updates out until
conflicting updates have been resolved.
That raises another question: how long should they be delayed? I'd say
at least three times MAX_LAG_TIME. That's because servers that receive a
replicated update have to wait WAIT_TIME before they can commit the
changes, and because it may have taken MAX_LAG_TIME until they got the
replica.
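
To make the two-server case above a bit more concrete, here is a minimal
Python sketch of the timing arithmetic. It is only an illustration: the
function name, its parameters, and the choice of 1.0 as the unit for
MAX_LAG_TIME are my assumptions, and it only checks whether server B's
replica reaches server A before A commits.

```python
MAX_LAG_TIME = 1.0  # assumed unit; the real value depends on the network


def conflict_seen_before_commit(wait_time, t_b_update, lag_b_to_a,
                                t_a_update=0.0):
    """Return True if B's replica reaches A before A commits its own update.

    wait_time   -- how long a server waits before committing (WAIT_TIME)
    t_b_update  -- time at which server B receives its own update
    lag_b_to_a  -- replication delay from B to A (at most MAX_LAG_TIME)
    t_a_update  -- time at which server A received its update
    """
    a_commits_at = t_a_update + wait_time
    b_replica_arrives_at = t_b_update + lag_b_to_a
    return b_replica_arrives_at < a_commits_at


# Near-worst case: B is updated just before A's replica would have
# reached it, and B's own replica takes the full MAX_LAG_TIME to get to A.
t_b = 0.99 * MAX_LAG_TIME

# Waiting only MAX_LAG_TIME: A has already committed, the conflict is missed.
print(conflict_seen_before_commit(MAX_LAG_TIME, t_b, MAX_LAG_TIME))

# Waiting twice MAX_LAG_TIME: A still holds its update when B's arrives.
print(conflict_seen_before_commit(2 * MAX_LAG_TIME, t_b, MAX_LAG_TIME))
```

The third-update case is not covered here; it would need the longer
delay discussed above.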

Well, that's just as far as I have thought it through. (Have to go back
to work now. :o)

> 2. Once MAX_LAG_TIME has passed, make an MD5 Digest of the data for
> each conflicting write individually.  This is the write's "score".
> 
> 3. The DN data with the highest score wins.

I like that idea. :o)
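
A rough Python sketch of how I understand the scoring idea, using the
standard hashlib module. How the data of a write is serialized into
bytes is an assumption on my side (it has to be identical on every
master), and the function names are made up for illustration.

```python
import hashlib
from typing import Iterable


def score(write_data: bytes) -> bytes:
    """MD5 digest of one conflicting write; this is the write's "score"."""
    return hashlib.md5(write_data).digest()


def pick_winner(conflicting_writes: Iterable[bytes]) -> bytes:
    """Return the write whose digest compares highest (byte-wise)."""
    return max(conflicting_writes, key=score)


# Two hypothetical conflicting writes for the same DN.
write_a = b"cn=John Doe,mail=john@example.com"
write_b = b"cn=John Doe,mail=jdoe@example.com"

# Every master picks the same winner, whatever order the writes arrived in.
print(pick_winner([write_a, write_b]) == pick_winner([write_b, write_a]))
```

The nice part is that no master needs to ask another one who won: the
result depends only on the data of the writes themselves.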

Generally, I think this topic is quite interesting and rather fun, so
please keep me/us up to date on its progress. Maybe we should also
move the discussion to the -devel list; it's off topic here, and there's
also less traffic there, so I'm less likely to overlook your and others'
messages.


lg,
daniel
-- 
Top 10 Things to say, when you run out of good arguments:
No 10)  I like your idea. Why don't you write up a white paper and we'll
      review it at the next staff meeting?