
Re: OpenLDAP 2.4.15 : CSN too old when using 4-way multimaster



Pierangelo Masarati wrote:
Howard Chu wrote:
Adrien Futschik wrote:
Considering that M1&M3 are on the same server and therefore have exactly
the same time, if this were a time-related problem, I shouldn't get any
"CSN too old" messages between M1&M3 or between M2&M4, should I?

I have also noticed that when M1 gets a new entry and passes it to
M2&M3&M4, they in turn pass it on to M2&M3&M4 when they receive it! I
don't understand why this happens, but it looks very much like this is
what is happening, because sometimes M2 has passed it to M4 before M4 has
actually received the add request from M1.

I therefore happened to notice that entries sent from M1 are sometimes
received in the wrong order by the other masters, and therefore some
entries may be skipped!

Yes, that makes sense. The CSN check assumes changes will always be
received in the same order they were sent from the provider. Obviously
in this case this assumption is wrong. You should submit an ITS for this.
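
To make the failure mode concrete, here is a minimal sketch of an ordering check of this kind. It is a simplified model and not slapd's actual code: the function name, the per-origin bookkeeping and the shortened CSN strings are all assumptions made for illustration. It only shows how a change that arrives after a newer one from the same origin ends up skipped.

```python
# Hypothetical model: a consumer that rejects any change whose CSN is not
# newer than the newest CSN it has already accepted from the same origin.
def apply_in_order(changes):
    newest = {}                 # origin server -> newest CSN accepted so far
    accepted, skipped = [], []
    for origin, csn, entry in changes:
        # CSN strings start with a fixed-width timestamp, so plain string
        # comparison orders them chronologically.
        if csn <= newest.get(origin, ""):
            skipped.append(entry)       # would be logged as "CSN too old"
        else:
            newest[origin] = csn
            accepted.append(entry)
    return accepted, skipped


# Two adds made on M1, in CSN order (shortened, illustrative CSN values).
add1 = ("M1", "20090318120000.000001Z", "uid=a")
add2 = ("M1", "20090318120000.000002Z", "uid=b")

# Delivered in the order they were made: both are applied.
in_order = apply_in_order([add1, add2])

# add2 reaches the consumer via another master before add1 arrives directly.
reordered = apply_in_order([add2, add1])
```

In the second call the consumer has already recorded the CSN of add2 when add1 shows up, so add1 is dropped even though the consumer never had it.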

This problem was discussed on the -devel list back in 2007; the code
ought to be using a spanning tree/routing algorithm to ensure that when
multiple routes exist for propagating a change, the change is delivered
exactly once. Unfortunately no one has spent any further time on this
issue since then.

But the CSN is supposed to guarantee that, regardless of the order in which changes arrive, servers converge to the same state. In fact, if entries are received in a different order but carry an entryCSN attribute that is newer, the newer one should take precedence (and be propagated further through slapo-syncprov if MMR); if it is identical or older, it should be ignored (and not propagated). If the incoming modification implies something odd, like a missing parent, glue entries should be created, to be replaced by the right entry as soon as it comes in.
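
A rough sketch of that per-entry rule, assuming each update carries the entry's DN, its entryCSN and the new content. The `merge` helper and the sample values are hypothetical, and glue entries are not modelled. It tries every delivery order of three updates to one entry and records the resulting state.

```python
from itertools import permutations


def merge(store, dn, entry_csn, value):
    """Keep the version with the newest entryCSN.

    Returns True when the store changed, i.e. when the update would also
    be propagated further; identical or older CSNs are ignored.
    """
    current = store.get(dn)
    if current is not None and entry_csn <= current[0]:
        return False
    store[dn] = (entry_csn, value)
    return True


# Illustrative updates to one entry, written on two different masters.
updates = [
    ("uid=a", "20090318120000.000001Z#000000#001#000000", "v1"),
    ("uid=a", "20090318120000.000002Z#000000#002#000000", "v2"),
    ("uid=a", "20090318120000.000003Z#000000#001#000000", "v3"),
]

final_states = set()
for order in permutations(updates):
    store = {}
    for dn, csn, value in order:
        merge(store, dn, csn, value)
    final_states.add(tuple(sorted(store.items())))
```

Every permutation ends in the same single state, holding the version with the newest entryCSN. This covers only the entry-level comparison, not what is advertised in the sync cookie.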

Right, but you're talking about the entryCSN of a replicated entry, and not the CSN that was sent in the sync cookie. The two don't have to be the same, particularly if there are a lot of writes active on the provider.


When a consumer accepts an out-of-order cookie CSN, any other consumers cascaded off it will receive incomplete data. (The consumer claims to be up to date as of revision X, but it is in fact missing revision X-1.)

 In MMR, assuming
perfect symmetry, we could do something like ignoring entries that come
from a provider with the entryCSN generated by another provider, under
the assumption we will eventually get it from the right provider?  Or
better (and symmetrical) do not propagate entries whose CSN was not
generated by ourselves, under the assumption the one that generated the
CSN will propagate it?

That assumes a fully connected star topology. Such a layout won't scale; the intention is for this to work even with irregular topologies. E.g.


   A - B       G
   |   |       | \
   C - D - E - F - H
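
As a rough illustration using the layout above, the sketch below counts how many times each server receives a change made on A under three policies: only the originator propagates, every server forwards what it receives, and forwarding restricted to a spanning tree. The edge list is read off the diagram. The function names and the choice of a breadth-first tree are assumptions for illustration, not a description of any existing implementation.

```python
from collections import deque

# Links read off the diagram above.
EDGES = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"),
         ("E", "F"), ("F", "G"), ("F", "H"), ("G", "H")]


def build_graph(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def origin_only(graph, origin):
    """Only the server that generated the change sends it on."""
    deliveries = {node: 0 for node in graph}
    for peer in graph[origin]:
        deliveries[peer] += 1
    return deliveries


def naive_flood(graph, origin):
    """Each server forwards a change to all peers except the sender,
    the first time it sees that change."""
    deliveries = {node: 0 for node in graph}
    seen = {origin}
    queue = deque((origin, peer) for peer in sorted(graph[origin]))
    while queue:
        sender, node = queue.popleft()
        deliveries[node] += 1
        if node not in seen:
            seen.add(node)
            queue.extend((node, p) for p in sorted(graph[node]) if p != sender)
    return deliveries


def spanning_tree_flood(graph, origin):
    """Forward only along the edges of a breadth-first tree rooted at origin."""
    parent = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in sorted(graph[node]):
            if peer not in parent:
                parent[peer] = node
                queue.append(peer)
    # Each non-root node receives the change once, from its tree parent.
    return {node: 0 if via is None else 1 for node, via in parent.items()}


graph = build_graph(EDGES)
direct = origin_only(graph, "A")
flooded = naive_flood(graph, "A")
tree = spanning_tree_flood(graph, "A")
```

With the change originating on A, `direct` reaches only B and C, `flooded` reaches everyone but delivers duplicates wherever the graph has a cycle (D, for instance, hears it from both B and C), and `tree` delivers it to every other server exactly once.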

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/