
Re: contextCSN of subordinate syncrepl DBs

Rein Tollevik wrote:
> I've been trying to figure out why syncrepl used on a backend that is 
> subordinate to a glue database with the syncprov overlay should save the 
> contextCSN in the suffix of the glue database rather than the suffix of 
> the backend where syncrepl is used.  But all I come up with are reasons 
> why this should not be the case.  So, unless anyone can enlighten me as 
> to what I'm missing, I suggest that this be changed.
>
> The problem with the current design is that it makes it impossible to 
> reliably replicate more than one subordinate db from the same remote 
> server, as there are now race conditions where one of the subordinate 
> backends could save an updated contextCSN value that is picked up by the 
> other before it has finished its synchronization. An example of a 
> configuration where more than one subordinate db replicated from the 
> same server might be necessary is the central master described in my 
> previous posting in 
> http://www.openldap.org/lists/openldap-devel/200806/msg00041.html
>
> My idea for verifying this race condition was to add enough entries to
> one of the backends (while the consumer was stopped) to make it
> possible to restart the consumer after the first backend had saved the
> updated contextCSN but before the second had finished its
> synchronization.  But I was able to reproduce it by simply adding or
> deleting an entry in one of the backends before starting the consumer.
> Far too often, the backend without any changes was able to pick up and
> save the updated contextCSN from the producer before syncrepl on the
> second backend fetched its initial value.  I.e., the second backend
> started with an updated contextCSN and didn't receive the changes that
> had taken place on the producer.  If each syncrepl instance stored its
> value in the suffix of its own database, they wouldn't interfere with
> each other like this.
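
The interleaving described above can be replayed in a toy model. This is
only a sketch, not slapd code: the function replay(), the integer CSNs, the
names db_a/db_b, and the "shared" slot are all hypothetical stand-ins for
two subordinate consumer databases and the place where each saves its
contextCSN.

```python
# Hypothetical model of the race; none of these names are slapd internals.

def replay(shared_store: bool) -> dict:
    """Replay the interleaving and return the changes each consumer DB applied."""
    producer_csn = 2                       # producer made one change (CSN 2) in db_b
    pending = {"db_a": [], "db_b": ["change@2"]}

    # Saved contextCSN values: one shared slot (superior suffix) or one per DB.
    store = {"shared": 1} if shared_store else {"db_a": 1, "db_b": 1}

    def slot(db: str) -> str:
        return "shared" if shared_store else db

    applied = {"db_a": [], "db_b": []}

    # Step 1: db_a has nothing to fetch, finishes first, saves the producer's CSN.
    store[slot("db_a")] = producer_csn

    # Step 2: only now does db_b read its initial cookie and ask for newer changes.
    cookie = store[slot("db_b")]
    if cookie < producer_csn:
        applied["db_b"] = pending["db_b"]
        store[slot("db_b")] = producer_csn
    return applied
```

With replay(True) the second database starts from the already-updated value
and applies nothing; with replay(False) it still sees its own older value
and fetches the pending change.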


> There is a similar problem in syncprov, as it must use the lowest 
> contextCSN value (with a given sid) saved by the syncrepl backends 
> configured within the subtree where syncprov is used.  But to do that it 
> also needs to distinguish the contextCSN values of each syncrepl 
> backend, which it can't do when they all save them in the glue suffix.
>
> This also implies that syncprov must ignore contextCSN updates from
> syncrepl until all syncrepl backends have saved a value, and that
> syncprov on the provider must send newCookie sync info messages when it
> updates its contextCSN value for a change to an entry that isn't being
> replicated to a consumer.  I.e., as outlined in the message referred to
> above.

Then (at least) at server startup time syncprov must retrieve the contextCSNs
from all of its subordinate DBs. Perhaps a subtree search with filter
"(contextCSN=*)" would suffice; this would of course require setting a
presence index on this attribute to run reasonably. (Or we can add a glue
function to return a list of the subordinate suffixes or DBs...)
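
As a rough illustration of what syncprov would do with the values once they
are collected, here is a minimal sketch of the "lowest contextCSN per SID"
rule from the quoted message. It assumes CSNs of the fixed-width form
timestamp#count#sid#mod, so that plain string comparison orders them; the
function names and the per_db mapping (subordinate suffix to the contextCSN
values found there) are hypothetical, and how the values are actually
retrieved is left out.

```python
from typing import Dict, List, Optional


def csn_sid(csn: str) -> str:
    """Return the SID field of a CSN assumed to look like time#count#sid#mod."""
    return csn.split("#")[2]


def lowest_per_sid(per_db: Dict[str, List[str]]) -> Optional[Dict[str, str]]:
    """Return the lowest CSN seen for each SID across all subordinate DBs.

    Returns None while any subordinate DB has not saved a value yet, which
    mirrors the suggestion to ignore updates until every backend has one.
    """
    if any(not csns for csns in per_db.values()):
        return None
    lowest: Dict[str, str] = {}
    for csns in per_db.values():
        for csn in csns:
            sid = csn_sid(csn)
            # Fixed-width CSNs are assumed to sort correctly as strings.
            if sid not in lowest or csn < lowest[sid]:
                lowest[sid] = csn
    return lowest
```

For example, given one subordinate at 20080612100000.000000Z#000000#001#000000
and another at 20080612090000.000000Z#000000#001#000000, the value kept for
SID 001 is the earlier one.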

By the way, please use "subordinate database" and "superior database" when
discussing these things; "glue database" is too ambiguous.

  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/