
Re: Syncrepl: full sync vs. delta



Quanah Gibson-Mount wrote:
 I recently had to process some 40,000+ modifications through my set
 of directory servers.  Initially, I tested my changes through my dev
 boxes (2.3/HEAD using syncrepl).  This run took approximately 2
 hours, but the slaves were modified essentially as the master was.
 This length of time concerned me somewhat going forward, but on my
 production boxes (slurpd master->slave) it took only 45 minutes.

 I discussed this time disparity with Howard, and he made some
 modifications to syncrepl (full sync mode) that allowed the master to
 take the changes in around 16 minutes.

This is actually the change that Ralf Haferkamp asked for in ITS#3671, to always queue the psearch responses so that the original operation never has to wait for them. Now, for each active psearch a task is submitted to the runqueue to send out the responses. *
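
As a rough illustration of the queuing idea (a toy model, not the slapd code), the sketch below has the write path only enqueue each change for every active psearch and hand a "send" task to a worker thread. All names here (PersistentSearch, apply_write, runqueue_worker, demo) are hypothetical, and the sleep merely stands in for a slow consumer.

```python
import queue
import threading
import time


class PersistentSearch:
    """Hypothetical stand-in for one active persistent search."""

    def __init__(self, name, send_delay=0.01):
        self.name = name
        self.send_delay = send_delay
        self.pending = queue.Queue()  # responses queued by writers
        self.sent = []                # responses actually delivered

    def send_pending(self):
        """Runqueue task: deliver everything queued so far."""
        while True:
            try:
                change = self.pending.get_nowait()
            except queue.Empty:
                return
            time.sleep(self.send_delay)  # simulate a slow consumer
            self.sent.append(change)


def runqueue_worker(runqueue):
    """Run submitted tasks until a None sentinel arrives."""
    while True:
        task = runqueue.get()
        if task is None:
            return
        task()


def apply_write(change, searches, runqueue):
    """The write only queues responses; it never waits for delivery."""
    for ps in searches:
        ps.pending.put(change)
        runqueue.put(ps.send_pending)


def demo(num_changes=20):
    runqueue = queue.Queue()
    worker = threading.Thread(target=runqueue_worker, args=(runqueue,))
    worker.start()
    searches = [PersistentSearch("replica-a"), PersistentSearch("replica-b")]
    start = time.monotonic()
    for i in range(num_changes):
        apply_write(i, searches, runqueue)
    write_time = time.monotonic() - start  # time spent in the write path only
    runqueue.put(None)                     # stop the worker once the queue drains
    worker.join()
    return write_time, [ps.sent for ps in searches]
```

In this model the time spent in apply_write does not depend on send_delay, while each consumer still receives the changes in the order they were written.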


 However, the slaves still took several hours to catch up on these
 modifications, which meant they were out of sync for long periods of
 time.  Howard and I then discussed putting together the syncrepl
 delta method, using the accesslog backend (as previously discussed on
 -devel).  This worked (after some bugs in accesslog were fixed),
 where it took 37 minutes to push the updates through the master.  2
 of the 3 replicas finished within a few seconds of the master, and
 the 3rd slave finished within 5 minutes of the master.

There's obviously a reduction in network traffic from using the accesslog format, but I think the major bottleneck before was the slaves fully re-indexing all of the modified entries. By processing only the deltas, much less re-indexing work was needed, which allowed the slaves to keep up better with the master.
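
To make the re-indexing argument concrete, here is a toy cost model. It is an assumption for illustration, not how back-bdb actually counts work: it treats each (attribute, value) pair as one index key and compares the number of key updates when a whole entry is replaced against the number when only the modified attribute is applied. The function names and the sample entry are hypothetical.

```python
def index_keys(entry, attrs=None):
    """Return the (attribute, value) index keys for an entry.

    If attrs is given, only those attributes are considered.
    """
    return {
        (attr, value)
        for attr, values in entry.items()
        if attrs is None or attr in attrs
        for value in values
    }


def apply_delta(entry, delta):
    """Apply a delta that replaces the values of the listed attributes."""
    new_entry = dict(entry)
    new_entry.update(delta)
    return new_entry


def full_replace_cost(old_entry, new_entry):
    """Toy cost: drop every old index key, then add every new one."""
    return len(index_keys(old_entry)) + len(index_keys(new_entry))


def delta_cost(old_entry, delta):
    """Toy cost: touch only the keys of the attributes named in the delta."""
    old_keys = index_keys(old_entry, set(delta))
    new_keys = index_keys(delta)
    return len(old_keys) + len(new_keys)


old = {
    "cn": ["Jane Doe"],
    "sn": ["Doe"],
    "uid": ["jdoe"],
    "mail": ["jane@example.com"],
    "telephoneNumber": ["555-0100"],
}
delta = {"telephoneNumber": ["555-0199"]}
new = apply_delta(old, delta)

print(full_replace_cost(old, new))  # 10 key updates in this toy model
print(delta_cost(old, delta))       # 2 key updates in this toy model
```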


Of course, the update time on the master slowed down from 16 minutes to 37 minutes because it had to write to the accesslog db. (But compared to the 45 minutes on the production systems running OpenLDAP 2.2, this is still pretty good. Especially considering that 2.2 is just fopen'ing a flat text file and appending records to it, vs all of the work that accesslog on top of back-bdb does.)

 However, the problem with the way we currently have to set up
 delta-syncrepl (via accesslog) is that there is no way for a slave to
 become fully refreshed if its contextCSN is out of date.  It looks
 like this would take an extension to the syncrepl protocol for this
 to be done properly.  Objections? comments?

I wouldn't say "no way" (since I already outlined a way to do it) but it is certainly clunky. I believe it would be cleaner if the persistent log was an integral feature of the sync provider, and it would be more efficient if a delta mode was an integral part of the syncrepl protocol.


There's an obvious problem with the current implementation: it's using the ChangeSequenceNumbers as if they were CommitSequenceNumbers, and that can cause a problem during the Refresh phase if a write op completes out of order.

I believe the only reason a write in back-bdb/hdb would complete out of order is because the transaction was randomly selected to resolve a deadlock, so it had to abort and retry, and some newer op managed to complete in the intervening time. I think there are two possible solutions here:
1) get a new CSN whenever a write op needs to abort and retry.
2) change the deadlock detector strategy to always select the youngest transaction to abort.


I'm not sure that (2) is sufficient by itself, but (1) ought to be.
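
The out-of-order case and option (1) can be sketched with a small simulation. It is a simplified model with hypothetical names: CSNs are plain integers handed out when an op begins, and a consumer is assumed to ask only for changes with a CSN greater than the one in its cookie. It replays one schedule in which op A is picked as the deadlock victim and commits after the newer op B, so it shows only that one case and does not prove that (1) is sufficient in general.

```python
import itertools


def run_schedule(schedule, new_csn_on_retry):
    """Replay (event, op) pairs; return CSNs in the order the ops committed."""
    counter = itertools.count(1)
    csn = {}
    committed = []
    for event, op in schedule:
        if event == "begin":
            csn[op] = next(counter)
        elif event == "abort":
            # Deadlock victim: the op will retry. Option (1) gives it a fresh CSN.
            if new_csn_on_retry:
                csn[op] = next(counter)
        elif event == "commit":
            committed.append(csn[op])
    return committed


def missed_after_cookie(committed, cookie_index):
    """CSNs committed after the cookie was taken that are not greater than it.

    A consumer asking only for CSNs greater than its cookie would never
    see these changes (an assumption of this toy model).
    """
    cookie = max(committed[: cookie_index + 1])
    return [c for c in committed[cookie_index + 1:] if c <= cookie]


# A begins first, is aborted to resolve a deadlock, and commits after B.
SCHEDULE = [
    ("begin", "A"),
    ("begin", "B"),
    ("abort", "A"),
    ("commit", "B"),
    ("commit", "A"),
]

reused = run_schedule(SCHEDULE, new_csn_on_retry=False)
fresh = run_schedule(SCHEDULE, new_csn_on_retry=True)

print(reused, missed_after_cookie(reused, 0))  # [2, 1] [1]
print(fresh, missed_after_cookie(fresh, 0))    # [2, 3] []
```

With the CSN reused, a consumer whose cookie was taken right after B committed would skip A's change; with a fresh CSN on retry, the commit order and the CSN order agree for this schedule.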

* The cn=Runqueue,cn=Threads,cn=Monitor entry in back-monitor lets you see what tasks are currently executing on the runqueue. Perhaps we should extend this to also show the idle tasks, and maybe provide a trigger to force an idle task to start executing immediately. Of course at present I can't think of any tasks that would need to be kicked manually...
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/