(ITS#5985) replication lockout with syncrepl



Full_Name: Quanah Gibson-Mount
Version: 2.3/2.4/HEAD
OS: Linux 2.6
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (75.111.29.239)


While testing OpenLDAP 2.3, I noticed that if a master receives a high rate of
changes and has three or more replicas, usually two replicas receive all of
the changes while the remaining replicas must wait until those two finish before
receiving any.  If the high rate of changes continues long enough, the waiting
replicas can fall so far out of sync that it is more efficient to reload them
than to wait for them to resync.  I discussed this with Howard, and after
reviewing the code, he sees an underlying design issue with updates that is
causing this.  His comments:

Once a thread for a psearch wakes up, it sends all the changes that were queued,
so it may hog an entire thread for a long time before the next psearch comes off
the queue.
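
To make the starvation pattern concrete, here is a toy model of it.  This is
not slapd code: simulate(), its parameters, and the tick-based timing are all
made up for illustration.  It assumes the master queues changes faster than one
worker thread can send them, so a psearch task that drains its whole queue
never gives its thread back.  The batch_limit option is shown only for
contrast; it is not a fix proposed in this report.

```python
from collections import deque


def simulate(replicas, workers, ticks, batch_limit=None, changes_per_tick=2):
    """Toy model: return how many changes each replica has received.

    Every tick the master queues `changes_per_tick` changes for each
    replica, and each worker thread sends one change for the psearch task
    it is running.  With batch_limit=None a task keeps its worker until its
    queue is empty (the behaviour described in the report).  With a limit,
    the task goes back to the run queue after sending that many changes.
    """
    pending = [0] * replicas             # changes queued per replica
    delivered = [0] * replicas           # changes sent per replica
    run_queue = deque(range(replicas))   # psearch tasks waiting for a worker
    active = {}                          # replica -> changes sent this turn

    for _ in range(ticks):
        # The master queues new changes for every replica.
        for r in range(replicas):
            pending[r] += changes_per_tick

        # Idle workers pick up waiting psearch tasks in FIFO order.
        while len(active) < workers and run_queue:
            active[run_queue.popleft()] = 0

        # Each running task sends one change, then may give up its worker.
        for r in list(active):
            if pending[r] > 0:
                pending[r] -= 1
                delivered[r] += 1
                active[r] += 1
            drained = pending[r] == 0
            yielded = batch_limit is not None and active[r] >= batch_limit
            if drained or yielded:
                del active[r]
                run_queue.append(r)

    return delivered


# Three replicas, two worker threads, twelve ticks of sustained load.
print(simulate(3, 2, 12))                 # [12, 12, 0]
print(simulate(3, 2, 12, batch_limit=2))  # [8, 8, 8]
```

In the first run the third replica receives nothing while the load lasts, which
matches the symptom above.  In the second run the same number of changes is
sent in total, but the replicas fall behind evenly instead of one being locked
out.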