
[Fwd: SyncRepl Problem 2.3.20 RefreshandPersist]





Hi All,
I've got OpenLDAP 2.3.20 running here with 3 slaves and one master. The slaves were initially using RefreshOnly; after
testing, one was moved to RefreshAndPersist, as the changes seemed to propagate faster.


What I'm seeing just now (and last night), with many users changing passwords, is that the changes seem to stall going out to the slaves,
with some changes not going through at all (sometimes in batches of 1-200),
even though changes before and after the stalled ones have propagated out.


Restarting the master seems to kick start it slightly.

I have a slave here which took a burst of 12 updates after the master was stopped / started, and it has now hung again for over 20 minutes.

Another slave, which I restarted after the master, received the same burst of 12 updates, then an additional 10 after I restarted the slave itself, but it has received
none of the 5 changes made in the last 10 minutes.


The last slave I switched back to RefreshOnly, and it has picked up all the changes.

The syncrepl config is as follows:

syncrepl        rid=2
              provider=ldap://master:389/
              binddn="replicate"
              bindmethod=simple
              credentials=#######
              searchbase="dc=example,dc=com"
              filter="(objectClass=*)"
              attrs="*"
              schemachecking=off
              scope=sub
              type=refreshAndPersist
              interval=00:00:00:15
              retry="60 10 300 +"
              starttls=critical

Any thoughts? I have one machine out of our LDAP round robin that I can test with, or turn up logging on, if needed.


[update]

The slave I had restarted just took 3 updates from the master, almost half an hour after its last update, skipping 36 entries in between that have successfully gone to the other slaves. This one was still using RefreshAndPersist.

I've now restarted that slave after changing its config to RefreshOnly. It and the other slaves have picked up the last 10+ changes since then fine.

I've written a script to pull all the users out of LDAP on the master and one of the slaves, and found 1200 differences, which I'm re-writing
to the slaves. When writing the differences back to the master, I occasionally get the following errors on the slaves.


Mar 21 14:27:22 ldap slapd[25755]: [ID 151447 local4.debug] syncrepl_del_nonpresent: be_delete uid=user,ou=People,dc=example,dc=com (0)
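
For anyone wanting to repeat the master/slave comparison mentioned above, here is a minimal Python sketch of the idea. It is not the actual script: the function name diff_entries, the snapshot layout (a dict of DN to attribute values) and the sample data are all assumptions for illustration, and DNs are compared as plain strings without normalisation.

```python
def diff_entries(master, slave, attrs=("userPassword",)):
    """Compare two {dn: {attr: [values]}} snapshots.

    Returns three sorted lists of DNs: missing on the slave,
    extra on the slave, and present on both but differing in `attrs`.
    """
    missing = sorted(dn for dn in master if dn not in slave)
    extra = sorted(dn for dn in slave if dn not in master)
    changed = sorted(
        dn for dn in master
        if dn in slave and any(
            # Attribute values are unordered in LDAP, so compare them sorted.
            sorted(master[dn].get(a, [])) != sorted(slave[dn].get(a, []))
            for a in attrs
        )
    )
    return missing, extra, changed


# Hypothetical snapshots; in practice these would be built from a dump
# of each server (for example, parsed ldapsearch output).
master = {
    "uid=alice,ou=People,dc=example,dc=com": {"userPassword": ["{SSHA}new"]},
    "uid=bob,ou=People,dc=example,dc=com": {"userPassword": ["{SSHA}same"]},
    "uid=carol,ou=People,dc=example,dc=com": {"userPassword": ["{SSHA}x"]},
}
slave = {
    "uid=alice,ou=People,dc=example,dc=com": {"userPassword": ["{SSHA}old"]},
    "uid=bob,ou=People,dc=example,dc=com": {"userPassword": ["{SSHA}same"]},
}

missing, extra, changed = diff_entries(master, slave)
print("missing on slave:", missing)
print("extra on slave:  ", extra)
print("differing:       ", changed)
```

Only userPassword is compared by default, since the changes in question are password changes; pass a different attrs tuple to compare more.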

The set of uids affected is different for each slave. I then get a

syncrepl_entry: be_add (0)

entry in the logs as opposed to the

syncrepl_entry: be_modify (0)

entry usually associated with the changes (these are exclusively password changes at the moment).
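
To get a feel for how often each of these messages shows up on a slave, a rough Python sketch like the one below could tally them. The regular expression is only an assumption based on the three message shapes quoted above (other slapd versions or log levels may format them differently), and the names LINE_RE and tally are made up for the example.

```python
import re
from collections import Counter

# Matches the syncrepl function, the backend operation, an optional DN and
# the result code, e.g. "syncrepl_entry: be_modify (0)" or
# "syncrepl_del_nonpresent: be_delete uid=...,dc=com (0)".
LINE_RE = re.compile(r"(syncrepl_\w+): (be_\w+)(?: (\S.*?))? \((\d+)\)\s*$")


def tally(lines):
    """Count backend operations and collect the DNs that were deleted."""
    ops = Counter()
    deleted = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not one of the syncrepl messages of interest
        _func, op, dn, _rc = m.groups()
        ops[op] += 1
        if op == "be_delete" and dn:
            deleted.append(dn)
    return ops, deleted


# Sample lines taken from the messages quoted in this post.
sample = [
    "Mar 21 14:27:22 ldap slapd[25755]: [ID 151447 local4.debug] "
    "syncrepl_del_nonpresent: be_delete uid=user,ou=People,dc=example,dc=com (0)",
    "syncrepl_entry: be_add (0)",
    "syncrepl_entry: be_modify (0)",
]

ops, deleted = tally(sample)
print(dict(ops))
print(deleted)
```

Comparing the list of deleted DNs across slaves would show whether the be_delete / be_add pairs really do hit a different set of uids on each one.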

Thanks,
        Duncan