
2-node MMR w/ delta-syncrepl setup, out of sync, best way to resync?



I have a new two-node MMR setup, "kil-ds-3" and "kil-ds-4". It turns out that I was missing syncprov on the cn=accesslog tree, which the guys on IRC helped me correct (thanks again, JoBbZ!). But even though I corrected that and syncrepl is working great now, kil-ds-4 is not discovering and replicating the ~15k changes made on kil-ds-3 while syncing was broken, even after restarting the server.

Howard pointed me at the -c command line argument for slapd, and I've given it a try with "slapd -c rid=002", as well as "slapd -c rid=002,sid=2,csn=0", and neither one causes the server to do a full resync, although the manpage says "-c rid=002" should be sufficient. Do the rules for that change in a mirrormode setup? Is the only real fix an "/etc/init.d/ldapd stop && rm -f ${DBDIR}/* && /etc/init.d/ldapd start"?


*Somewhat* related to that... we have an "updater" process that runs through the directory and changes departmental affiliations, organizational affiliations, entitlements, and so on. Most of the time only a few dozen records get touched, but of course about three times a year new students come in, more graduate, and so on... and on those landmark days I might see tens of thousands of entries getting updated in about 30 minutes.

In that kind of situation, what value should I use for olcSpSessionLog (syncprov-sessionlog)? If, for example, one side of the mirror were down during this update process, would I lose any replication or performance if the sessionlog overflowed (olcSpSessionLog = 10000, and 25k changes are made)? I'm assuming the recovering node would simply fall back to a "present phase" synchronization and syncing will just take a bit longer, and even then that will only happen if > 10k entries are *deleted* as opposed to modified. Am I understanding the process correctly?
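
To make the overflow scenario concrete, here is a toy model in Python of a bounded session log. It is only a sketch under stated assumptions, not slapd's actual implementation: the function name `resync_mode`, the integer sequence numbers standing in for CSNs, and the rule that every counted operation takes exactly one log slot are all hypothetical. Which operations syncprov actually records in the sessionlog depends on the version, so check slapo-syncprov(5) before relying on the numbers.

```python
from collections import deque


def resync_mode(sessionlog_size, changes_while_down):
    """Toy model of a bounded session log (not slapd's real logic).

    Returns "log" if the log still holds every change the consumer
    missed, or "present" if the oldest missed change was pushed out
    and a present-phase style fallback would be needed.
    """
    # A deque with maxlen silently drops the oldest item when full,
    # which mimics a fixed-size log overflowing.
    log = deque(maxlen=sessionlog_size)

    # Hypothetical sequence numbers 1..N stand in for the changes
    # made while the consumer was offline.
    for seq in range(1, changes_while_down + 1):
        log.append(seq)

    if changes_while_down == 0:
        return "log"  # nothing was missed, nothing to replay

    # The consumer needs change 1 onward; if that is gone, the log
    # can no longer describe everything that happened.
    if log and log[0] == 1:
        return "log"
    return "present"


if __name__ == "__main__":
    print(resync_mode(10000, 25000))
    print(resync_mode(10000, 500))
```

In this model the figures from the question (a log of 10000 and 25k counted changes) end up in the fallback branch, while a few hundred changes stay within the log. The model says nothing about how long the fallback takes on a real directory.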