
OpenLDAP syncrepl woes



I'm trying to stabilize our OpenLDAP server farm before going live and
am finding that, even though the contextCSN matches between providers
and replicas, the actual content of the servers is getting out of sync.
This is most prominent when we test our population routine, which
requires removing all accounts before it starts. Right now that is only
about 22000 entries (it will get much larger).

During the mass delete we got the following sprinkled throughout the
logs on all machines:
====
Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]: bdb(dc=domain,dc=name): previous transaction deadlock return not resolved
Nov 15 15:47:16 idm-prod-ldap-2 slapd[33070]: => bdb_idl_delete_key: cursor failed: Invalid argument (22)
====

Afterwards the various replicas still had accounts left over, and the
leftovers differed from one replica to the next.

Granted, the errors above might be explained by the machines not yet
having enough RAM. Still, it leaves us with a problem: when we notice a
discrepancy, how do we re-sync the data from the provider at run time?
I have tried slapd -c rid=2,csn=20111114000000.000000Z but that doesn't
seem to do any good. (I've tried several different csn values, such as
csn=0 and csn=20111114000000.000000Z#000000#000#000000, to no avail.)
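
To see which of the csn values above have the full four-field shape, a small check like the following may help. This is only a sketch: it assumes the timestamp#count#sid#mod layout seen in contextCSN values (hex fields of 6, 3 and 6 digits), the name parse_csn is made up for illustration, and it says nothing about which forms slapd -c itself will accept.

```python
import re
from datetime import datetime

# Assumed CSN layout: YYYYmmddHHMMSS.uuuuuuZ#count#sid#mod
# (count and mod as 6 hex digits, sid as 3 hex digits).
CSN_RE = re.compile(
    r"^(\d{14})\.(\d{6})Z"
    r"#([0-9a-fA-F]{6})#([0-9a-fA-F]{3})#([0-9a-fA-F]{6})$"
)


def parse_csn(value):
    """Return the CSN fields as a dict, or None if the value is not a full CSN."""
    match = CSN_RE.match(value)
    if match is None:
        return None
    stamp, micros, count, sid, mod = match.groups()
    try:
        when = datetime.strptime(stamp + "." + micros, "%Y%m%d%H%M%S.%f")
    except ValueError:
        # Digits in the right places but not a real date or time.
        return None
    return {
        "time": when,
        "count": int(count, 16),
        "sid": int(sid, 16),
        "mod": int(mod, 16),
    }


if __name__ == "__main__":
    for candidate in (
        "0",
        "20111114000000.000000Z",
        "20111114000000.000000Z#000000#000#000000",
    ):
        status = "full CSN" if parse_csn(candidate) else "not a full CSN"
        print(candidate, "->", status)
```

Of the three values tried, only the last one parses as a complete CSN under this assumption.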

I guess my question is twofold: how do I really verify that replication
is working properly and is in sync, and how do I force a replica to just
take the current content from a provider without question? (I don't
really want to remove the database and have it re-sync; I would rather
have it go through, check the content and update as needed.)
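
Since a matching contextCSN evidently does not guarantee matching content, one possible check is to compare the per-entry entryCSN values from each server. The sketch below assumes each server has been dumped to a file with something like "ldapsearch -x -LLL -H ldap://HOST -b dc=domain,dc=name entryCSN" (HOST is a placeholder, and the function names are hypothetical). It only reports differences; it does not repair anything, and it compares DNs as plain strings without normalization.

```python
import base64


def parse_entry_csns(ldif_text):
    """Map each DN to its entryCSN (or None) from ldapsearch -LLL style LDIF."""
    # Unfold LDIF continuation lines: a line starting with a single
    # space continues the previous line.
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    entries = {}
    dn = None
    for line in lines:
        if line.startswith("dn:: "):
            # A base64-encoded DN is written as "dn:: <base64>".
            dn = base64.b64decode(line[5:]).decode("utf-8")
            entries[dn] = None
        elif line.startswith("dn: "):
            dn = line[4:]
            entries[dn] = None
        elif line.lower().startswith("entrycsn: ") and dn is not None:
            entries[dn] = line.split(": ", 1)[1]
        elif not line.strip():
            dn = None  # a blank line ends the current entry
    return entries


def compare_replicas(provider, replica):
    """Report DNs missing from, extra on, or differing on the replica."""
    missing = sorted(dn for dn in provider if dn not in replica)
    extra = sorted(dn for dn in replica if dn not in provider)
    stale = sorted(
        dn for dn in provider
        if dn in replica and provider[dn] != replica[dn]
    )
    return {"missing": missing, "extra": extra, "stale": stale}


if __name__ == "__main__":
    import sys

    # Usage: python compare.py provider.ldif replica.ldif
    with open(sys.argv[1]) as f:
        provider = parse_entry_csns(f.read())
    with open(sys.argv[2]) as f:
        replica = parse_entry_csns(f.read())
    for kind, dns in compare_replicas(provider, replica).items():
        print(kind, len(dns))
        for dn in dns:
            print("  " + dn)
```

With about 22000 entries this fits comfortably in memory. An empty report on every replica would be stronger evidence of being in sync than contextCSN alone, although entries that are being modified while the dumps are taken can show up as false differences.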

Thanks
Jeffrey Crawford