MMR/contextCSN missing



Hi-
I’ve been working on this problem for a couple of days. The manual pages, the admin guide, the logs and I are now best buddies, but I am still stuck. Any help you can offer would be much appreciated.

I’m trying to set up a 4-host MMR cluster using OpenLDAP 2.4.39 (LTB build, running on Ubuntu 12.04). With the config below (which is the same on all hosts), I’m seeing peculiar behavior: all of the servers repeatedly attempt a full sync with each other, and they stay at a relatively high load as a result. The logs (at default level) show this pattern over and over:

Jul 29 10:40:55 eadrax slapd[3815]: conn=1000 op=1 SRCH base="dc=ccs,dc=neu,dc=edu" scope=2 deref=0 filter="(objectClass=*)"
Jul 29 10:40:55 eadrax slapd[3815]: conn=1000 op=1 SRCH attr=* +
Jul 29 10:40:56 eadrax slapd[3815]: conn=1000 op=1 SEARCH RESULT tag=101 err=0 nentries=16551 text=
Jul 29 10:40:56 eadrax slapd[3815]: conn=1000 op=2 SRCH base="dc=ccs,dc=neu,dc=edu" scope=2 deref=0 filter="(objectClass=*)"
Jul 29 10:40:56 eadrax slapd[3815]: conn=1000 op=2 SRCH attr=* +
Jul 29 10:40:58 eadrax slapd[3815]: conn=1000 op=2 SEARCH RESULT tag=101 err=0 nentries=16551 text=
...
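
For anyone who wants to quantify the repetition in an excerpt like the one above, here is a minimal sketch that counts completed searches per connection and result size. It assumes only the "SEARCH RESULT" line format shown in the log; the function name `count_search_results` and the `SAMPLE` string are made up for illustration and are not part of any OpenLDAP tooling.

```python
import re
from collections import Counter

# Matches the "SEARCH RESULT" lines as they appear in the excerpt above.
RESULT_RE = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+) SEARCH RESULT "
    r"tag=\d+ err=(?P<err>\d+) nentries=(?P<nentries>\d+)"
)


def count_search_results(lines):
    """Return a Counter mapping (conn, nentries) to the number of completed searches."""
    counts = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if match:
            key = (int(match.group("conn")), int(match.group("nentries")))
            counts[key] += 1
    return counts


# Illustrative input: the two SEARCH RESULT lines from the excerpt.
SAMPLE = """\
Jul 29 10:40:56 eadrax slapd[3815]: conn=1000 op=1 SEARCH RESULT tag=101 err=0 nentries=16551 text=
Jul 29 10:40:58 eadrax slapd[3815]: conn=1000 op=2 SEARCH RESULT tag=101 err=0 nentries=16551 text=
"""

if __name__ == "__main__":
    for (conn, nentries), n in sorted(count_search_results(SAMPLE.splitlines()).items()):
        print(f"conn={conn} nentries={nentries} searches={n}")
```

A single connection that keeps returning the same large nentries value, as conn=1000 does here, is what a repeated full search of the suffix looks like in these logs.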

If I delete the databases from a server and bring the server to a loglevel of 16384 (sync), I see the initial re-sync proceed as I would expect (all of the data is replicated), but then the full sync process appears to repeat, and the logs show entries like:

syncrepl_entry: rid=003 entry unchanged, ignored (...
and
dn_callback : entries have identical CSN ...

My first thought was to check the contextCSN on the servers, and indeed something is peculiar: both ldapsearch bound as the rootDN (while running) and slapcat (while at rest) show that there is no contextCSN attribute associated with the main database (there is one in cn=accesslog). I have confirmed with cn=monitor that the main database does indeed show the syncprov and syncrepl overlays loaded. I have changed log levels to see if the config files are being parsed ok (they are). I have changed values for syncprov-checkpoint. I have attempted to have just two of the four talk to each other to see if a simpler case would help illuminate what is going on, but to no avail. There are no weird errors in the log. At this point I don’t know what else to try.
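
The slapcat half of that check can be scripted. Below is a minimal sketch, assuming plain LDIF text as input (for example, saved slapcat output) in which contextCSN, when present, appears as an attribute of the suffix entry. The helper name `context_csns` and the sample LDIF, including its CSN values, are invented for illustration; base64-encoded DNs (`dn::`) are not handled.

```python
def context_csns(ldif_text, suffix):
    """Return the contextCSN values found on the entry whose DN equals `suffix`."""
    # LDIF folds long lines; a continuation line starts with a single space.
    unfolded = ldif_text.replace("\n ", "")
    values = []
    for block in unfolded.split("\n\n"):
        lines = [l for l in block.splitlines() if l and not l.startswith("#")]
        if not lines or not lines[0].lower().startswith("dn:"):
            continue
        dn = lines[0][3:].strip()
        if dn.lower() != suffix.lower():
            continue
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.lower() == "contextcsn":
                values.append(value.strip())
    return values


# Illustrative LDIF only; the attribute values are invented.
SAMPLE_LDIF = """\
dn: dc=ccs,dc=neu,dc=edu
objectClass: dcObject
objectClass: organization
dc: ccs
o: example
entryCSN: 20140729144055.000000Z#000000#001#000000

dn: cn=accesslog
objectClass: auditContainer
cn: accesslog
contextCSN: 20140729144055.000000Z#000000#001#000000
"""

if __name__ == "__main__":
    for suffix in ("dc=ccs,dc=neu,dc=edu", "cn=accesslog"):
        found = context_csns(SAMPLE_LDIF, suffix)
        print(suffix, "->", found if found else "no contextCSN")
```

With the sample input, the main suffix reports no contextCSN while cn=accesslog reports one, which mirrors the symptom described above.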

Here are the relevant sections from my configs. Do you spot anything untoward that might be causing this behavior?

=== slapd.conf excerpt:

# {serverN} is replaced with a real name in the configs
serverId 1 ldaps://{server1}.ccs.neu.edu:636/
serverId 2 ldaps://{server2}.ccs.neu.edu:636/
serverId 3 ldaps://{server3}.ccs.neu.edu:636/
serverId 4 ldaps://{server4}.ccs.neu.edu:636/

include /usr/local/openldap/etc/openldap/slapd.conf.acl

database        mdb
suffix          "dc=ccs,dc=neu,dc=edu"
rootdn          "XXXX"


include /usr/local/openldap/etc/openldap/slapd.conf.index
include /usr/local/openldap/etc/openldap/slapd.conf.replicas

# {repluser} is replaced with a real name in the actual configs
limits dn.exact=cn={repluser},dc=ccs,dc=neu,dc=edu
    time.soft=unlimited
    time.hard=unlimited
    size.soft=unlimited
    size.hard=unlimited

overlay syncprov
syncprov-checkpoint 100 10
syncprov-reloadhint FALSE
syncprov-nopresent  FALSE

overlay accesslog

logdb "cn=accesslog"
logops writes
logpurge 07+00:00 01+00:00
logsuccess TRUE

index reqstart eq

database mdb
suffix          "cn=accesslog"
rootdn          "XXXX"

index default eq
index entryCSN,entryUUID,objectClass,reqEnd,reqResult,reqStart

# {repluser} is replaced with a real name in the actual configs
limits dn.exact=cn={repluser},dc=ccs,dc=neu,dc=edu
        time.soft=unlimited
        time.hard=unlimited
        size.soft=unlimited
        size.hard=unlimited

overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 500
syncprov-reloadhint TRUE
syncprov-nopresent  TRUE

=== slapd.conf.acl excerpt:

access to *
    by dn=cn={repluser},dc=ccs,dc=neu,dc=edu read
    by * break

=== slapd.conf.replicas excerpt:

# {serverN} is replaced with a real name in the configs
syncrepl rid=001    provider="ldaps://{server1}.ccs.neu.edu:636/"
    searchbase="dc=ccs,dc=neu,dc=edu"
    syncdata="accesslog"
    logbase="cn=accesslog"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    bindmethod="sasl"
    saslmech="EXTERNAL"
    type="refreshAndPersist"
    retry="10 +"
    timeout="1"
    keepalive="180:3:60"
    network-timeout="10"
    schemachecking="on"
syncrepl rid=002    provider="ldaps://{server2}.ccs.neu.edu:636/"
    searchbase="dc=ccs,dc=neu,dc=edu"
    syncdata="accesslog"
    logbase="cn=accesslog"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    bindmethod="sasl"
    saslmech="EXTERNAL"
    type="refreshAndPersist"
    retry="10 +"
    timeout="1"
    keepalive="180:3:60"
    network-timeout="10"
    schemachecking="on"
… 
(other 2 hosts, same format)

=== slapd.conf.index:
index cn eq,sub
index entrycsn eq
index entryuuid eq
index mail sub
index member eq
index objectclass eq
index sn eq,sub
index uid eq,sub

Thanks for any help you can offer!

   — dNb