
syncrepl: large datasets and expediting consumer's initialization



I'm having trouble getting the consumer synced in a reasonable time. My tests with fewer than 20 entries in the datastore showed no problems.

But we have 260,000 inetOrgPerson entries, each with only a few attributes: uid, cn, sn, givenName, mail, userPassword.

I've set up syncrepl:

PROVIDER
# Indices to maintain for this database
index objectclass,entryCSN,entryUUID eq
index ou,cn,mail,surname,givenname      eq,sub
index uidNumber,gidNumber,loginShell    eq
index uid,memberUid                     eq,sub
index nisMapName,nisMapEntry            eq,sub

overlay syncprov
syncprov-checkpoint 100 1
syncprov-sessionlog 100

limits dn.children="ou=replicators,dc=service,dc=utoronto,dc=ca" size=unlimited time=unlimited

(I index attributes I'm not currently using. I presume that's not the problem.)

CONSUMER
syncrepl rid=123
    provider=ldap://PROVIDER:389
    type=refreshAndPersist
    interval=00:00:10:00
    retry="60 10 300 +"
    searchbase="dc=service,dc=utoronto,dc=ca"
    filter="(objectClass=*)"
    scope=sub
    schemachecking=off
    starttls=critical
    bindmethod=simple
    binddn="uid=replicator,ou=replicators,dc=service,dc=utoronto,dc=ca"

I've tried initializing the consumer both with and without slapcat/slapadd. On our slower system, slapadd took 98 minutes to rebuild the database; on the faster one it took 35 minutes (and I have only one consumer right now).

A full transfer via syncrepl is slow, about 10 entries per second:

# < /var/log/daemon  egrep 'bdb_add: added id=' | cut -b1-50 | uniq -c | tail -10
    10 Apr  1 10:57:31 ldap2 slapd[3126]: bdb_add: added
    11 Apr  1 10:57:32 ldap2 slapd[3126]: bdb_add: added
     9 Apr  1 10:57:33 ldap2 slapd[3126]: bdb_add: added
     9 Apr  1 10:57:34 ldap2 slapd[3126]: bdb_add: added
    10 Apr  1 10:57:35 ldap2 slapd[3126]: bdb_add: added
     9 Apr  1 10:57:36 ldap2 slapd[3126]: bdb_add: added
    10 Apr  1 10:57:37 ldap2 slapd[3126]: bdb_add: added
    10 Apr  1 10:57:38 ldap2 slapd[3126]: bdb_add: added
    10 Apr  1 10:57:39 ldap2 slapd[3126]: bdb_add: added
     5 Apr  1 10:57:40 ldap2 slapd[3126]: bdb_add: added

When the consumer starts from the database loaded with slapadd, syncrepl seems to review every entry, at about 15 entries per second:

# tail -10000 /var/log/daemon  | egrep 'entry unchanged' | cut -b1-83 | uniq -c
     8 Apr  1 11:30:57 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    15 Apr  1 11:30:58 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    14 Apr  1 11:30:59 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    15 Apr  1 11:31:00 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    15 Apr  1 11:31:01 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    15 Apr  1 11:31:02 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    14 Apr  1 11:31:03 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    14 Apr  1 11:31:04 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
    15 Apr  1 11:31:05 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored
     2 Apr  1 11:31:06 ldap2 slapd[3782]: syncrepl_entry: rid=123 entry unchanged, ignored

At that rate I can only hope it'll be done in 5 hours. The datastore isn't active, so the consumer is up to date for now, but this needless work is time-consuming.
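As a rough check on that figure, here is a small Python sketch of the arithmetic. It assumes the per-second rates in the two log excerpts above hold steady for all 260,000 entries, which is an assumption and not a measurement; the function and variable names are just for illustration.

```python
# Per-second counts copied from the two log excerpts above.
# The trailing sample of each excerpt (5 and 2) and the leading sample of the
# second excerpt (8) are dropped, since they likely cover only part of a second.
ADD_COUNTS = [10, 11, 9, 9, 10, 9, 10, 10, 10]       # 'bdb_add: added' lines
UNCHANGED_COUNTS = [15, 14, 15, 15, 15, 14, 14, 15]  # 'entry unchanged' lines
TOTAL_ENTRIES = 260_000


def estimate_hours(counts, total):
    """Estimate hours to process `total` entries at the mean observed rate."""
    rate = sum(counts) / len(counts)  # entries per second
    return total / rate / 3600


for label, counts in (
    ("full transfer via syncrepl", ADD_COUNTS),
    ("review after slapadd", UNCHANGED_COUNTS),
):
    hours = estimate_hours(counts, TOTAL_ENTRIES)
    print(f"{label}: about {hours:.1f} hours")
```

By this estimate the review pass lands close to the 5 hours mentioned above, and a full transfer at roughly 10 entries per second would take noticeably longer.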

Is this normal?
What happens when I restart the consumer? Is there any reason to expect it to be faster after a restart?

I changed one entry (adding displayName to my own entry) after the slapcat, so the consumer did not have that change when it started 90 minutes ago. The update has not yet propagated.

Can I prime the consumer's syncrepl cookie (if that's an appropriate term)? Is that a solution? And how would I do that?

Thanks for your time,

Paul