
Re: Less aggressive syncrepl ?



>
> Hello list,
>
> openldap-2.3.41
> db-4.2.52.NC-PLUS_5_PATCHES
> SunOS ldapmaster01.unix 5.10 Generic_127128-11 i86pc i386 i86pc
>
> We currently have 1 master and about 25 clients hanging off it, using
> syncrepl. Today we restarted the master for the first time in quite some
> time, to add an index we had forgotten. It was only added to the master.
>
> Initially, the master replies very fast to test-ldapsearch.
>
> But it appears that all 25 clients connect within the first 30 seconds or
> so and start the syncing process. This appears to take about 30 minutes of
> communicating back and forth (as observed with snoop/tcpdump).
>
> A simple command-line ldapsearch connects but never gets a reply. I haven't
> even started the software that talks to ldapmaster, so it is essentially
> doing nothing (just checking that everything is in sync; there should be no
> changes).
>
> This seems rather aggressive. I assume my syncrepl is set far too eagerly.
> Normally, syncrepl works beautifully, and updates are very fast across the
> board. But an hour-long lack of response from the master after a restart is
> undesirable.
>
> Can someone suggest better values for our syncrepl?
>
> Master has:
>
> lastmod         on
> checkpoint 128 15
> cachesize 10000
> overlay syncprov
> syncprov-checkpoint 100 10
> syncprov-sessionlog 100
>
>
> Slaves have: (RID is based on the last octet of the IP + 256)
>
> lastmod         on
> checkpoint 128 15
> cachesize 10000
> syncrepl   rid=279
>                  provider=ldap://172.20.12.113
>                  type=refreshAndPersist
>                  interval=00:00:00:30
>                  searchbase="dc=company,dc=com"
>                  filter="(objectClass=*)"
>                  attrs="*"
>                  scope=sub
>                  schemachecking=off
>                  updatedn="cn=admin,dc=company,dc=com"
>                  bindmethod=simple
>                  binddn="cn=admin,dc=company,dc=com"
>                  credentials="OurSecret"
>                  retry="60 10 300 +"
>
> # retry the connection every 60s for 10 attempts, then every 300s forever
> updateref       ldap://172.20.12.113

With 25 consumers all doing a full refresh at once, you have probably used up
every thread available on the provider.  You should either cascade your
consumers (build a replication chain in which a first layer of consumers acts
as providers for the rest), or increase the number of threads on the provider.
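
As a rough, untested sketch of both options, reusing the directives from your
own config: the value 32 and the intermediate host name below are made up for
illustration, and I am assuming the default pool size of 16 documented in
slapd.conf(5), so check the man page for your release.

```
# Option 1, on the master (global section of slapd.conf):
# enlarge the thread pool. 32 is an arbitrary example, not a tuned value.
threads 32

# Option 2, on an intermediate server (database section):
# keep consuming from the master, and also load syncprov so that this
# server can act as a provider for the remaining consumers.
syncrepl   rid=279
                 provider=ldap://172.20.12.113
                 type=refreshAndPersist
                 searchbase="dc=company,dc=com"
                 bindmethod=simple
                 binddn="cn=admin,dc=company,dc=com"
                 credentials="OurSecret"
                 retry="60 10 300 +"
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100

# On the remaining consumers, point syncrepl at the intermediate server
# instead of the master (hypothetical host name):
#   provider=ldap://ldaprelay01.example.com
```

With a few intermediate servers, only those would hit the master after a
restart, and each would carry a share of the remaining consumers.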

p.