
Re: syncrepl error (53) with 3-way delta-mmr (consumer state is newer than provider)



>>> Sven Mäder <maeder@phys.ethz.ch> wrote on 30.08.2017 at 14:51 in message
<2c527361-fd25-9002-1aa5-96ba00a69135@phys.ethz.ch>:
> Hi
> 
> We have a 3-way delta-mmr syncrepl setup (Debian Stretch with slapd
> 2.4.44+dfsg-5+deb9u1).
> 2 of those 3 hosts were powered off for about 4 hours. After the bootup and
> slapd start, the host which was running all the time during the downtime
> started to log:
> 
>     SEARCH RESULT tag=101 err=53 nentries=0 text=consumer state is newer than provider!
> 
> Purging the accesslog database fixed the issue.
> 
> Could this have happened due to a timesync problem? We noticed that right
> after boot, the ntpd service was oscillating in its time offset from 0.0192
> to 0.0003 for ~3 minutes.
> 
> Does somebody have experience with this?
> 
> Do we need to delay slapd or force an `ntpdate` before slapd starts in
> the boot process?
> Because slapd has the following LSB headers in the init script
> 
>     # Required-Start:    $remote_fs $network $syslog
> 
> it is started (using systemd service file autogenerated from init.d script)
> right after network.target has been reached and simultaneously with ntpd.
> Whereas slapd only takes about 1 second to start, ntpd takes about 10
> seconds and it might even take much longer to get the time in sync.

Hi!

Part of the time ntpd needs to sync may be spent on host name resolution (if
you configure servers by name). Methods to speed up initial synchronization
include "iburst", "minpoll" and configuring a larger number of servers. Note
that reducing "minpoll" could reduce the final accuracy (just as increasing
"maxpoll" does). Depending on your network and load I would not rely on a time
offset of less than a few tens of milliseconds. How well LDAP can operate then
is a different question.
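
As a rough sketch of the options mentioned above, an ntp.conf fragment could
look like the following. The server names are placeholders (not taken from
your setup), and the "minpoll 4" value is only an illustration, not a
recommendation:

```
# /etc/ntp.conf fragment (illustrative; server names are placeholders)
# "iburst" sends a burst of packets while the server is still unreachable,
# which shortens the initial synchronization.
server ntp1.example.org iburst
server ntp2.example.org iburst
server ntp3.example.org iburst
# minpoll/maxpoll are given as log2 of seconds; the ntpd defaults are
# 6 (64 s) and 10 (1024 s). A lower minpoll polls more often.
server ntp4.example.org iburst minpoll 4
```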

Also note that with NTP on Linux (on most platforms), one problem is that the
frequency correction needed for the clock can vary significantly between boots;
thus the time needed to reach "perfect sync" can be quite long. See attached
image for an example.

Updating one entry on different servers within a very short time (shorter than
the time of syncing) will probably cause trouble. What real-life situation
causes that?
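
To illustrate why clock offsets matter here: the replication state is tracked
in CSNs of the form "timestamp#count#SID#mod", so a server whose clock runs
ahead can present a state that looks newer than what the provider has recorded
for the same SID. The following Python sketch is only a simplified model of
that comparison, not slapd's actual code; the function names and the CSN
values are made up for the example:

```python
from datetime import datetime, timezone
from typing import Dict, List, NamedTuple


class CSN(NamedTuple):
    timestamp: datetime  # UTC time with microseconds
    count: int           # change counter within the same timestamp
    sid: int             # server ID of the originating server
    mod: int             # modification number


def parse_csn(text: str) -> CSN:
    """Parse a CSN such as 20170830084500.019200Z#000000#002#000000."""
    stamp, count, sid, mod = text.split("#")
    ts = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ").replace(tzinfo=timezone.utc)
    # count, sid and mod are written as hexadecimal fields
    return CSN(ts, int(count, 16), int(sid, 16), int(mod, 16))


def newer_sids(consumer: List[str], provider: List[str]) -> List[int]:
    """Return the SIDs for which the consumer's CSN is ahead of the provider's."""
    latest: Dict[int, CSN] = {}
    for csn in map(parse_csn, provider):
        if csn.sid not in latest or csn > latest[csn.sid]:
            latest[csn.sid] = csn
    ahead = {c.sid for c in map(parse_csn, consumer)
             if c.sid in latest and c > latest[c.sid]}
    return sorted(ahead)


# Hypothetical values: for SID 002 the consumer is about half a second ahead.
provider = ["20170830084501.000000Z#000000#001#000000",
            "20170830084459.500000Z#000000#002#000000"]
consumer = ["20170830084501.000000Z#000000#001#000000",
            "20170830084500.019200Z#000000#002#000000"]

print(newer_sids(consumer, provider))
```

In this model, any SID reported by newer_sids() corresponds to the situation
the provider complains about with "consumer state is newer than provider!".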

Regards,
Ulrich

> 
> Kind regards
> 
> -- Sven Mäder IT Services Group Physics Department, ETH Zurich



Attachment: h01-yearly-freq.png
Description: PNG image