Re: delta syncrepl error: "text=sync cookie is stale"



* Howard Chu <hyc@symas.com> [20060504 14:13]:
> Ben Poliakoff wrote:
> >Somehow the contextCSN of my accesslog wasn't in sync with the
> >contextCSN of my primary db on master or my replica; master and replica
> >agreed on the contextCSN for the primary db, but the contextCSN for
> >the accesslog on my master was off.  This explains the symptoms I was
> >seeing: I could sync the entire primary db to my slave, but trouble
> >ensued once the slave started to look at the accesslog (due to the
> >conflicting contextCSNs).
> >
> >Once I got the contextCSNs in sync (by making a trivial modification to
> >an entry in my primary db) things got a lot happier.  I need to figure
> >out how I managed to get the contextCSNs out of sync in the first place.
> >  
> 
> In your first post you mentioned that you had just freshly loaded the 
> main DB. If you did that using slapadd, from an LDIF that didn't include 
> the entryCSN operational attribute, then that means the main DB had to 
> generate all new CSNs for everything. That would certainly cause it to 
> get way off from the log DB.
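
[One way to check an LDIF for this before running slapadd is sketched
below. It is a rough illustration, not an OpenLDAP tool: the function
name is hypothetical, and it assumes a plain LDIF with entries separated
by blank lines and no folded (continuation) lines. It lists the DNs of
entries that carry no entryCSN.]

```python
import re


def entries_missing_entrycsn(ldif_text):
    """Return the DNs of LDIF entries that have no entryCSN attribute.

    Simplifying assumptions: entries are separated by blank lines and
    attribute lines are not folded. Base64 DNs ("dn:: ...") are returned
    in their encoded form.
    """
    missing = []
    text = ldif_text.replace("\r\n", "\n")
    for block in re.split(r"\n\s*\n", text.strip()):
        # Drop comment lines and empty lines inside the block.
        lines = [l for l in block.splitlines() if l and not l.startswith("#")]
        dn = None
        has_csn = False
        for line in lines:
            name, _, value = line.partition(":")
            name = name.strip().lower()
            if name == "dn":
                dn = value.lstrip(":").strip()
            elif name == "entrycsn":
                has_csn = True
        # Blocks without a dn (e.g. a "version: 1" header) are skipped.
        if dn is not None and not has_csn:
            missing.append(dn)
    return missing


if __name__ == "__main__":
    sample = (
        "dn: dc=example,dc=com\n"
        "objectClass: domain\n"
        "entryCSN: 20060504141300Z#000000#00#000000\n"
        "\n"
        "dn: ou=people,dc=example,dc=com\n"
        "objectClass: organizationalUnit\n"
    )
    for dn in entries_missing_entrycsn(sample):
        print("no entryCSN:", dn)
```

[An empty result means every entry in the file already has an entryCSN,
so the load should not need to generate new ones.]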

I had been reloading the DB, but I reloaded both the main and the
accesslog DB.  I should note that neither of the DBs was loaded from
LDIF that had contextCSNs (although they definitely did have entryCSNs).
It is possible that the problem stemmed from a slapd segfault
that occurred at one point yesterday:

    slapd[28896]: segfault at 0000000000000364 rip 000000005583dd89 rsp 00000000aa03c71c error 4

I only saw the above segfault later on the console of the server,
after I'd already restarted the master slapd.  When I restarted the
master slapd I did notice that it was doing an autorecovery of the BDB.
Unfortunately I don't have any further information about the segfault.

At any rate, since then I've rebuilt 2.3.21 with the patches that Quanah
pointed me to and I've made sure that my contextCSNs for the main DB,
the accesslog and the replicas all agree.  So far it's all running
without a hitch (now I just need to start doing some load testing).
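
For anyone wanting to repeat the "do all the contextCSNs agree" check,
here is a small sketch.  It assumes the contextCSN values have already
been read from each database (for instance with a base-scope search for
the contextCSN attribute on each suffix) and simply groups the sources
by value.  The function names, source labels and CSN values are all
made up for illustration.

```python
def group_by_context_csn(csn_by_source):
    """Group source names by the contextCSN value each one reports.

    csn_by_source maps a label such as "master main db" to the
    contextCSN string read from that database.
    """
    groups = {}
    for source, csn in csn_by_source.items():
        groups.setdefault(csn.strip(), []).append(source)
    return groups


def context_csns_agree(csn_by_source):
    """True when every source reports the same contextCSN."""
    return len(group_by_context_csn(csn_by_source)) <= 1


if __name__ == "__main__":
    # Hypothetical values: the accesslog on the master lags behind.
    observed = {
        "master main db": "20060504211300Z#000001#00#000000",
        "master accesslog": "20060503180000Z#000000#00#000000",
        "replica main db": "20060504211300Z#000001#00#000000",
    }
    if context_csns_agree(observed):
        print("all contextCSNs agree")
    else:
        for csn, sources in group_by_context_csn(observed).items():
            print(csn, "<-", ", ".join(sources))
```

With the made-up values above, the accesslog ends up in a group of its
own, which is the situation I was in before making the trivial
modification on the master.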

> >Thanks again for your help, it was right on the mark.  
> >
> >And once again as someone new to syncrepl and delta syncrepl, I just
> >have to say "Wow!".  Delta syncrepl is *very* cool both in design and in
> >practice.  Updates to slaves are exceedingly fast.  Thanks to all the
> >devs and bug reporters for bringing OpenLDAP to where it is today.
> >  
> 
> Anything to get rid of slurpd.... 

Hah! :)

Thanks for your insights.

Ben