
Re: (ITS#4622) syncrepl operations incomplete when consumer restarted



Halbritter, Matthias wrote:
> I tested the changes to syncrepl with the recently released 2.3.25. Here are
> my results:
>
> 1) restarting consumer during refresh phase: consumer resumes the refresh:
> OK(resolved)
> 2) restarting provider during refresh phase: consumer resumes the refresh:
> OK(resolved), unless entryCSN is not indexed: in this case the consumer
> seems to be trapped in some kind of loop searching its entire database
> repeatedly for (?):
>   
> This search seems to go on forever. At the end of the database tree the
> search simply starts all over again with rootdn. If this is expected
> behaviour, there should be a hint in the documentation that it's not only
> advisable to index entryCSN & entryUUID for better performance but may also
> be vital for keeping databases synchronized with syncrepl.
>   

It's not an endless loop, but it is an exhaustive search of the 
database, probably repeated once for every entry. With n entries and no 
index, that is n full scans, so O(n^2) time overall. The documentation 
already recommends the indexing. The search will not go on forever, 
although it may not complete in an acceptable time frame. Indexing is 
only an optimization, but in a case like this one it is clearly worth 
having.
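
To make the O(n^2) estimate above concrete, here is a toy model in Python. It is only a sketch of the cost argument, not slapd's actual code: the entry layout, the function names, and the comparison counters are all assumptions made for illustration.

```python
# Toy model: resolve every entry by UUID, with and without an index.
# Nothing here is slapd code; names and data layout are hypothetical.

def unindexed_lookups(entries, wanted_uuids):
    """Resolve each UUID by testing every entry; returns (hits, comparisons)."""
    hits = 0
    comparisons = 0
    for uuid in wanted_uuids:
        # Without an index, every entry is a candidate and must be tested.
        for entry in entries:
            comparisons += 1
            if entry["entryUUID"] == uuid:
                hits += 1
    return hits, comparisons


def indexed_lookups(entries, wanted_uuids):
    """Resolve each UUID through a prebuilt equality index; returns (hits, probes)."""
    # Stand-in for an equality index: one dictionary probe per lookup.
    index = {entry["entryUUID"]: entry for entry in entries}
    hits = 0
    probes = 0
    for uuid in wanted_uuids:
        probes += 1
        if uuid in index:
            hits += 1
    return hits, probes


# Hypothetical database of 1000 entries, each looked up once.
entries = [
    {"dn": f"uid=user{i},dc=example,dc=com", "entryUUID": f"uuid-{i}"}
    for i in range(1000)
]
wanted = [entry["entryUUID"] for entry in entries]

unindexed_hits, unindexed_cost = unindexed_lookups(entries, wanted)
indexed_hits, indexed_cost = indexed_lookups(entries, wanted)
```

In this model 1000 entries cost 1,000,000 comparisons without the index and 1000 probes with it. In slapd.conf the corresponding setting is the directive `index entryCSN,entryUUID eq` in the database section.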

> 3) restarting provider during persist phase while sync operations are being
> passed on from provider to consumer: consumer resumes sync operations after
> provider restart: OK
> 4) restarting consumer during persist phase while sync operations are being
> passed on from provider to consumer: it works for adding entries, not for
> deleting entries, though: only the transmitted changes up to the
> consumer-stop are processed; the remaining changes seem "lost".
> How to reproduce: provider and consumer hold synchronized databases;
> syncrepl runs in refreshAndPersist mode; start deleting a branch in the
> provider's database; while the provider is still busy deleting the dn and
> all its children, restart the consumer.
> Can that be resolved as well?
>   

I've reproduced this situation a couple of times, but not yet 
consistently. It will take some more time to find out what's going wrong.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/