
Re: (ITS#4622) syncrepl operations incomplete when consumer restarted



> This is now fixed in CVS HEAD, see the recent patches for
> slapd/syncrepl.c and overlays/syncprov.c. You probably don't need the
> provider-side patch in your current configuration, it's only relevant
> for cascading.

Sorry for not having checked sooner (I was very busy at work), even though
changes were made to the source code in almost no time (thanks). In the
meantime issue #4622 was closed.

I tested the changes to syncrepl with the recently released 2.3.25. Here are
my results:

1) restarting consumer during refresh phase: consumer resumes the refresh:
OK (resolved)
2) restarting provider during refresh phase: consumer resumes the refresh:
OK (resolved), unless entryCSN is not indexed. In that case the consumer
seems to be trapped in some kind of loop, repeatedly searching its entire
database (for what, I can't tell):

--snip--
syncrepl_entry: LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
=> bdb_search
bdb_dn2entry("o=example")
search_candidates: base="o=example" (0x00000001) scope=2
=> bdb_dn2idl("o=example")
=> bdb_filter_candidates
	AND
=> bdb_list_candidates 0xa0
=> bdb_filter_candidates
	EQUALITY
=> bdb_equality_candidates (entryUUID)
<= bdb_equality_candidates: (entryUUID) index_param failed (18)
<= bdb_filter_candidates: id=-1 first=1 last=352
<= bdb_list_candidates: id=-1 first=1 last=352
<= bdb_filter_candidates: id=-1 first=1 last=352
bdb_search_candidates: id=-1 first=1 last=352
=> test_filter
    EQUALITY
=> access_allowed: search access to "o=example" "entryUUID" requested
<= root access granted
<= test_filter 6
=> bdb_dn2id_children("o=example")
<= bdb_dn2id_children("o=example"):  (0)
entry_decode: "ou=nbg,o=example"
<= entry_decode(ou=nbg,o=example)
=> bdb_dn2id("ou=nbg,o=example")
<= bdb_dn2id: got id=0x00000002
=> test_filter
    EQUALITY
=> access_allowed: search access to "ou=nbg,o=example" "entryUUID" requested
<= root access granted
<= test_filter 5
bdb_search: 2 does not match filter
entry_decode: "ou=muc,o=example"
<= entry_decode(ou=muc,o=example)
=> bdb_dn2id("ou=muc,o=example")
<= bdb_dn2id: got id=0x00000003
=> test_filter
    EQUALITY
--snap--

This search seems to go on forever. When it reaches the end of the database
tree, it simply starts all over again at the root entry (o=example). If this
is expected behaviour, there should be a hint in the documentation that
indexing entryCSN and entryUUID is not only advisable for better performance
but may also be vital for keeping databases synchronized with syncrepl.
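
For reference, a minimal slapd.conf sketch of the indexing meant here. It
assumes a bdb database with the suffix o=example as in the log above; the
directory path is only a placeholder and has to be adjusted to the actual
setup:

--snip--
database	bdb
suffix		"o=example"
# placeholder path, not taken from my configuration
directory	/var/lib/ldap
# equality indexes on the operational attributes syncrepl searches on
index		entryCSN,entryUUID	eq
--snap--

If the index line is added to a database that already holds entries, the
indexes have to be generated with slapindex while slapd is stopped.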

3) restarting provider during persist phase while sync operations are being
passed on from provider to consumer: consumer resumes sync operations after
provider restart: OK
4) restarting consumer during persist phase while sync operations are being
passed on from provider to consumer: this works for added entries, but not
for deleted ones. Only the changes transmitted before the consumer stopped
are processed; the remaining changes seem to be "lost".
How to reproduce: provider and consumer hold synchronized databases;
syncrepl runs in refreshAndPersist mode; start deleting a branch in the
provider's database; while the provider is still busy deleting the DN and
all its children, restart the consumer.
Can that be resolved as well?


Matthias Halbritter
eMail: matthias.halbritter@lfst.bayern.de