[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#5432) syncrepl seg. faults if received a sync cookie with an empty csn= value

On Wed, 19 Mar 2008, Howard Chu wrote:

>> syncprov_playlog() in syncprov.c generates such invalid cookies when
>> replaying a
>> sessionlog where no entries has been deleted.  The patch also fixes this
>> bug.
> This should never happen, since the cookie is not generated if there were no
> deletes recorded.

Well, it do happen anyhow.  Using the 2.4.8 release this can easily be
reproduced with a configuration where the syncrepl provider (with
sessionlogs enabled) is on a glue backend, the consumer replicates the
entire glue subtree.  Assuming both servers are in sync and stopped,
start the producer, modify an entry in the subordinate db and then start
the consumer.  The producer will send a cookie with an empty csn= value
as in this sync debug output:

slapd starting
slap_queue_csn: queing 0x418001a0 20080320125116.487494Z#000000#000#000000
slap_graduate_commit_csn: removing 0x8d6ae0 20080320125116.487494Z#000000#000#000000
srs csn 20080320124421.642686Z#000000#000#000000
log csn 20080320125116.487494Z#000000#000#000000
syncprov_playlog: cookie=rid=007,csn=
syncprov_search_response: cookie=rid=007,csn=20080320125116.487494Z#000000#000#000000

I have also tested it using the cvs head source, and the good news is
that this was accidentally fixed (at least partly) by the ITS#5434
change. I.e it was caused by the be_search() in syncprov_playlog() not
being propagated to the subordinate database and thereby not finding the
entry that was modified.

The bad news is that there is still a race condition that would have the
same effect.  If someone deletes the entry that was modified just before
be_search() is done it will return the same empty cookie.  I was able to
reproduce this in a debugger by setting a break point before be_search(),
deleting the entry with ldapdelete and in the debugger run next over the
be_search() call before continuing.

So, I still think it is best to keep delcsn[0].bv_val NULL until an
actual cookie has been written to it, as slap_compose_sync_cookie() will
create a cookie with an empty csn= if passed a csn array where bv_len is
zero but bv_val is non-NULL.  Although an alternative and maybe more
correct fix could be to change slap_compose_sync_cookie() instead?