Re: (ITS#5454) syncrepl refreshAndPersist stops receiving



rein@basefarm.no wrote:
> Hm, I hope I have found the race condition that causes this :-)  I'm now
> running with the patch at the end to see if that solves it, only time will
> tell..
>
> The race is that between the time selecting on the syncrepl socket is
> enabled by the call to connection_client_enable() and the release of the
> si_mutex a new message may arrive.  If so, the next call to do_syncrepl
> may fail in its attempt to trylock the mutex and no-one will re-enable
> selecting on it again.  My patch delays enabling of the socket until the
> mutex has been released, which looks safe to me.  Or can the access to
> si->si_conn without a lock be a problem?

How about just moving the connection_client_enable() call to after the
runqueue manipulation is done?

We just need to be sure that do_syncrepl() isn't entered again before
si->si_conn gets initialized.
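
To make the ordering problem concrete, here is a minimal single-threaded
model of the race described above. It is only a sketch, not slapd code:
struct sync_sim, message_arrives() and selecting_after_handler() are
hypothetical names, the "selecting" flag stands in for whatever
connection_client_enable() turns on, and it assumes the listener stops
selecting on the socket when it dispatches an event.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the syncrepl state; not the real syncinfo_t. */
struct sync_sim {
    pthread_mutex_t mutex;  /* plays the role of si_mutex */
    bool selecting;         /* is the listener watching the socket? */
    int handled;            /* messages that were actually processed */
};

/* Models a message arriving on the socket. Assumption: the listener
 * stops selecting when it dispatches, and only a handler that got the
 * mutex turns selecting back on. */
static void message_arrives(struct sync_sim *s)
{
    if (!s->selecting)
        return;                         /* nobody is watching the socket */
    s->selecting = false;
    if (pthread_mutex_trylock(&s->mutex) != 0)
        return;                         /* busy: nothing re-enables selecting */
    s->handled++;
    pthread_mutex_unlock(&s->mutex);
    s->selecting = true;                /* enable only after the unlock */
}

/* Runs one handler pass with a message landing right after the enable.
 * Returns whether the socket is still being selected on afterwards. */
static bool selecting_after_handler(bool enable_before_unlock)
{
    struct sync_sim s = { .selecting = false, .handled = 0 };
    int rc = pthread_mutex_init(&s.mutex, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&s.mutex);       /* handler holds the mutex */
    if (enable_before_unlock) {
        s.selecting = true;             /* enable while still locked */
        message_arrives(&s);            /* lands inside the window */
        pthread_mutex_unlock(&s.mutex);
    } else {
        pthread_mutex_unlock(&s.mutex);
        s.selecting = true;             /* enable after the unlock */
        message_arrives(&s);            /* trylock now succeeds */
    }

    bool still_selecting = s.selecting;
    pthread_mutex_destroy(&s.mutex);
    return still_selecting;
}
```

In this model, selecting_after_handler(true) ends with selecting switched
off and nothing left to switch it back on, which matches the reported
stall; selecting_after_handler(false) leaves the socket enabled.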

It also occurs to me that we probably don't even need to manipulate the slapd 
runqueue in persist mode, when si->si_conn is already set. I.e., in that case 
we can only have gotten here because of a listener event, and not because of
a runqueue schedule.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/