Re: (ITS#6059) Abandon syncprov race condition?



rein@OpenLDAP.org writes:
> I've had two cases where a delete operation was performed on the
> master without being replicated to its consumers, which so far appear
> to be cases of possible connection lost (abandon) race conditions.

Not sure if this is the problem, but it is ugly: slapd/cancel.c sets
o_abandon with op->o_conn->c_mutex locked, but does not set o_cancel
until after the mutex is unlocked.  It looks like that gives slapd a
chance to react to o_abandon before it "knows" that the abandon is
actually a cancel.
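
To make the ordering concern concrete, here is a minimal sketch, not the
actual slapd code.  conn_model, op_model and the three functions are
hypothetical stand-ins; only the field names c_mutex, o_conn, o_abandon
and o_cancel come from the discussion above, and o_cancel is reduced to
a boolean here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for slapd's connection and
 * operation structures; not the real definitions. */
typedef struct conn_model {
    pthread_mutex_t c_mutex;
} conn_model;

typedef struct op_model {
    conn_model *o_conn;
    int o_abandon;  /* nonzero: the operation should stop */
    int o_cancel;   /* nonzero: the abandon is really a cancel */
} op_model;

/* Ordering as described above: o_abandon is published under the lock,
 * o_cancel only after the lock has been released. */
static void mark_cancel_split(op_model *op)
{
    assert(op != NULL && op->o_conn != NULL);
    pthread_mutex_lock(&op->o_conn->c_mutex);
    op->o_abandon = 1;
    pthread_mutex_unlock(&op->o_conn->c_mutex);
    /* Window: another thread can observe o_abandon=1, o_cancel=0 here. */
    op->o_cancel = 1;
}

/* Possible alternative: publish both flags in one critical section. */
static void mark_cancel_together(op_model *op)
{
    assert(op != NULL && op->o_conn != NULL);
    pthread_mutex_lock(&op->o_conn->c_mutex);
    op->o_cancel = 1;
    op->o_abandon = 1;
    pthread_mutex_unlock(&op->o_conn->c_mutex);
}

/* Reader side: snapshot both flags under the same mutex.
 * Returns 0 = not abandoned, 1 = plain abandon, 2 = cancel. */
static int abandon_kind(op_model *op)
{
    int kind;
    assert(op != NULL && op->o_conn != NULL);
    pthread_mutex_lock(&op->o_conn->c_mutex);
    kind = !op->o_abandon ? 0 : (op->o_cancel ? 2 : 1);
    pthread_mutex_unlock(&op->o_conn->c_mutex);
    return kind;
}
```

With mark_cancel_together(), a reader that takes c_mutex sees either
neither flag or both.  With mark_cancel_split(), a reader running in the
marked window would classify a cancel as a plain abandon.  Whether
slapd's readers of o_abandon actually hold c_mutex is not something this
sketch claims.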

> Do we need some sort of o_committed flag that can be used to prevent
> o_abandon from being set or acted upon? Or handle o_abandon more like
> o_cancel, i.e. with multiple values, including "too late"?

o_cancel is a wrapper around o_abandon, turning result code
SLAPD_ABANDON into LDAP_TOO_LATE etc.  However slap_send_ldap_result()
and send_ldap_response() skip "if (op->o_callback) slap_response_play()"
if o_abandon is set, and "send" SLAPD_ABANDON instead of the result
code.  Can that work right?  The code reads as if SLAPD_ABANDON ought to
mean "nothing was done" right up until everything has had a chance to
react to the operation in the same way.
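
A toy model of the send path described above, to show why I doubt it.
This is a sketch under assumptions, not the slapd functions themselves:
op_send_model, record_cb, send_result_model and the MODEL_* codes are
all hypothetical, and MODEL_ABANDON merely stands in for SLAPD_ABANDON.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; MODEL_ABANDON stands in for SLAPD_ABANDON. */
enum { MODEL_SUCCESS = 0, MODEL_ABANDON = -1 };

typedef struct op_send_model op_send_model;
typedef void (*response_cb)(op_send_model *op, int rc);

struct op_send_model {
    int o_abandon;          /* set by abandon/cancel handling */
    response_cb o_callback; /* e.g. an overlay's response hook */
    int cb_calls;           /* how many times the callback ran */
    int cb_seen_rc;         /* last result code the callback saw */
};

/* Example callback: just records what it was told. */
static void record_cb(op_send_model *op, int rc)
{
    op->cb_calls++;
    op->cb_seen_rc = rc;
}

/* Models the behaviour described above: once o_abandon is set, the
 * callback is skipped and the abandon code replaces the real result. */
static int send_result_model(op_send_model *op, int rc)
{
    assert(op != NULL);
    if (op->o_abandon)
        return MODEL_ABANDON;
    if (op->o_callback)
        op->o_callback(op, rc);
    return rc;
}
```

If o_abandon gets set after the backend has already committed a change,
send_result_model() reports MODEL_ABANDON and record_cb() never hears
about the successful result.  In this model, that is how a callback
which depends on seeing the result could miss an operation that was in
fact carried out.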

-- 
Hallvard