Re: (ITS#6059) Abandon syncprov race condition?
> Full_Name: Rein Tollevik
> Version: 2.4.16
> OS: linux
> Submission from: (NULL) (220.127.116.11)
> Submitted by: rein
> I've had two cases where a delete operation was performed on the master without
> being replicated to its consumers, which so far appear to be cases of possible
> connection lost (abandon) race conditions. The log (level: stats) shows the
> "DEL" message of the entry, immediately followed by a "closed (connection lost)"
> message on the connection. Note: No "RESULT" message was logged.
> I haven't looked very much into this, but my theory so far is that syncprov
> skipped replicating the delete op after noticing the abandon resulting from
> losing the connection, even though the delete had already taken place in the
> local database. That it happened after a delete op might very well have been a
> coincidence; this possible race could exist after any modify op for all I know.
> Do we need some sort of o_committed flag that can be used to prevent o_abandon
> from being set or acted upon? Or handle o_abandon more like o_cancel, i.e. with
> multiple values, including "too late"?
No. What good can that do, since the connection has already been lost?
It doesn't matter if syncprov fails to send an update to a consumer - the
consumer's cookie state will let it pick up where it left off when it reconnects.
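
As a rough illustration of why a missed send need not lose the update, here is a toy model of cookie-based catch-up. It is only a sketch under simplifying assumptions: the Provider and Consumer classes, the integer change numbers, and the full in-memory change log are all hypothetical and are not how syncprov is implemented (real syncrepl cookies carry CSNs, not a simple counter).

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Toy stand-in for a master; hypothetical, not syncprov's structures."""
    csn: int = 0                                 # simple increasing change number
    entries: set = field(default_factory=set)    # DNs currently present
    log: list = field(default_factory=list)      # (csn, op, dn) for every write

    def write(self, op, dn):
        """Commit a write locally and record it, whether or not it is pushed."""
        self.csn += 1
        if op == "add":
            self.entries.add(dn)
        else:
            self.entries.discard(dn)
        change = (self.csn, op, dn)
        self.log.append(change)
        return change

    def changes_since(self, cookie):
        """Return every change newer than the consumer's cookie."""
        return [change for change in self.log if change[0] > cookie]


@dataclass
class Consumer:
    """Toy replica that remembers the last change it applied."""
    cookie: int = 0
    entries: set = field(default_factory=set)

    def apply(self, change):
        csn, op, dn = change
        if op == "add":
            self.entries.add(dn)
        else:
            self.entries.discard(dn)
        self.cookie = csn                        # advance only after applying

    def reconnect(self, provider):
        """Present the cookie and apply whatever was missed."""
        for change in provider.changes_since(self.cookie):
            self.apply(change)


def demo():
    provider, consumer = Provider(), Consumer()
    consumer.apply(provider.write("add", "cn=a"))
    consumer.apply(provider.write("add", "cn=b"))
    # The delete is committed on the provider, but the push never happens.
    provider.write("delete", "cn=a")
    stale = set(consumer.entries)
    consumer.reconnect(provider)
    return stale, consumer.entries, consumer.cookie
```

In this model demo() leaves the consumer holding the deleted entry only until reconnect() runs, after which its contents match the provider's again. The model says nothing about a consumer whose connection stays up and which never gets the update.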
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/