
(ITS#6138) Bad Cancel/Abandon/"internal abandon"/Syncprov interactions



Full_Name: Hallvard B Furuseth
Version: HEAD
OS: 
URL: 
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard


Might make this a tracking issue for Cancel/Abandon problems.
And/or copy some to separate ITSes, but they all seem interconnected:


Cancel(abandoned operation) requests are not rejected.  Thus slapd
sets o_cancel and turns an active Abandon into a Cancel.  Presumably
that can confuse Cancel/Abandon handlers, such as the one in Syncprov.

Similarly, Abandon(abandoned/cancelled operation) is not ignored.
connection_abandon() re-abandons abandoned/cancelled operations too.
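
A minimal sketch of the two checks that seem to be missing, not slapd
code: struct op_state, sketch_cancel() and sketch_abandon() are
hypothetical names, and the two booleans only model "o_abandon != 0"
and "o_cancel != 0".  That Cancel marks both flags is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of an operation. */
struct op_state {
    int  msgid;
    bool abandoned;   /* models o_abandon != 0 */
    bool cancelled;   /* models o_cancel != 0 */
};

/* LDAP result codes used by the sketch. */
enum { RC_SUCCESS = 0, RC_PROTOCOL_ERROR = 2 };

/* Cancel: reject a target that is already abandoned or cancelled,
 * instead of turning the earlier Abandon into a Cancel. */
int sketch_cancel(struct op_state *target)
{
    if (target->abandoned || target->cancelled)
        return RC_PROTOCOL_ERROR;   /* "already abandoned" */
    target->cancelled = true;
    target->abandoned = true;       /* assumption: Cancel implies abandon */
    return RC_SUCCESS;
}

/* Abandon: has no response, so a repeated request is simply ignored.
 * Returns true only if this call changed the target's state. */
bool sketch_abandon(struct op_state *target)
{
    if (target->abandoned || target->cancelled)
        return false;               /* do not re-abandon */
    target->abandoned = true;
    return true;
}
```

With these checks, Abandon followed by Cancel on the same target
leaves o_cancel unset, and a second Abandon is a no-op.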


However, Syncprov:RefreshAndPersist abuses op->o_abandon: it sets it
to mean "Suppress the response.  A copy of this operation will send
it."  So if Cancel(op with o_abandon!=0) is fixed to respond
protocolError "already abandoned", presumably that breaks
Cancel(RefreshAndPersist).

I'm not touching it with a flagpole - I don't know syncprov.  Help?

Overlay retcode does something similar - sends a response and then sets
o_abandon.

In any case, Cancel/Abandon can fail by targeting the wrong operation:
a connection can hold several operations with the same message ID when
slapd has sent the response and the client reuses the message ID
before the old operation in slapd has cleaned up and finished.

syncprov_op_abandon() identifies messages by (connid, msgid).  Can
multiple messages with the same ID break that, or more importantly,
break what gets sent/written with syncrepl?
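
To illustrate the ambiguity, here is a small sketch, not syncprov
code: struct pending_op, find_by_ids() and count_by_ids() are
hypothetical, and the response_sent field is an assumption used only
to tell the two entries apart.

```c
#include <stddef.h>

/* Hypothetical record of an operation still known to the server. */
struct pending_op {
    unsigned long connid;
    int           msgid;
    int           response_sent;  /* 1: response sent, cleanup pending */
};

/* Return the first operation matching (connid, msgid), or NULL. */
const struct pending_op *find_by_ids(const struct pending_op *ops,
                                     size_t n,
                                     unsigned long connid, int msgid)
{
    for (size_t i = 0; i < n; i++)
        if (ops[i].connid == connid && ops[i].msgid == msgid)
            return &ops[i];
    return NULL;
}

/* Count how many operations share the same (connid, msgid) pair. */
size_t count_by_ids(const struct pending_op *ops, size_t n,
                    unsigned long connid, int msgid)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (ops[i].connid == connid && ops[i].msgid == msgid)
            hits++;
    return hits;
}
```

If the list holds an old operation (7, 3) that has already responded
and a new one that reuses (7, 3), count_by_ids() returns 2 and
find_by_ids() returns whichever comes first, which may be the old one.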


Anyway, the current code needs an o_abandon value which means "suppress
response".  Or maybe "abandoned, except as far as future Abandon and
Cancel operations are concerned".  Syncprov and Retcode need to handle
the various possible orderings of Cancel/Abandon vs Suppress, including
when they forward o_abandon from one operation to another.  I haven't
looked too closely at that code either.
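
One way to picture a multi-valued o_abandon is the sketch below.  The
enum names, the numeric values and forward_abandon() are all
hypothetical; in particular "a real abandon wins over suppress" is just
one possible policy, not something decided here.

```c
#include <stdbool.h>

/* Hypothetical o_abandon values.  Every non-zero value still means
 * "do not send a response". */
enum abandon_state {
    AB_NONE      = 0,   /* operation is live */
    AB_SUPPRESS  = 1,   /* suppress response; a copy will send it */
    AB_ABANDONED = 2    /* abandoned or cancelled by the client */
};

/* May this operation still send its own response? */
bool may_send_response(enum abandon_state s)
{
    return s == AB_NONE;
}

/* Should a later Abandon/Cancel treat the target as already abandoned?
 * A merely suppressed operation does not count. */
bool is_abandoned_for_requests(enum abandon_state s)
{
    return s == AB_ABANDONED;
}

/* Forward the state from one operation to another, keeping the
 * stronger of the two (assumed policy). */
enum abandon_state forward_abandon(enum abandon_state dst,
                                   enum abandon_state src)
{
    return src > dst ? src : dst;
}
```

Under this policy the result is the same whichever of Suppress and
Abandon arrives first, and code that only tests o_abandon != 0 would
presumably keep suppressing the response as before.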

ITS#6104 (race condition with cancel operation) also calls for multiple
o_abandon values (joining some or all of o_cancel into o_abandon), and
must be considered when picking a value.  However, I suggest
considering just the various possible values first and getting that
right, and dealing with the race condition later.

Source and binary compatibility with third-party modules complicates
this, if we're trying to support both an old compiled slapd + new
compiled module and vice versa.  Might try a minimal solution now.