
Re: (ITS#8768) Syncprov shouldn't send a new cookie at the end of delete phase



On Thu, Nov 02, 2017 at 03:55:02PM +0000, ondra@mistotebe.net wrote:
> On Tue, Oct 31, 2017 at 05:34:05PM +0000, ondra@openldap.org wrote:
> > If sessionlog data is available and found useful, syncprov will send a cookie at
> > the end of delete phase, itself followed by the entries modified since time
> > recorded in the client's original cookie.
> > 
> > Some of those entries might have been last modified before the new cookie's
> > recorded time and if the connection is severed before this is communicated, they
> > would not be re-sent under the new cookie.
> 
> There are other problems with this. I have always assumed that CSN of
> each write is globally unique in a well-configured system and that this
> is preserved across replication, since MMR needs that to function
> properly. This assumption is clearly invalid if UUIDs are sent in a
> delete SyncInfo message (consumer that needs to determine CSNs that
> apply can only pick a single CSN for all of the deletes).
> 
> So this is a problem in MMR situations where the cookie carries semantic
> information between MMR nodes.
> 
> An MMR member receiving such a message has to pick a CSN to apply here:
> - either the cookie (if present at all) - leads to problems described
>   above
> - or some other CSN - the deletes could be lost or propagate to other
>   masters as a fresh mod, either smells of replication problems down the
>   line

Even assuming we never send a batch delete, sessionlog is a problem in
the MMR case:
- to end up in the sessionlog, we need a CSN for the delete to be
  transmitted
- if we send all deletes first, then modified entries, we can't use the
  cookie to send the information that's needed to create a sessionlog
  entry
- we can't reasonably send all entries in CSN order with the deletes
  interspersed at the relevant place in the stream: that would require
  holding onto the (UUID, CSN) list for a very long time, not to mention
  that we'd need the backend to guarantee search entry ordering in the
  first place
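
To make the first two points concrete, here is a rough sketch in Python
(not slapd code). LogEntry, batch_delete_message and rebuild_sessionlog
are hypothetical names, and the CSN values are made up. It only shows
that a UUID-only batch plus a single cookie CSN cannot reproduce the
per-delete CSNs a sessionlog needs.

```python
from collections import namedtuple

# Hypothetical stand-in for a sessionlog record; not a slapd structure.
LogEntry = namedtuple("LogEntry", ["uuid", "csn"])


def batch_delete_message(sessionlog, cookie_csn):
    """Provider side: collapse logged deletes into one UUID-only message."""
    return {"uuids": [e.uuid for e in sessionlog], "cookie_csn": cookie_csn}


def rebuild_sessionlog(message):
    """Consumer side: the only CSN on hand is the one from the cookie."""
    return [LogEntry(u, message["cookie_csn"]) for u in message["uuids"]]


# Made-up CSNs from two different servers (SIDs 001 and 002).
provider_log = [
    LogEntry("uuid-a", "20171031173405.000000Z#000000#001#000000"),
    LogEntry("uuid-b", "20171101090000.000000Z#000000#002#000000"),
]
cookie_csn = "20171102155502.000000Z#000000#001#000000"

msg = batch_delete_message(provider_log, cookie_csn)
consumer_log = rebuild_sessionlog(msg)

# Compare how many distinct CSNs each side ends up with.
distinct_before = {e.csn for e in provider_log}
distinct_after = {e.csn for e in consumer_log}
print(len(distinct_before), len(distinct_after))
```

On the consumer every delete ends up with the same CSN, so the rebuilt
log no longer matches what the provider recorded.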

Maybe if we tracked more information in the cookie, MMR nodes would
have what they need to populate the sessionlog, and standards-compliant
syncrepl clients would keep working as well?

Storing the progress of the delete phase (optional) alongside general
replication progress (the CSN, as usual) in the cookie might do it.
The question is whether that is enough, whether it introduces new
problems, and whether it makes the code even more complex and harder
to maintain.
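
A very rough Python sketch of that idea, purely as an illustration and
not a proposed wire format: the cookie is treated as comma-separated
key=value pairs, and the "delcsn" field and all function names are
invented here. The point is only that delete-phase progress can advance
on its own while the main CSN stays put until the refresh completes.

```python
def parse_cookie(cookie):
    """Split 'k=v,k=v' into a dict, keeping unknown keys untouched."""
    return dict(part.split("=", 1) for part in cookie.split(","))


def format_cookie(fields):
    """Serialise the dict back, preserving field order."""
    return ",".join(f"{k}={v}" for k, v in fields.items())


def after_delete_phase(cookie, deletes_done_csn):
    """Record delete progress without advancing the main CSN."""
    fields = parse_cookie(cookie)
    fields["delcsn"] = deletes_done_csn  # hypothetical extra field
    return format_cookie(fields)


def after_refresh(cookie, new_csn):
    """Refresh finished: advance the CSN, drop the delete marker."""
    fields = parse_cookie(cookie)
    fields["csn"] = new_csn
    fields.pop("delcsn", None)
    return format_cookie(fields)


def resume_plan(cookie):
    """What a provider would have to send if the consumer reconnects."""
    fields = parse_cookie(cookie)
    return {
        "entries_since": fields["csn"],
        "deletes_since": fields.get("delcsn", fields["csn"]),
    }


old_csn = "20171031173405.000000Z#000000#001#000000"
new_csn = "20171102155502.000000Z#000000#001#000000"

start = f"rid=001,csn={old_csn}"
mid = after_delete_phase(start, new_csn)
done = after_refresh(mid, new_csn)
print(resume_plan(mid))
print(done)
```

If the connection is severed while the consumer holds the intermediate
cookie, the plan still asks for entries modified since the old CSN, so
nothing is skipped, while the deletes need not be replayed.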

Even if workable, that change wouldn't make it into 2.4, so an upgrade
would have to be a lock-step affair, at least for the masters/nodes
running syncprov.

> This shouldn't affect deltaMMR environments, though, AFAIK they never use
> sessionlog in any way, so batched deletes don't get sent over the wire
> at all.

Quanah mentions that deltaMMR can hit this issue if we have to fall back
to plain syncrepl in the case of a conflict (and we don't want a full
present phase to happen at that point).

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP