Issue 9742 - deltasync replication after initial slapadd
Summary: deltasync replication after initial slapadd
Status: VERIFIED PARTIAL
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: overlays
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: 2.5.10
Assignee: Ondřej Kuzník
Reported: 2021-11-08 14:55 UTC by Ondřej Kuzník
Modified: 2022-01-27 08:35 UTC

Description Ondřej Kuzník 2021-11-08 14:55:15 UTC
If setting up a new delta-MPR environment such that:
- server A has been slapadd-ed the seed data
- accesslog is set to "logops all"

There is a sequence of events where the other servers are started and replicate from each other before they get a chance to talk to A: they will each generate a CSN and replicate it. Once they start talking to A, it will see that it has a CSN (its own) that none of them have sent, and "syncprov-nopresent" makes it just go ahead when the only sane outcome is to send SYNC_REFRESH_REQUIRED.

Am I missing a use case for this or should it just have been this way all along?
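
For readers following along, the check described above can be modelled roughly as below. This is only a sketch under stated assumptions, not the syncprov code: the cookie is reduced to a serverID-to-CSN map, CSN ordering is ignored, and the names `unseen_sids`, `decide`, `reject_imprecise` and the returned labels are all hypothetical.

```python
from typing import Dict, List

# Hypothetical model: a sync cookie reduced to {serverID: newest CSN seen}.
Cookie = Dict[int, str]


def unseen_sids(provider: Cookie, consumer: Cookie) -> List[int]:
    """Server IDs the provider holds a CSN for that the consumer never sent."""
    return sorted(set(provider) - set(consumer))


def decide(provider: Cookie, consumer: Cookie, reject_imprecise: bool) -> str:
    """Return a label for what the provider does with a delta-sync refresh."""
    if not unseen_sids(provider, consumer):
        return "REPLAY_LOG"  # every serverID is known to the consumer
    # The consumer has never seen this serverID; in the reported scenario the
    # data came from slapadd, so the log cannot supply the missing changes.
    return "SYNC_REFRESH_REQUIRED" if reject_imprecise else "GO_AHEAD"


# Server A (serverID 1) was seeded with slapadd; B and C (2, 3) synced
# with each other first, so their cookie has no CSN for serverID 1.
a = {1: "20211108145515.000000Z#000000#001#000000"}
b = {2: "20211108150000.000000Z#000000#002#000000",
     3: "20211108150001.000000Z#000000#003#000000"}

print(decide(a, b, reject_imprecise=False))  # behaviour described in the report
print(decide(a, b, reject_imprecise=True))   # outcome the report argues for
```

With `reject_imprecise=False` the model goes ahead despite the unseen serverID; with `True` it answers with a refresh-required result instead.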
Comment 1 Ondřej Kuzník 2021-11-09 10:29:02 UTC
I guess running without that option wouldn't have helped and the same would have happened anyway since the log is just missing things.

Seems like adding a new serverID into a deltasync cluster is an administrative task that should be documented and managed. We have two options there:
- we have syncprov return a SYNC_REFRESH_REQUIRED, triggering a fallback sync and tainting everyone's accesslog
- we mandate that the initial environment setup is done very carefully (the seed is contacted first by everyone or distributed ahead of time) and keep the current behaviour
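
To illustrate the first option, here is a toy model of how a delta-sync consumer might react to a refresh-required result. It is a sketch only, not syncrepl's implementation: the mode labels and the `next_mode` helper are hypothetical, and the only value taken from a specification is the e-syncRefreshRequired result code (4096) from RFC 4533.

```python
E_SYNC_REFRESH_REQUIRED = 4096  # e-syncRefreshRequired result code, RFC 4533
SUCCESS = 0


def next_mode(mode: str, result_code: int) -> str:
    """Toy consumer state machine; mode is 'delta' or 'fallback' (hypothetical labels)."""
    if mode == "delta" and result_code == E_SYNC_REFRESH_REQUIRED:
        return "fallback"  # give up on log replay, do a conventional refresh
    if mode == "fallback" and result_code == SUCCESS:
        return "delta"     # refresh finished, resume log-based replication
    return mode            # anything else leaves the mode unchanged


mode = "delta"
mode = next_mode(mode, E_SYNC_REFRESH_REQUIRED)  # provider rejects the delta refresh
print(mode)
mode = next_mode(mode, SUCCESS)                  # fallback refresh completes
print(mode)
```

The model leaves out the cost mentioned above: changes applied during the fallback also end up in the consumer's own accesslog.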
Comment 2 Quanah Gibson-Mount 2021-11-09 21:47:55 UTC
(In reply to Ondřej Kuzník from comment #1)
> I guess running without that option wouldn't have helped and the same would
> have happened anyway since the log is just missing things.
> 
> Seems like adding a new serverID into a deltasync cluster is an
> administrative task that should be documented and managed. We have two
> options there:
> - we have syncprov return a SYNC_REFRESH_REQUIRED, triggering a fallback
> sync and tainting everyone's accesslog
> - we mandate that the initial environment setup is done very carefully (the
> seed is contacted first by everyone or distributed ahead of time) and keep
> the current behaviour


I consider it essential that it be trivial to add new nodes to an MPR cluster at any time w/o it requiring any special handling.  Replication is already insanely complex in OpenLDAP.
Comment 3 Ondřej Kuzník 2021-11-10 14:19:54 UTC
I'm inclined to make it reject that op; the problem with deltasync is that these transitions wreak havoc on the accesslog. Once we finally understand the exact scope of the damage and how to deal with it, we might need to be able to interfere with other client sessions, and we will need to work out how not to keep triggering this forever if we want to keep delta-MPR viable.
Comment 4 Howard Chu 2021-12-07 14:47:11 UTC
(In reply to Ondřej Kuzník from comment #0)
> If setting up a new delta-MPR environment such that:
> - server A has been slapadd-ed the seed data
> - accesslog is set to "logops all"
> 
> There is a sequence of events where the other servers are started and
> replicate from each other before they get a chance to talk to A: they will
> each generate a CSN and replicate it. Once they start talking to A, it will
> see that it has a CSN (its own) that none of them have sent, and
> "syncprov-nopresent" makes it just go ahead when the only sane outcome is to
> send SYNC_REFRESH_REQUIRED.
> 
> Am I missing a use case for this or should it just have been this way all
> along?

Sounds like the seed data should include the accesslog context entry.
Comment 5 Quanah Gibson-Mount 2021-12-14 16:36:06 UTC
  • c51320a6 
by Ondřej Kuzník at 2021-12-13T19:20:58+00:00 
ITS#9742 Reject a refresh if we can't do a precise resync
Comment 6 Quanah Gibson-Mount 2021-12-14 16:49:38 UTC
RE26:

  • 60c92687 
by Ondřej Kuzník at 2021-12-14T16:46:40+00:00 
ITS#9742 Reject a refresh if we can't do a precise resync
Comment 7 Quanah Gibson-Mount 2021-12-14 16:51:47 UTC
  • 52b89f52 
by Ondřej Kuzník at 2021-12-14T16:50:11+00:00 
ITS#9742 Reject a refresh if we can't do a precise resync
Comment 8 Quanah Gibson-Mount 2021-12-14 16:52:14 UTC
RE25:


(In reply to Quanah Gibson-Mount from comment #7)
>   • 52b89f52 
> by Ondřej Kuzník at 2021-12-14T16:50:11+00:00 
> ITS#9742 Reject a refresh if we can't do a precise resync