Full_Name: Mark Cairney
Version: 2.4.44 (git master)
OS: CentOS 6 x86_64
URL: http://homepages.ed.ac.uk/mcairney/delta-mmr-memberof.tar.gz
Submission from: (NULL) (129.215.201.136)

Hi,

I think I've identified an issue that occurs when you have more than 2 OpenLDAP servers set up in MMR using delta-syncrepl with push-based replication (refreshAndPersist) and the memberOf overlay. If you add/delete objects maintained by the memberOf overlay (i.e. users and groups), delta-syncrepl goes out of sync and triggers a SYNC_ID_SET refresh on the consumers. It looks like it receives the same change from all other nodes. This behaviour doesn't appear to happen when using pull-based replication with delta-syncrepl, or standard syncrepl (push- or pull-based).

To confirm this behaviour I removed the memberOf overlay from the config, dumped the database, removed any "memberOf" attributes, and put the database back in place (slapcat/slapadd on both cn=config and my main DB) on all 3 nodes. Without the memberOf overlay loaded, sync behaved normally on all 3 nodes.

In the tarball I've included a sanitised copy of my config (schema, ACLs, password hashes, indices for non-standard attributes, and operational attributes all removed), a log entry showing deletion of an object not managed by memberOf (a netgroup entry), and log entries showing what happens when a group is deleted both with and without the memberOf overlay.

The 3 servers are identical in both config and hardware spec apart from the olcSASLHost. They are Sun X4150s with 24GB RAM, 10KRPM HDDs in 2 pairs of RAID-1, and 2x4 CPU cores (Xeon E5540 @ 2.53GHz).
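For context, the topology described above can be sketched with a minimal per-node configuration along these lines. This is an illustrative sketch only, not the reporter's actual config: hostnames, server IDs, directories, and the retry policy are hypothetical, and credentials, ACLs, and indexing are omitted.

```
# Sketch only: hypothetical hostnames, IDs and paths; credentials elided.
serverID 1

# accesslog database holding the deltas for delta-syncrepl
database mdb
suffix cn=accesslog
directory /var/lib/ldap/accesslog
overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE

# main database
database mdb
suffix "dc=example,dc=com"
directory /var/lib/ldap/data
overlay memberof
overlay syncprov
syncprov-checkpoint 100 10
overlay accesslog
logdb cn=accesslog
logops writes
logsuccess TRUE

# one syncrepl stanza per peer; refreshAndPersist = push-based
syncrepl rid=001
  provider=ldap://ldap2.example.com
  type=refreshAndPersist
  searchbase="dc=example,dc=com"
  logbase="cn=accesslog"
  logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
  syncdata=accesslog
  retry="60 +"
  # bindmethod/binddn/credentials elided
mirrormode on
```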
I sent this to the mailing list and was advised to add it to ITS 8444, so here it is:

I recently updated to OpenLDAP 2.4.44 with the patch for ITS 8432. I'd been holding off in hopes of a new release with that patch included, and for some update on ITS 8444, but decided to go ahead and push it through during the holiday break. I installed it in my dev environment and was unable to replicate the issues reported in ITS 8444, so went ahead and rolled it into production. However, now that I'm in production, I'm seeing a different issue with the memberOf overlay: for some reason, group membership deletions are getting replicated multiple times. In the logs, I will see something like the following:

Dec 21 04:16:59 themis slapd[2607]: conn=364875 op=227806 MOD dn="uid=members,ou=group,dc=cpp,dc=edu"
Dec 21 04:16:59 themis slapd[2607]: conn=364875 op=227806 MOD attr=member

My identity management system connected and removed some members from this group. Next, there will be a number of lines like this:

Dec 21 04:17:15 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16
Dec 21 04:17:16 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16
Dec 21 04:17:17 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16
Dec 21 04:17:18 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16
Dec 21 04:17:18 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16
Dec 21 04:17:18 themis slapd[2607]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=c

This is the memberOf overlay complaining that it can't remove the corresponding memberOf attribute from the user. Reviewing the accesslog, we see:

dn: reqStart=20161221121659.000002Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121659.000002Z
reqEnd: 20161221121706.000000Z
reqType: modify
reqSession: 364875
reqAuthzID: cn=idmgmt,ou=user,ou=service,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

This is the initial transaction from my client, authenticated as cn=idmgmt. This one is successful, and the overlay deletes the memberOf attribute. However, there are then *six* more instances of the same transaction:

dn: reqStart=20161221121714.000001Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121714.000001Z
reqEnd: 20161221121715.000000Z
reqType: modify
reqSession: 3
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121716.000001Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121716.000001Z
reqEnd: 20161221121716.000002Z
reqType: modify
reqSession: 1
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121716.000004Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121716.000004Z
reqEnd: 20161221121717.000002Z
reqType: modify
reqSession: 1
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121717.000001Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121717.000001Z
reqEnd: 20161221121718.000000Z
reqType: modify
reqSession: 2
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121718.000001Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121718.000001Z
reqEnd: 20161221121718.000002Z
reqType: modify
reqSession: 2
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121718.000003Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121718.000003Z
reqEnd: 20161221121718.000004Z
reqType: modify
reqSession: 2
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

These are performed by cn=ldaproot, and correspond to the six error messages from the overlay about failing to remove the attribute (which was already removed). Similarly, on a replica:

Dec 21 04:17:16 coeus slapd[2841]: conn=-1 op=0: memberof_value_modify DN="uid=prsloan,ou=user,dc=cpp,dc=edu" delete memberOf="uid=members,ou=group,dc=cpp,dc=edu" failed err=16

There is one error, because there are two copies of the transaction:

dn: reqStart=20161221121706.000001Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121706.000001Z
reqEnd: 20161221121716.000000Z
reqType: modify
reqSession: 1
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

dn: reqStart=20161221121714.000002Z,cn=accesslog
objectClass: auditModify
structuralObjectClass: auditModify
reqStart: 20161221121714.000002Z
reqEnd: 20161221121716.000001Z
reqType: modify
reqSession: 3
reqAuthzID: cn=ldaproot,dc=cpp,dc=edu
reqDN: uid=members,ou=group,dc=cpp,dc=edu
reqResult: 0
reqMod: member:- uid=jjtringali,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=kaijulee,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=srknight,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ppimentel,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=epdetering,ou=user,dc=cpp,dc=edu
reqMod: member:- uid=ktran,ou=user,dc=cpp,dc=edu
[...]
reqMod: member:- uid=prsloan,ou=user,dc=cpp,dc=edu

So far, I've only seen this behavior for group membership removals. Adds don't seem to cause a problem, nor do creates/deletes of groups or users, as far as I can tell. Any thoughts on what's going on here? It's not causing any failures yet, as removing members multiple times results in the same end state, and the multiple replication seems to have a fairly low upper bound. But it would be nice to fix it :). Thanks...
moved from Incoming to Software Bugs
--On Friday, December 23, 2016 2:17 AM +0000 henson@acm.org wrote:

> I sent this to the mailing list and was advised to add it to ITS 8444,
> so here it is:

Looks like this has actually been reported before:
<http://www.openldap.org/its/index.cgi/?findid=7400>

--Quanah

--
Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
changed notes
On Mon, Apr 03, 2017 at 11:15:17AM -0700, Quanah Gibson-Mount wrote:
> Looks like this has actually been reported before:
> <http://www.openldap.org/its/index.cgi/?findid=7400>

Yikes, from 2012, and never resolved 8-/.

I'm curious, though, why Howard says the work has to be distributed to all the replicas locally because it wouldn't scale otherwise. For adding or removing a user from a group, replicating the memberOf attribute on the user in addition to the member attribute on the group only doubles the replication load, which doesn't seem excessive. I suppose deleting a group with a large membership would result in updating a large number of users, but that's no different from deleting a single user from a large number of groups, which doesn't seem to cause any issues. Unless I'm misunderstanding something?

In any case, I suppose this ITS can be closed now too, with the same resolution: the memberOf overlay is no longer supported in a replicated configuration, so switch to the dynlist implementation...
Although Quanah hasn't been able to trigger this yet, the regression test he wrote in master has consistently triggered it for me on my machine, so I've started to investigate. For posterity, and in case anyone is interested, I have uploaded the testrun/ directory from a failing run (I suspect the fact that this laptop has a slow 2-core CPU helps), produced with a slightly patched slapd that also records the thread ID, since this seems, in part, to be a race of some sort. The tgz is available at:

ftp://ftp.openldap.org/incoming/its8444-regression-testrun-sync,stat.tgz

So far it looks like replica #3's threads 7f50fb7fe700 and 7f51017bb700 are both trying to apply the modification with CSN 20170605125334.856475Z#000000#001#000000, which sends it into a full refresh.

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
A more self-contained log of the same issue is available at:

ftp://ftp.openldap.org/incoming/its8444.log

(Line numbers below are against current master, commit 91f4d3a6b75e73bf4ea498e83e2e4cb4e7a320e0.)

There are some things that occur in all the failures I have seen so far:
- the server that received the operation (#1) sends the accesslog entry with no CSN in the cookie; then another provider (#2) picks up this message and relays it to its consumers, this time with a CSN in the cookie
- a consumer picks up these two in short succession; in the log above, processing of the one from #2 finishes first (they are being processed concurrently)

Usually, once one of them gets processed, the new CSN should be noted and the other threads should just skip it (syncrepl.c:943 and onwards). Here, having no CSN in the cookie seems to allow both to proceed as far as running syncrepl_message_to_op(), and one of them will then inevitably fail to add the entry.

I don't understand yet why server #1 sends the operations without a CSN in the cookie, nor why, especially if I reorder the overlays to set up memberof last, the race goes the other way around and the operation that fails is the one from server #2.

My take on it was that in a delta-sync environment all entries would be passed with a new CSN, and that should end up in the cookie, allowing syncrepl.c:986 to do its job.

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
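The dedup logic described above can be sketched as follows. This is an illustrative model, not slapd code: the function and field names are invented, and real CSN/cookie handling in syncrepl.c is considerably more involved. It shows how a cookie-borne CSN lets a consumer skip an already-applied change, and how a message whose cookie carries no CSN sails past that check.

```python
# Sketch (not OpenLDAP code): a consumer tracks the latest CSN applied per
# serverID, analogous to its syncCookie state. A relayed change carrying a
# CSN in its cookie can be recognised as a duplicate; one with no CSN in
# the cookie bypasses the check and gets applied a second time.

applied = {}    # sid -> latest CSN applied (CSNs compare correctly as strings)
apply_log = []  # operations that actually reached the "apply" stage

def consume(msg):
    csn = msg.get("cookie_csn")  # may be None, as seen from server #1
    sid = msg["sid"]
    if csn is not None and applied.get(sid, "") >= csn:
        return "skipped"         # duplicate caught via the cookie CSN
    apply_log.append(msg["op"])  # stand-in for syncrepl_message_to_op()
    if csn is not None:
        applied[sid] = csn
    return "applied"

# The same change arrives twice: via server #2 with a CSN in the cookie,
# and directly from server #1 without one (order as in the log above).
change = {"sid": 1, "op": "delta for 20170605125334.856475Z#000000#001#000000"}
r1 = consume(dict(change, cookie_csn="20170605125334.856475Z#000000#001#000000"))
r2 = consume(dict(change, cookie_csn=None))  # no CSN: the check can't catch it
r3 = consume(dict(change, cookie_csn="20170605125334.856475Z#000000#001#000000"))
```

Here `r1` and `r2` both come back "applied" (the change runs twice), while `r3` shows that a copy carrying its CSN would have been skipped.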
On Thu, Jun 08, 2017 at 06:36:02PM +0100, Ondřej Kuzník wrote:
> A more self-contained log of the same issue, available at
> ftp://ftp.openldap.org/incoming/its8444.log
>
> (line numbers below are against current master, commit
> 91f4d3a6b75e73bf4ea498e83e2e4cb4e7a320e0)
>
> There are some things that occur in all the failures I have seen so far:
> - the server that received the operation (#1) sends the accesslog entry
>   with no CSN in the cookie, then another provider (#2) picks up this
>   message and relays it to its consumers, this one with a CSN in the
>   cookie
> - a consumer picks up these two in short succession, in the log above,
>   processing of the one from #2 is finished first (they are being
>   processed concurrently)
>
> Usually, once one of them gets processed, the new CSN should be noted
> and the other threads should just skip it (syncrepl.c:943 and onwards).
> In this one, having no CSN in the cookie seems to allow both to process
> so far as to run syncrepl_message_to_op(), and one of them will then
> inevitably fail to add the entry.
>
> I don't understand yet why server #1 sends the operations without a CSN
> in the cookie and (especially if I reorder the overlays to set up
> memberof last), the race goes the other way around and the operation to
> fail is the one from server #2.
>
> My take on it was that in a delta-sync environment all entries would be
> passed with a new CSN and that should end up in the cookie, allowing
> syncrepl.c:986 to do its job.

Using the behaviour above, I have been able to trigger a desync; the script and the testrun directory from a failing run are available here:

ftp://ftp.openldap.org/incoming/its8444-desync.tgz

In srv3, looking at cn=accesslog, we can see that the increment by 2 has been applied (and logged) twice with the same entryCSN, as:

reqStart=20170718155142.000007Z,cn=accesslog
reqStart=20170718155142.000009Z,cn=accesslog

This is also seen in the log around those two DNs.

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
On Tue, Jul 18, 2017 at 05:27:40PM +0100, Ondřej Kuzník wrote:
> Using the behaviour above, I have been able to trigger a desync, the
> script and testrun directory from it happening are available here:
> ftp://ftp.openldap.org/incoming/its8444-desync.tgz

It's super easy to reproduce the desync when upping MMR to 6 in that script, even without memberof (no process re-nicing actually necessary).

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
It is caused by the cookie not containing a CSN, and a race between the syncCookie check in do_syncrep2 and syncrepl_message_to_op.

This race is probably fine with plain syncrepl, which is idempotent, but delta-sync changes get their own DN in each accesslog instance, and some can be applied twice unless we have a way of telling that we've already seen them: they need to mention the CSN.

The CSN itself gets lost on at least one occasion: when a checkpoint is triggered. I'm not 100% sure why the cookie gets eaten because of it; the op pointer is different between the syncprov_op_response that calls syncprov_checkpoint and the one that decides the CSN hasn't changed.

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
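The idempotency distinction drawn above can be made concrete with a small model. This is a sketch under invented names, not the slapd implementation: plain syncrepl ships the resulting entry state (replaying it converges to the same state), while delta-syncrepl ships the operation itself (replaying a value delete fails, much like the `memberof_value_modify ... failed err=16` messages, where 16 is LDAP noSuchAttribute).

```python
# Sketch: why a replayed change is harmless under plain syncrepl but not
# under delta-syncrepl. Entries are modeled as dicts of attribute -> set.

def plain_syncrepl(target, final_state):
    # plain syncrepl transmits the whole resulting entry: state replacement
    # is idempotent, so applying it twice is harmless
    target.clear()
    target.update(final_state)

def delta_syncrepl(target, mod):
    # delta-syncrepl transmits the modification itself; replaying a delete
    # of an already-removed value is an error
    attr, op, value = mod
    if op == "delete":
        if value not in target.get(attr, set()):
            raise ValueError("err=16: no such attribute value")
        target[attr].remove(value)

# Plain syncrepl: replaying the final state converges
replica = {"member": {"uid=a", "uid=b"}}
snapshot = {"member": {"uid=b"}}
plain_syncrepl(replica, snapshot)
plain_syncrepl(replica, snapshot)   # replay: same end state

# Delta-syncrepl: replaying the operation fails the second time
entry = {"member": {"uid=a", "uid=b"}}
mod = ("member", "delete", "uid=a")
delta_syncrepl(entry, mod)          # first application succeeds
try:
    delta_syncrepl(entry, mod)      # replay fails, like err=16 in the logs
    replay_ok = True
except ValueError:
    replay_ok = False
```

Both paths end with the same entry contents, but only the delta replay raises an error, which is why duplicate delivery must be filtered out via the CSN rather than tolerated.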
On Wed, Aug 23, 2017 at 02:42:29PM +0100, Ondřej Kuzník wrote:
> It is caused by the cookie not containing CSN and a race between the
> syncCookie check in do_syncrep2 and syncrepl_message_to_op.
>
> This race is probably fine with plain syncrepl which is idempotent, but
> deltasync changes get their own dn in each accesslog instance and some
> can be applied twice unless we know how to find out we've already seen
> them - they need to mention the CSN.
>
> The CSN itself gets lost on at least one occasion - when there's a
> checkpoint triggered. Not 100 % sure why the cookie gets eaten because
> of it, the op pointer is different between the syncprov_op_response that
> calls syncprov_checkpoint and the one that decides CSN hasn't changed.

Yes, whenever a checkpoint happens, the syncCookie in cn=accesslog only contains rid=XXX,sid=YYY. I thought that was because the checkpoint results in a new accesslog entry that would be transmitted first, but that's not the case: there is no accesslog entry, nor anything sent to the client (as observed with ldapsearch -E sync=rp).

I think it looks like this: syncprov_checkpoint modifies the suffix entry, which calls slap_graduate_commit_csn, and the CSN is removed from be_pending_csn_list. accesslog_response then can't find the CSN there and has nothing to insert into its own pending CSN list. It's strange that changing the overlay order (accesslog vs. syncprov) doesn't change this behaviour, something I'd expect if the above were the reason this happens.

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
On Thu, Aug 24, 2017 at 03:55:32PM +0100, Ondřej Kuzník wrote:
> I think it looks like this: syncproc_checkpoint modifies the suffix
> entry, that calls slap_graduate_commit_csn and the csn is removed from
> be_pending_csn_list. accesslog_response then can't find the CSN there
> and has nothing to insert into its own pending csn list. Strange that
> changing the overlay order (accesslog vs. syncprov) doesn't change this
> behaviour, something I'd expect if the above is the reason this happens.

In any case, having the checkpoint modification get its own CSN entry lets the main operation's CSN entry persist, which fixes this issue, and all operations that need a CSN seem to get one. The patch is available at:

ftp://ftp.openldap.org/incoming/Ondrej-Kuznik-20170825-ITS8444.patch

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
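The pending-CSN bookkeeping and the shape of the fix can be sketched like this. This is an illustrative model only: the list and function names mirror the slapd concepts (be_pending_csn_list, slap_graduate_commit_csn, accesslog_response) but are not the real data structures or APIs.

```python
# Sketch: each write queues its CSN on a pending list; a later stage
# (accesslog_response in slapd) looks the CSN up to put it into the sync
# cookie. If the checkpoint write shares the operation's CSN entry,
# graduating the checkpoint removes it before that lookup happens, and
# the cookie goes out as bare rid/sid with no CSN.

pending = []  # stand-in for be_pending_csn_list

def queue_csn(csn):
    pending.append(csn)

def graduate_csn(csn):       # stand-in for slap_graduate_commit_csn
    if csn in pending:
        pending.remove(csn)

def cookie_csn():            # what accesslog_response can still find
    return pending[-1] if pending else None

op_csn = "20170718155142.000007Z#000000#001#000000"

# Broken behaviour: the checkpoint modification reuses the operation's
# CSN entry, so graduating the checkpoint empties the list too early.
queue_csn(op_csn)
graduate_csn(op_csn)         # checkpoint graduates the shared entry
broken = cookie_csn()        # nothing left for the cookie

# With the patch: the checkpoint gets its own entry, so graduating it
# leaves the main operation's CSN in place for accesslog_response.
queue_csn(op_csn)            # the operation's own entry
queue_csn("checkpoint-own-entry")
graduate_csn("checkpoint-own-entry")
fixed = cookie_csn()         # the op's CSN survives
```

In the broken sequence the cookie lookup finds nothing, matching the rid=XXX,sid=YYY-only cookies observed above; in the patched sequence the operation's CSN persists.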
changed notes changed state Open to Test
changed notes changed state Test to Release
Fixed in master (9e156bf91467da256ade7842b03b0aa6f9b7405c)
Fixed in RE24 (2.4.46)
changed notes changed state Release to Closed
*** Issue 8396 has been marked as a duplicate of this issue. ***