Issue 6476 - Failure to delete entry with multi-master replication
Summary: Failure to delete entry with multi-master replication
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.21
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
 
Reported: 2010-02-17 13:05 UTC by worganc@avaya.com
Modified: 2017-03-28 00:22 UTC

Description worganc@avaya.com 2010-02-17 13:05:40 UTC
Full_Name: Craig Worgan
Version: 2.4.21
OS: RedHat Enterprise Linux 5
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (47.11.182.86)


As per Kyle Blaney's post to the bugs mailing list (Kyle is on vacation, so I am
submitting this ITS on his behalf):

I have encountered a situation with multi-master replication in OpenLDAP
2.4.21 where an entry deleted on one server is not deleted from its
peer.  I'm using Red Hat Enterprise Linux 5.

Here's what I did:
1. Configure Network Time Protocol with server A as the NTP master and
server B as the NTP slave.
2. Configure multi-master replication between server A (server ID=1) and
server B (server ID=2).
3. Start OpenLDAP service on servers A and B.
4. Add an entry to server A and ensure it's replicated to server B.
5. Add an entry to server B and ensure it's replicated to server A.
6. Stop OpenLDAP service on server A.
7. Delete an entry on server B.
8. Start OpenLDAP service on server A with sync debugging enabled (-d
sync).

At this point, I expected that the entry deleted from server B would be
deleted from server A.  Instead, the entry remained on server A and
slapd displayed the following (with the entry's DN X'ed out):

slapd starting
do_syncrep2: rid=001 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Entry XXXXXX CSN 20100209193028.621799Z#000000#001#000000 older or equal to ctx 20100209193028.621799Z#000000#001#000000
syncprov_search_response: cookie=rid=001,sid=001,csn=20100202210831.101462Z#000000#000#000000;20100209193028.621799Z#000000#001#000000;20100209193118.342038Z#000000#002#000000

Why wouldn't the entry deleted on server B also be deleted from server A?
Is the failure to delete the entry related to the "entry CSN older or
equal to context CSN" message?
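
For readers unfamiliar with the values in that message: the entry CSN and the
ctx value in the log carry the same server ID field (001), and the sync cookie
lists one CSN per server ID. The Python sketch below is only an illustration
of that comparison, not slapd's code. The names parse_csn, older_or_equal and
cookie_csns are hypothetical, and comparing the fixed-width CSN strings as
plain strings is an assumption of the sketch.

```python
from typing import Dict, NamedTuple


class CSN(NamedTuple):
    timestamp: str  # e.g. "20100209193028.621799Z"
    count: str      # change counter field
    sid: int        # server ID, written as hex digits in the CSN
    mod: str        # modification number field


def parse_csn(text: str) -> CSN:
    """Split a CSN into its four '#'-separated fields."""
    timestamp, count, sid, mod = text.split("#")
    return CSN(timestamp, count, int(sid, 16), mod)


def older_or_equal(entry_csn: str, ctx_csn: str) -> bool:
    """True if entry_csn is not newer than ctx_csn (same server ID only)."""
    if parse_csn(entry_csn).sid != parse_csn(ctx_csn).sid:
        raise ValueError("CSNs come from different server IDs")
    # Assumption: equal-width CSN strings order correctly as plain strings.
    return entry_csn <= ctx_csn


def cookie_csns(cookie: str) -> Dict[int, str]:
    """Map each server ID found in a sync cookie to its CSN string."""
    csn_part = cookie.split("csn=", 1)[1]
    return {parse_csn(c).sid: c for c in csn_part.split(";")}


if __name__ == "__main__":
    entry = "20100209193028.621799Z#000000#001#000000"
    ctx = "20100209193028.621799Z#000000#001#000000"
    print(older_or_equal(entry, ctx))
```

Applied to the values in the log, the entry CSN and the ctx value are
identical, which is the "equal" half of "older or equal".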

Unfortunately, I have been unable to reproduce the failure since I first
saw it.  All subsequent tests have shown that the entry deleted from
server B is deleted from server A when the OpenLDAP service on server A
is restarted.

Kyle Blaney

-------------------------------------------------------------------------

There is also this follow-up post:
I can now reliably reproduce this problem in 2.4.21 and 2.4.20 on a
multi-master setup that only has five entries:
1. Stop service on server A.
2. Delete one entry on server B.
3. Start service on server A.

After step 3, the entry is never deleted from server A.

I have changed many aspects of configuration (replication with and
without TLS, syncprov-sessionlog enabled and disabled,
syncprov-checkpoint enabled and disabled, syncprov-nopresent TRUE and
FALSE, syncprov-reloadhint TRUE and FALSE) and the problem still occurs.

Is this problem the same one outlined in "test 058 failure" at
http://www.openldap.org/lists/openldap-software/201002/msg00031.html?

Kyle
 ----------------------------------------------------------------------

Thanks,

Craig Worgan
Comment 1 Quanah Gibson-Mount 2010-07-22 11:01:25 UTC
changed notes
Comment 2 Quanah Gibson-Mount 2010-07-22 18:00:09 UTC
--On Wednesday, February 17, 2010 1:05 PM +0000 worganc@avaya.com wrote:

> Full_Name: Craig Worgan
> Version: 2.4.21
> OS: RedHat Enterprise Linux 5
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (47.11.182.86)

Hi Craig,

Are you able to reproduce this with OpenLDAP 2.4.23?  There have been some
fixes around deletes since 2.4.21, as well as a fix for an issue introduced
in 2.4.21 that was specific to cached objects.

Thanks,
Quanah




--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
--------------------
Zimbra ::  the leader in open source messaging and collaboration

Comment 3 piotr.tarnowski@us.edu.pl 2010-12-13 12:27:43 UTC
Hi!

We have observed similar symptoms with OpenLDAP 2.4.23:

2010-12-10T14:56:45+01:00 ldapst2/192.168.10.48 slapd[29363]: [ID 502419 local4.debug] Entry uid=Replicator,dc=us,dc=edu,dc=pl CSN 20101208130342.415481Z#000000#008#000000 older or equal to ctx 20101208130342.415481Z#000000#008#000000
2010-12-10T14:56:45+01:00 ldapst2/192.168.10.48 slapd[29363]: [ID 653192 local4.debug] syncprov_search_response: cookie=rid=008,sid=008,csn=20101202144622.962136Z#000000#007#000000;20101001125631.576652Z#000000#003#000000;20101210135344.096897Z#000000#008#000000
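
In these two log lines, the cookie's CSN for server ID 008 is later than the
ctx value reported for the entry. As a small aid for reading such values, the
Python sketch below decodes the timestamp field of two CSNs and reports the
gap between them. The helper names csn_time and csn_gap are hypothetical, and
the sketch assumes the timestamp is UTC in YYYYmmddHHMMSS.uuuuuuZ form, as
the trailing Z suggests.

```python
from datetime import datetime, timedelta, timezone


def csn_time(csn: str) -> datetime:
    """Decode the timestamp field (text before the first '#') of a CSN."""
    stamp = csn.split("#", 1)[0]
    # Assumption: the field is UTC, with six fractional digits and a 'Z'.
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def csn_gap(older: str, newer: str) -> timedelta:
    """Time elapsed between the timestamps of two CSNs."""
    return csn_time(newer) - csn_time(older)


if __name__ == "__main__":
    ctx = "20101208130342.415481Z#000000#008#000000"
    cookie_csn = "20101210135344.096897Z#000000#008#000000"
    print(csn_gap(ctx, cookie_csn))
```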

Do you have any fixes?

Regards
-- 
[ Piotr Tarnowski                   piotr.tarnowski@us.edu.pl ]
[ Unix Administrator, University of Silesia, Katowice, Poland ]

Comment 4 Quanah Gibson-Mount 2017-03-28 00:15:16 UTC
Hello,

Can you report back on whether or not you still see this behavior with
current OpenLDAP 2.4.44?  Significant fixes (such as for ITS#8281) have
occurred since this report for 2.4.23.

Thanks!

--Quanah

--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 5 Quanah Gibson-Mount 2017-03-28 00:15:43 UTC
changed notes
changed state Open to Feedback
moved from Incoming to Software Bugs
Comment 6 OpenLDAP project 2017-03-28 00:22:39 UTC
Same as ITS#6555?
Same as ITS#8281?

Closed -- Neither reporter available.  Likely fixed.
Comment 7 Quanah Gibson-Mount 2017-03-28 00:22:39 UTC
changed notes
changed state Feedback to Closed