Full_Name: Jose Ildefonso Camargo Tolosa
Version: 2.4.13
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (190.73.138.178)

Hi!

I just configured multi-master replication (with N=3) for testing purposes, and I found an annoying problem under these conditions:

1. Configure N "masters" and have them replicate happily (this is important).
2. Stop the slapd service on all of the servers.
3. Start the slapd service on fewer than N servers (i.e., leave at least one stopped).
4. Delete any entry (yes, it only fails with deletes; you can combine changes such as modify, add, and delete, and only the deletes will fail to replicate).
5. Stop the slapd service on *all of the servers again* (this is the most important part).
6. Start the slapd service on all the servers.
7. Look for the deleted entry on the server(s) that you left stopped in steps 3 and 4: the entry is there, but it isn't on the server(s) that were running in steps 3 and 4.

You will see that, on the servers that you left down in step 3, the deleted entries are still present.

If you leave at least one server up, it replicates just fine; the problem occurs when you stop all of the masters and then start them again.

If you need further info, just ask!

I hope this helps,

Ildefonso Camargo
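For reference, the delete in step 4 can be expressed as an LDIF change record and applied with ldapmodify on one of the running masters. This is only a minimal sketch: the entry name `cn=test` is hypothetical, and `$BASEDN` is a placeholder for the replicated suffix (the report does not name either).

```ldif
# Hypothetical entry; replace $BASEDN with the suffix of the replicated database.
dn: cn=test,$BASEDN
changetype: delete
```

After step 6, a base-scope search for the same DN on each server would show whether the delete reached the servers that were stopped.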
Hello,

I might be facing the same problem. Here is what I did:

I'm testing N-Way Multi-Master replication with OpenLDAP 2.4.11 and 2.4.13. I have set up 2 masters (m1 and m2), starting from test050-syncrepl-multimaster and modifying it.

Everything seems to work fine except deleting entries. Let me explain.

case 1:
. When I add an entry on m1, it is successfully replicated to m2.
. When I try to delete this entry on m1, it is successfully removed from m1, but the delete is not replicated to m2.
. When I then try to delete this entry on m2, it is successfully removed from m2 and m1.

case 2:
. When I add an entry on m2, it is successfully replicated to m1.
. When I try to delete this entry on m2, it is successfully removed from m2, but the delete is not replicated to m1.
. When I then try to delete this entry on m1, it is successfully removed from m1 and m2.

I don't have the same problem when I delete an attribute or update an entry.

Here is how I have set up my masters:

m1 - config:

dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 1

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW:< file://$CONFIGPWF

m2 - config:

dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 2

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW:< file://$CONFIGPWF

m1 - syncprov:

dn: cn=config
changetype: modify
replace: olcServerID
olcServerID: 1 $URI1
olcServerID: 2 $URI2

dn: olcOverlay=syncprov,olcDatabase={0}config,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=001 provider=$URI1 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
olcSyncRepl: rid=002 provider=$URI2 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
-
add: olcMirrorMode
olcMirrorMode: TRUE

m2 - syncrepl:

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=001 provider=$URI1 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
olcSyncRepl: rid=002 provider=$URI2 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
-
add: olcMirrorMode
olcMirrorMode: TRUE

m1 - schema:

include: file://$ABS_SCHEMADIR/core.ldif
include: file://$ABS_SCHEMADIR/cosine.ldif
include: file://$ABS_SCHEMADIR/inetorgperson.ldif
include: file://$ABS_SCHEMADIR/openldap.ldif
include: file://$ABS_SCHEMADIR/nis.ldif

m1 - backend:

dn: olcDatabase={1}$BACKEND,cn=config
objectClass: olcDatabaseConfig
objectClass: olc${BACKEND}Config
olcDatabase: {1}$BACKEND
olcSuffix: $BASEDN
olcDbDirectory: ./openldap-data
olcRootDN: $MANAGERDN
olcRootPW: $PASSWD
olcSyncRepl: rid=004 provider=$URI1 binddn="$MANAGERDN" bindmethod=simple
  credentials=$PASSWD searchbase="$BASEDN" type=refreshOnly
  interval=$INTERVAL retry="5 5 300 5" timeout=3
olcSyncRepl: rid=005 provider=$URI2 binddn="$MANAGERDN" bindmethod=simple
  credentials=$PASSWD searchbase="$BASEDN" type=refreshOnly
  interval=$INTERVAL retry="5 5 300 5" timeout=3
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={1}${BACKEND},cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

Did I miss something?

Adrien Futschik
Thanks a lot! This seems to be working fine.
Is this documented anywhere? I never saw that option before.

Adrien Futschik

========================================

Hello,

add olcSpSessionlog to syncprov. I chose 1000 as the value.

Best regards
Andreas
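As a rough sketch of Andreas's suggestion, the session log size could be set on the syncprov overlay entry of the replicated database. The entry below is the overlay entry from the configuration posted earlier in this thread with one attribute added; the value 1000 is the one Andreas mentions, and placing it on the `{1}` database overlay (rather than the config database overlay) is an assumption.

```ldif
dn: olcOverlay=syncprov,olcDatabase={1}${BACKEND},cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov
# Session log size in operations; 1000 is the value suggested above.
olcSpSessionlog: 1000
```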
I think I found the problem.

I have to confirm this, but the "olcSpSessionlog" parameter doesn't seem to be necessary.
I made a mistake in my configuration: I used "refreshOnly" mode for the backend, and with "refreshOnly" the consumer isn't aware when a provider deletes an entry. At least, that is how I understood it.

Here is an extract from the documentation about syncrepl:

"Also as a consequence of the search filter used in the syncrepl specification, it is possible for a modification to remove an entry from the replication scope even though the entry has not been deleted on the provider. Logically the entry must be deleted on the consumer but in refreshOnly mode the provider cannot detect and propagate this change without the use of the session log."

Therefore I changed "refreshOnly" to "refreshAndPersist" and removed the "olcSpSessionlog" parameter, and everything seems to be working fine. At least, deletes are correctly replicated to the second master.

Here is the final configuration I am using:

m1 - config:

dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 1

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW:< file://$CONFIGPWF

m2 - config:

dn: cn=config
objectClass: olcGlobal
cn: config
olcServerID: 2

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcRootPW:< file://$CONFIGPWF

m1 - syncprov:

dn: cn=config
changetype: modify
replace: olcServerID
olcServerID: 1 $URI1
olcServerID: 2 $URI2

dn: olcOverlay=syncprov,olcDatabase={0}config,cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=001 provider=$URI1 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
olcSyncRepl: rid=002 provider=$URI2 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
-
add: olcMirrorMode
olcMirrorMode: TRUE

m2 - syncrepl:

dn: olcDatabase={0}config,cn=config
changetype: modify
add: olcSyncRepl
olcSyncRepl: rid=001 provider=$URI1 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
olcSyncRepl: rid=002 provider=$URI2 binddn="cn=config" bindmethod=simple
  credentials=$CONFIGPW searchbase="cn=config" type=refreshAndPersist
  retry="5 5 300 5" timeout=3
-
add: olcMirrorMode
olcMirrorMode: TRUE

m1 - schema:

include: file://$ABS_SCHEMADIR/core.ldif
include: file://$ABS_SCHEMADIR/cosine.ldif
include: file://$ABS_SCHEMADIR/inetorgperson.ldif
include: file://$ABS_SCHEMADIR/openldap.ldif
include: file://$ABS_SCHEMADIR/nis.ldif

m1 - backend:

dn: olcDatabase={1}$BACKEND,cn=config
objectClass: olcDatabaseConfig
objectClass: olc${BACKEND}Config
olcDatabase: {1}$BACKEND
olcSuffix: $BASEDN
olcDbDirectory: ./openldap-data
olcRootDN: $MANAGERDN
olcRootPW: $PASSWD
olcSyncRepl: rid=004 provider=$URI1 binddn="$MANAGERDN" bindmethod=simple
  credentials=$PASSWD searchbase="$BASEDN" type=refreshAndPersist
  interval=$INTERVAL retry="5 5 300 5" timeout=3
olcSyncRepl: rid=005 provider=$URI2 binddn="$MANAGERDN" bindmethod=simple
  credentials=$PASSWD searchbase="$BASEDN" type=refreshAndPersist
  interval=$INTERVAL retry="5 5 300 5" timeout=3
olcMirrorMode: TRUE

dn: olcOverlay=syncprov,olcDatabase={1}${BACKEND},cn=config
changetype: add
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: syncprov

So I guess this was not a bug, but a mistake in the documentation:
http://www.openldap.org/doc/admin24/replication.html#N-Way%20Multi-Master

Can anyone confirm?

Adrien Futschik
> So I guess this was not a bug, but a mistake in the documentation :
> http://www.openldap.org/doc/admin24/replication.html#N-Way%20Multi-Master
>
> Can anyone confirm ?

Yes, you are correct, apologies. Ando fixed this in test050:

-----------
1.14 Sun Nov 16 22:06:30 2008 UTC; 2 months ago by ando
Changed since 1.13: +28 -10 lines
Diffs to 1.13 (colored diff)

add indexes when supported; syncrepl on configuration should always be refreshAndPersist
-----------

and I missed it when updating the docs. The update to the guide will be in the next release, 2.4.14.

I'll close this ITS now.

Thanks!

--
Kind Regards,

Gavin Henry.
OpenLDAP Engineering Team.

E ghenry@OpenLDAP.org

Community developed LDAP software.

http://www.openldap.org/project/
moved from Incoming to Documentation
changed notes
changed state Open to Test
changed notes
changed state Test to Open
moved from Documentation to Software Bugs
adrien.futschik@atosorigin.com wrote:
> I Think I found the problem.
>
> I have to confirm this, but the "olcSpSessionlog" parameter doesn't seem
> to be necessary.
> I made a mistake in my configuration. I used "refreshOnly" mode for the
> backend and with "refreshOnly", consumer isn't aware when a provider
> deletes an entry. So I understood at least.

No. Replication is supposed to work regardless of mode, refreshOnly / refreshAndPersist should yield identical results.

> Here is a extract from the documentation about syncrepl : "Also as a
> consequence of the search filter used in the syncrepl specification, it
> is possible for a modification to remove an entry from the replication
> scope even though the entry has not been deleted on the provider.
> Logically the entry must be deleted on the consumer but in refreshOnly
> mode the provider cannot detect and propagate this change without the
> use of the session log."

The above text is correct, but it is talking about partial replication where the syncrepl search filter causes only a subset of the entries to be replicated. That does not apply to your case because you haven't set any special filter in your syncrepl config.

> So I guess this was not a bug, but a mistake in the documentation :
> http://www.openldap.org/doc/admin24/replication.html#N-Way%20Multi-Master
>
> Can anyone confirm ?

There is still a bug here.

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
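To illustrate the partial-replication case the quoted documentation refers to, here is a sketch of a consumer definition that does set a search filter. It reuses the rid=004 line from the configuration posted in this thread; the `filter` value is hypothetical and chosen only for illustration.

```ldif
# Partial replication: only entries matching the filter are replicated.
# The filter value below is a hypothetical example.
olcSyncRepl: rid=004 provider=$URI1 binddn="$MANAGERDN" bindmethod=simple
  credentials=$PASSWD searchbase="$BASEDN"
  filter="(objectClass=inetOrgPerson)" type=refreshOnly
  interval=$INTERVAL retry="5 5 300 5" timeout=3
```

With such a filter, a modification on the provider can move an entry out of the replicated subset without deleting it, which is the situation the session log paragraph describes. The configurations in this thread set no filter, so that paragraph does not explain the missing deletes.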
> No. Replication is supposed to work regardless of mode, refreshOnly /
> refreshAndPersist should yield identical results.
...
> There is still a bug here.

You mean that the fact that N-way Multi-master doesn't replicate deletes properly when using "refreshOnly" is a bug?

But N-way Multi-master seems to be working correctly when using "refreshAndPersist". I see some reasons for using "refreshOnly" instead of "refreshAndPersist"; will this be corrected in the next release of OpenLDAP?

What is recommended with OpenLDAP 2.4.11? Should we wait for the next stable release if we intend to use N-way Multi-master in production?

Best regards

Adrien Futschik
ildefonso_camargo@yahoo.com wrote:
> Full_Name: Jose Ildefonso Camargo Tolosa
> Version: 2.4.13
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (190.73.138.178)
>
>
> Hi!
>
> I just configured multi-master replication (with N=3) for testing
> purposes, and I just found an annoying problem, under these
> conditions:
>
> 1. Configure a N number of "masters", and have them replicate happily
> (this is important).
> 2. Stop slapd service on all of the servers.
> 3. Start slapd service on any number of servers < N (ie, leave at
> least one stopped).
> 4. Delete any entry (yes, it only fails with deletes, you can combine
> changes: modify, add, and delete, and only deletes will fail to
> replicate).
> 5. Stop slapd service on *all of the servers again* (this is the most
> important part).
> 6. Start slapd service on all the servers.
> 7. Look for the deleted entry on the server(s) that you left stopped in steps 3
> and 4, the entry is there!, but it isn't on the server(s) that were running in
> step 3 and 4.
>
> You will see that, on the servers that you left down on step 3, the
> deleted entries are still present.
>
> If you leave at least one server up, it replicates just fine, the
> problem is when you stop all of the masters, and then start them again.
>
> If you need further info, just ask!

I repeated the steps you described but was unable to reproduce the problem. However, following the steps that Adrien provided in a followup, I was able to reproduce the situation and identify the problem. A fix is now in CVS HEAD, please test.

--
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
changed notes
changed state Test to Release
adrien.futschik@atosorigin.com wrote:
>
>> No. Replication is supposed to work regardless of mode, refreshOnly /
>> refreshAndPersist should yield identical results.
> ...
>
>> There is still a bug here.
>
> You mean that the fact that N-way Multi-master doesn't replicate deletes
> properly when using "refreshOnly" is a bug ?
>
> But N-way Multi-master seams to be working correctly when using
> "refreshAndPersist". I see some reasons why using "refreshOnly" instead
> of "refreshAndPersist", will this be corrected in the next release of
> OpenLDAP ?
>
> What is recommended with OpenLDAP 2.4.11. Should we wait for the next
> stable release if we intend to use N-way Multi-master in production ?

I would wait for 2.4.14.

--
Kind Regards,

Gavin Henry.
Managing Director.

T +44 (0) 1224 279484
M +44 (0) 7930 323266
F +44 (0) 1224 824887
E ghenry@suretecsystems.com

Open Source. Open Solutions(tm).

http://www.suretecsystems.com/

Suretec Systems is a limited company registered in Scotland. Registered
number: SC258005. Registered office: 13 Whiteley Well Place, Inverurie,
Aberdeenshire, AB51 4FP.

Subject to disclaimer at http://www.suretecgroup.com/disclaimer.html
changed notes
changed state Release to Closed
fixed in HEAD
fixed in RE24