[Date Prev][Date Next] [Chronological] [Thread] [Top]

Replication issues with delta-syncrepl, MMR and memberOf overlay on 2.4.47



Hi,

We currently run a multi-master setup where all 4 of our servers
replicate from each other via delta-syncrepl but all our writes are
directed at a selected "master" server.

I've noticed recently that we are suffering sync issues and this
coincides with us upgrading from 2.4.46 with a patch for ITS8843 to 2.4.47.

On a "consumer" I'm seeing log entries like this:

Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_graduate_commit
_csn: removing 0xa1da880 20190605063503.192377Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_
op: rid=034 be_modify
cn=STUUG,ou=affiliations,ou=grouper2,dc=authorise,dc=ed,dc
=ac,dc=uk (0)
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_queue_csn: queu
eing 0x1e963c40 20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_
op: rid=035 tid b7ae2700
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=034
 cookie=rid=034,sid=003,csn=20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_null_callba
ck : error code 0x14
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_graduate_commit_csn: removing 0x1e963c40
20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_op: rid=035 be_modify
cn=PRPHDOPMIH3F_3,ou=PRPHDOPMIH3F,ou=pos,ou=grouper2,dc=authorise,dc=ed,dc=ac,dc=uk
(20)
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=035 delta-sync lost sync on
(reqStart=20190605063503.000020Z,cn=accesslog), switching to REFRESH
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_queue_csn: queueing 0x1d0f8100 20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_op: rid=034 tid bacef700
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_null_callback : error code 0x14
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_graduate_commit_csn: removing 0x1d0f8100
20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_op: rid=034 be_modify
cn=PRPHDOPMIH3F_3,ou=PRPHDOPMIH3F,ou=pos,ou=grouper2,dc=authorise,dc=ed,dc=ac,dc=uk
(20)
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=034 delta-sync lost sync on
(reqStart=20190605063503.000414Z,cn=accesslog), switching to REFRESH
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_queue_csn: queueing 0x1b5c87c0 20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_op: rid=036 tid b62df700
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_null_callback : error code 0x14
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
slap_graduate_commit_csn: removing 0x1b5c87c0
20190605063503.289350Z#000000#003#000000
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]:
syncrepl_message_to_op: rid=036 be_modify
cn=PRPHDOPMIH3F_3,ou=PRPHDOPMIH3F,ou=pos,ou=grouper2,dc=authorise,dc=ed,dc=ac,dc=uk
(20)
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=036 delta-sync lost sync on
(reqStart=20190605063503.000361Z,cn=accesslog), switching to REFRESH

Then a number of SYNC_ID_SET messages:

Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=035
 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=034 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=036 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=035 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=034 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=036 LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Jun  5 07:35:03 cauth04.authorise.is.ed.ac.uk slapd[18267]: do_syncrep2:
rid=035 LDAP_RES_INTERMEDIATE - SYNC_ID_SET

Before finally a whole bunch of:

Jun  5 07:35:20 cauth04.authorise.is.ed.ac.uk slapd[18267]:
nonpresent_callback: rid=035 present UUID
8573adfa-71c7-4a9c-ab64-54d6c7f51e25, dn dc=authorise,dc=ed,dc=ac,dc=uk
Jun  5 07:35:20 cauth04.authorise.is.ed.ac.uk slapd[18267]:
nonpresent_callback: rid=035 present UUID
555bdc68-06ee-443c-9b33-6c8355c0a00d, dn
ou=is,dc=authorise,dc=ed,dc=ac,dc=uk
Jun  5 07:35:20 cauth04.authorise.is.ed.ac.uk slapd[18267]:
nonpresent_callback: rid=035 present UUID
8749c8a0-0b06-4dc0-b1bd-ca85f3077603, dn
ou=hss,dc=authorise,dc=ed,dc=ac,dc=uk
Jun  5 07:35:20 cauth04.authorise.is.ed.ac.uk slapd[18267]:
nonpresent_callback: rid=035 present UUID
5d59b0fd-68f1-44b7-a2af-f4aec00c7857, dn
ou=mvm,dc=authorise,dc=ed,dc=ac,dc=uk
Jun  5 07:35:20 cauth04.authorise.is.ed.ac.uk slapd[18267]:
nonpresent_callback: rid=035 present UUID
d75a41f2-a279-4cf2-a8e6-033c6821b6b2, dn
ou=sce,dc=authorise,dc=ed,dc=ac,dc=uk

(At which point CPU usage on journald and rsyslog understandably go
through the roof).

There is a (possibly self-inflicted) point of pain where we have an
exattrs=memberOf in our syncrepl config to work around another
replication issue so this means that any user objects which are
REFRESH'ed end up losing all their group memberships.

I'm mitigating this at present by periodically dumping the data from the
"master" and re-generating the databases on the "consumers" but this is
more of a sticking plaster approach.

Are there any known bugs/ regressions with delta-syncrepl in 2.4.47?
One idea I've had is to go to a true single-master config by setting the
current consumers to be read-only and having a single olcSyncrepl clause
for the master on these 3 servers.

My config from one of the consumers is listed below although the master
config is identical bar the olcSyncrepl lines and olcSaslHost value.
I've redacted the schema and olcAccess entries and the root password:
If there's any obvious deficiencies in my config too that would be good
to know.

Kind regards,
Mark

dn: cn=config
objectClass: olcGlobal
cn: config
olcConfigFile: slapd.conf
olcConfigDir: slapd.d
olcArgsFile: /opt/openldap/var/run/slapd.args
olcAttributeOptions: lang-
olcAuthzPolicy: none
olcAuthzRegexp:
{0}"^uid=([^,/@]+)/ldapauth(@(((TEST\.)?EASE)|INF).ED.AC.UK)?,cn=(((TEST\.)?EASE)|INF).ED.AC.UK,cn=GSSAPI,cn=auth$"
"uid=$1,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk"
olcAuthzRegexp:
{1}"^uid=(host|ldap)/([^/]+).authorise.is.ed.ac.uk,cn=EASE.ED.AC.UK,cn=GSSAPI,cn=auth$"
"uid=replicator.authorise.is.ed.ac.uk,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk"
olcAuthzRegexp:
{2}"^uid=([^,/@]+)(@(((TEST\.)?EASE)|INF).ED.AC.UK)?,cn=(((TEST\.)?EASE)|INF).ED.AC.UK,cn=GSSAPI,cn=auth$"
"uid=$1,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk"
olcConcurrency: 0
olcConnMaxPending: 100
olcConnMaxPendingAuth: 1000
olcGentleHUP: FALSE
olcIdleTimeout: 0
olcIndexSubstrIfMaxLen: 4
olcIndexSubstrIfMinLen: 2
olcIndexSubstrAnyLen: 4
olcIndexSubstrAnyStep: 2
olcIndexIntLen: 4
olcLocalSSF: 71
olcLogFile: /opt/openldap/var/log/openldap-debug.log
olcLogLevel: stats
olcLogLevel: sync
olcPidFile: /opt/openldap/var/run/slapd.pid
olcReadOnly: FALSE
olcSaslHost: cauth04.authorise.is.ed.ac.uk
olcSaslRealm: EASE.ED.AC.UK
olcSaslSecProps: noplain,noanonymous
olcSecurity: ssf=1
olcSecurity: update_ssf=112
olcSecurity: simple_bind=64
olcServerID: 3 ldaps://cauth01.authorise.is.ed.ac.uk/
olcServerID: 4 ldaps://cauth02.authorise.is.ed.ac.uk/
olcServerID: 5 ldaps://cauth03.authorise.is.ed.ac.uk/
olcServerID: 6 ldaps://cauth04.authorise.is.ed.ac.uk/
olcSockbufMaxIncoming: 262143
olcSockbufMaxIncomingAuth: 16777215
olcThreads: 12
olcTLSCACertificateFile: /opt/openldap/certs/quovadis.bundle
olcTLSCertificateFile:
/opt/openldap/certs/authorise.is.ed.ac.uk-QUOVADIS.crt
olcTLSCertificateKeyFile:
/opt/openldap/certs/authorise.is.ed.ac.uk-QUOVADIS.key
olcTLSCipherSuite: HIGH:MEDIUM:+SSLv3:+SSLv2:+TLSv1:+AES256-SHA:!EXPORT
olcTLSCRLCheck: none
olcTLSVerifyClient: never
olcTLSProtocolMin: 3.1
olcToolThreads: 2
olcWriteTimeout: 900

dn: olcDatabase={-1}frontend,cn=config
objectClass: olcDatabaseConfig
objectClass: olcFrontendConfig
olcDatabase: {-1}frontend
olcAddContentAcl: FALSE
olcLastMod: TRUE
olcMaxDerefDepth: 2
olcReadOnly: FALSE
olcSchemaDN: cn=Subschema
olcSecurity: ssf=1
olcSecurity: update_ssf=112
olcSecurity: simple_bind=64
olcSyncUseSubentry: FALSE
olcMonitoring: FALSE
olcSortVals: member
olcSortVals: eduniServiceCode

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcAccess: {0}to *  by
dn.regex="^uid=replicator\.authorise\.is\.ed\.ac\.uk,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk$"
read by * none
olcAddContentAcl: TRUE
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcRootDN: cn=Manager,cn=config
olcRootPW:: *************************
olcSyncUseSubentry: FALSE
olcMonitoring: FALSE

dn: olcDatabase={1}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {1}mdb
olcDbDirectory: /opt/openldap/var/openldap-data/authorise
olcSuffix: dc=authorise,dc=ed,dc=ac,dc=uk
olcAccess: *************
olcAddContentAcl: FALSE
olcLastMod: TRUE
olcMaxDerefDepth: 1
olcReadOnly: FALSE
olcRootDN: cn=Manager,dc=authorise,dc=ed,dc=ac,dc=uk
olcRootPW:: *********************
olcSecurity: ssf=1
olcSecurity: update_ssf=112
olcSecurity: simple_bind=64
olcSizeLimit: unlimited
olcSyncUseSubentry: FALSE
olcTimeLimit: unlimited
olcMirrorMode: TRUE
olcMonitoring: TRUE
olcDbCheckpoint: 1024 10
olcDbEnvFlags: writemap
olcDbEnvFlags: nometasync
olcDbNoSync: FALSE
olcDbIndex: objectClass eq
olcDbIndex: entryUUID eq
olcDbIndex: entryCSN eq
olcDbIndex: cn pres,eq,sub
olcDbIndex: uid pres,eq,sub
olcDbIndex: uidNumber pres,eq
olcDbIndex: gidNumber pres,eq
olcDbIndex: eduniType eq
olcDbIndex: gecos pres,eq,sub
olcDbIndex: eduniCategory eq
olcDbIndex: mail pres,eq,sub
olcDbIndex: eduniSchoolCode eq
olcDbIndex: eduniIDStatus eq
olcDbIndex: eduniCollegeCode eq
olcDbIndex: eduniUnitCode eq
olcDbIndex: eduniOrgCode eq
olcDbIndex: eduniLibraryBarcode pres,eq
olcDbIndex: eduniServiceCode pres,eq
olcDbIndex: krbName pres,eq
olcDbIndex: eduPersonAffiliation pres,eq
olcDbIndex: eduPersonPrimaryAffiliation eq
olcDbIndex: eduPersonPrincipalName eq
olcDbIndex: eduPersonEntitlement pres,eq
olcDbIndex: eduPersonTargetedID eq
olcDbIndex: eduPersonScopedAffiliation pres,eq
olcDbIndex: userPassword eq
olcDbIndex: eduniIdmsId pres,eq
olcDbIndex: sn pres,eq,sub
olcDbIndex: memberUid eq
olcDbIndex: uniqueMember pres,eq
olcDbIndex: autoMountKey eq
olcDbIndex: eduniOrganisation pres,eq,sub
olcDbIndex: description pres,eq,sub
olcDbIndex: eduniTitle pres,eq
olcDbIndex: title pres,eq,sub
olcDbIndex: eduniCardNumber pres,eq
olcDbIndex: eduniRefNo pres,eq
olcDbIndex: givenName pres,eq,sub
olcDbIndex: member eq
olcDbIndex: eduniYearOfStudy pres,eq
olcDbIndex: yubiKeyId pres,eq
olcDbIndex: nisMapName eq
olcDbIndex: memberOf eq
olcDbIndex: proxyAddresses pres,eq,sub
olcDbMaxReaders: 96
olcDbMaxSize: 32212254720
olcDbMode: 0600
olcDbSearchStack: 16
olcSyncrepl: {0}rid=34 provider=ldaps://cauth01.authorise.is.ed.ac.uk/
realm=EASE.ED.AC.UK bindmethod=sasl saslmech=gssapi
authcid="replicator.authorise.is.ed.ac.uk@EASE.ED.AC.UK"
searchbase="dc=authorise,dc=ed,dc=ac,dc=uk" logbase="cn=accesslog"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
schemachecking=on type=refreshAndPersist retry="60 +" tls_reqcert=never
timeout=1 keepalive=300:12:5 syncdata=accesslog exattrs="memberOf"
olcSyncrepl: {1}rid=35 provider=ldaps://cauth02.authorise.is.ed.ac.uk/
realm=EASE.ED.AC.UK bindmethod=sasl saslmech=gssapi
authcid="replicator.authorise.is.ed.ac.uk@EASE.ED.AC.UK"
searchbase="dc=authorise,dc=ed,dc=ac,dc=uk" logbase="cn=accesslog"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
schemachecking=on type=refreshAndPersist retry="60 +" tls_reqcert=never
timeout=1 keepalive=300:12:5 syncdata=accesslog exattrs="memberOf"
olcSyncrepl: {2}rid=36 provider=ldaps://cauth03.authorise.is.ed.ac.uk/
realm=EASE.ED.AC.UK bindmethod=sasl saslmech=gssapi
authcid="replicator.authorise.is.ed.ac.uk@EASE.ED.AC.UK"
searchbase="dc=authorise,dc=ed,dc=ac,dc=uk" logbase="cn=accesslog"
logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
schemachecking=on type=refreshAndPersist retry="60 +" tls_reqcert=never
timeout=1 keepalive=300:12:5 syncdata=accesslog exattrs="memberOf"
entryCSN: 20190528133331.277387Z#000000#006#000000

dn: olcOverlay={0}dynlist,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcDynamicList
olcOverlay: {0}dynlist
olcDlAttrSet: {0}groupOfURLs memberURL

dn: olcOverlay={1}memberof,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcMemberOf
olcOverlay: {1}memberof
olcMemberOfDangling: ignore
olcMemberOfRefInt: TRUE
olcMemberOfGroupOC: groupOfNames
olcMemberOfMemberAD: member
olcMemberOfMemberOfAD: memberOf

dn: olcOverlay={2}syncprov,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
objectClass: olcSyncProvConfig
olcOverlay: {2}syncprov

dn: olcOverlay={3}accesslog,olcDatabase={1}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: {3}accesslog
olcAccessLogDB: cn=accesslog
olcAccessLogOps: writes
olcAccessLogPurge: 02+00:00 00+04:00
olcAccessLogSuccess: TRUE

dn: olcDatabase={2}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {2}mdb
olcDbDirectory: /opt/openldap/var/openldap-data/accesslog
olcSuffix: cn=accesslog
olcAccess: {0}to * by
dn.exact="uid=replicator.authorise.is.ed.ac.uk,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk"
write by dn="cn=Manager,dc=authorise,dc=ed,dc=ac,dc=uk" write
olcLimits:
{0}dn.exact="uid=replicator.authorise.is.ed.ac.uk,ou=people,ou=central,dc=authorise,dc=ed,dc=ac,dc=uk"
time.soft=unlimited time.hard=unlimited size.soft=unlimited
size.hard=unlimited
olcLimits: {1}dn.exact="cn=Manager,dc=authorise,dc=ed,dc=ac,dc=uk"
time.soft=unlimited time.hard=unlimited size.soft=unlimited
size.hard=unlimited
olcRootDN: cn=Manager,cn=accesslog
olcRootPW:: ********************************
olcDbIndex: default eq
olcDbIndex: entryCSN,objectClass,reqEnd,reqResult,reqStart,reqDN
olcDbMaxReaders: 96
olcDbMaxSize: 32212254720
olcDbMode: 0600
olcDbSearchStack: 16

dn: olcOverlay={0}syncprov,olcDatabase={2}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE

dn: olcDatabase={3}monitor,cn=config
objectClass: olcMonitorConfig
objectClass: olcDatabaseConfig
objectClass: olcConfig
objectClass: top
olcDatabase: {3}monitor
olcAccess: {0}to dn.subtree="cn=monitor" by
dn.exact="cn=Manager,cn=config" write by * none

-- 
/****************************

Mark Cairney
ITI Enterprise Services
Information Services
University of Edinburgh

Tel: 0131 650 6565
Email: Mark.Cairney@ed.ac.uk
PGP: 0x435A9621

*******************************/

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.