[Date Prev][Date Next] [Chronological] [Thread] [Top]

Ver2.4.23 - Delta syncrepl stops responding



I have a problem that has just started happening in the last few
weeks.  Replication of modifications and/or adds from one LDAP node to
the other will work fine ( repl time <1s) for a while, then suddenly
stop working all together.  The only way to get replication working
again is to restart each node 2 or 3 times (sometimes it comes back
after 2 restarts, sometimes it takes 3).  On the last restart, I
usually find that one node is stuck in an infinite loop trying to
process a transaction from the other node's accesslog - the only way
to get around this is to delete the the accesslog entry (see below for
accesslog entry and slapd output).   Any ideas?

OS: OpenSuse 11.1 64-bit w/ 2GB RAM
Version: OpenLDAP 2.4.23
Network Setup: 2 read/write nodes, with delta syncrepl between the nodes

SERVER 1 Configs:
dn: cn=config
objectClass: olcGlobal
cn: config
olcArgsFile: /var/run/slapd/slapd.args
olcAuthzRegexp:
{0}gidNumber=0\+uidNumber=0,cn=peercred,cn=external,cn=auth
dn:cn=config
olcLogLevel: stats sync
olcPidFile: /var/run/slapd/slapd.pid
olcServerID: 1 ldap://LDAP1:389
olcServerID: 2 ldap://LDAP2:389
olcSizeLimit: 10000

dn: olcDatabase={-1}frontend,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {-1}frontend
olcAccess: {0}to dn.base="" by * read
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to * by dn.base="cn=admin,o=o" manage

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcAccess: {0}to * by dn.base="cn=admin,o=o" manage
olcAccess: {1}to * by * read
olcRootDN: cn=config

dn: olcDatabase={1}hdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {1}hdb
olcDbDirectory: /var/lib/ldap
olcAccess: {0}to
attrs=pwdChangedTime,pwdAccountLockedTime,pwdFailureTime,pwdHistory,pwdGraceUseTime,pwdReset
by * none
olcAccess: {1}to attrs=userPassword by self write by * auth by
dn.base="cn=admin,o=o" manage
olcAccess: {2}to attrs=shadowLastChange by self write by * read
olcAccess: {3}to attrs=userPKCS12 by self read by * none
olcAccess: {4}to * by * read by dn.base="cn=admin,o=o" manage
olcDbCheckpoint: 1024 5
olcDbConfig: {0}set_cachesize 0 151200000 1
olcDbConfig: {1}set_lg_regionmax 262144
olcDbConfig: {2}set_lg_bsize 2097152
olcDbConfig: {3}set_flags DB_LOG_AUTOREMOVE
olcDbIDLcacheSize: 151200000
olcDbIndex: objectclass eq
olcDbIndex: uidNumber eq
olcDbIndex: gidNumber eq
olcDbIndex: member eq
olcDbIndex: memberUid eq
olcDbIndex: mail eq
olcDbIndex: cn eq,sub
olcDbIndex: displayName eq,sub
olcDbIndex: uid eq,sub
olcDbIndex: sn eq,sub
olcDbIndex: givenName eq,sub
olcDbIndex: entryCSN eq
olcDbIndex: entryUUID eq
olcLimits: {0}dn.exact="cn=admin,o=o" time.soft=unlimited
time.hard=unlimited size.soft=unlimited size.hard=unlimited
olcMirrorMode: TRUE
olcRootDN: cn=admin,o=o
olcRootPW: {SSHA}PWD
olcSuffix: o=o
olcSyncrepl: {0}rid=2
   provider=ldap://LDAP2:389
   binddn="cn=admin,o=o"
   bindmethod=simple
   credentials=PWD
   searchbase="ou=USERS,o=o"
   logbase="cn=accesslog"
   logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
   schemachecking=off
   type=refreshAndPersist
   retry="60 +"
   syncdata=accesslog

dn: olcOverlay={0}ppolicy,olcDatabase={1}hdb,cn=config
objectClass: olcPPolicyConfig
olcOverlay: {0}ppolicy
olcPPolicyDefault: cn=Default Policy,o=o
olcPPolicyHashCleartext: TRUE
olcPPolicyUseLockout: TRUE

dn: olcOverlay={1}syncprov,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {1}syncprov
olcSpCheckpoint: 100 1
olcSpSessionlog: 100

dn: olcOverlay={2}accesslog,olcDatabase={1}hdb,cn=config
objectClass: olcAccessLogConfig
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
olcAccessLogDB: cn=accesslog
olcOverlay: {2}accesslog
olcAccessLogOps: writes
olcAccessLogPurge: 03+00:00 01+00:00
olcAccessLogSuccess: TRUE

dn: olcDatabase={2}accesslog,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {2}hdb
olcDbDirectory: /var/lib/ldap_accesslog
olcAccess:: ezB9dG8gKiBieSBkbi5iYXNlPSJjbj1hZG1pbixvPWNvIiBtYW5hZ2Ug
olcDbCheckpoint: 1024 5
olcDbConfig: {0}set_cachesize 0 9948000 1
olcDbConfig: {1}set_lg_regionmax 262144
olcDbConfig: {2}set_lg_bsize 2097152
olcDbConfig: {3}set_flags DB_LOG_AUTOREMOVE
olcDbIDLcacheSize: 9948000
olcDbIndex: entryCSN eq
olcDbIndex: objectClass eq
olcDbIndex: reqEnd eq
olcDbIndex: reqResult eq
olcDbIndex: reqStart eq
olcDbIndex: default eq
olcLimits: {0}dn.exact="cn=admin,o=o" time.soft=unlimited
time.hard=unlimited size.soft=unlimited size.hard=unlimited
olcSuffix: cn=accesslog

dn: olcOverlay={0}syncprov,olcDatabase={2}accesslog,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpCheckpoint: 100 1
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE
olcSpSessionlog: 100

dn: olcDatabase={3}monitor,cn=config
objectClass: olcMonitorConfig
olcDatabase: {3}monitor
olcAccess: {0}to * by dn.base="cn=admin,o=o" manage
olcAccess: {1}to * by * none
olcMonitoring: true
--------------------------
SERVER 2 Configs:
version: 1

dn: cn=config
objectClass: olcGlobal
cn: config
olcArgsFile: /var/run/slapd/slapd.args
olcAuthzRegexp:
{0}gidNumber=0\+uidNumber=0,cn=peercred,cn=external,cn=auth
dn:cn=config
olcLogLevel: stats sync
olcPidFile: /var/run/slapd/slapd.pid
olcServerID: 1 ldap://LDAP1:389
olcServerID: 2 ldap://LDAP2:389
olcSizeLimit: 10000

dn: olcDatabase={-1}frontend,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {-1}frontend
olcAccess: {0}to dn.base="" by * read
olcAccess: {1}to dn.base="cn=Subschema" by * read
olcAccess: {2}to * by dn.base="cn=admin,o=o" manage

dn: olcDatabase={0}config,cn=config
objectClass: olcDatabaseConfig
olcDatabase: {0}config
olcAccess: {0}to * by dn.base="cn=admin,o=o" manage
olcAccess: {1}to * by * read
olcRootDN: cn=config

dn: olcDatabase={1}hdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {1}hdb
olcDbDirectory: /var/lib/ldap
olcAccess: {0}to
attrs=pwdChangedTime,pwdAccountLockedTime,pwdFailureTime,pwdHistory,pwdGraceUseTime,pwdReset
by * none
olcAccess: {1}to attrs=userPassword by self write by * auth by
dn.base="cn=admin,o=o" manage
olcAccess: {2}to attrs=shadowLastChange by self write by * read
olcAccess: {3}to attrs=userPKCS12 by self read by * none
olcAccess: {4}to * by * read by dn.base="cn=admin,o=o" manage
olcDbCheckpoint: 1024 5
olcDbConfig: {0}set_cachesize 0 151200000 1
olcDbConfig: {1}set_lg_regionmax 262144
olcDbConfig: {2}set_lg_bsize 2097152
olcDbConfig: {3}set_flags DB_LOG_AUTOREMOVE
olcDbIDLcacheSize: 151200000
olcDbIndex: objectclass eq
olcDbIndex: uidNumber eq
olcDbIndex: gidNumber eq
olcDbIndex: member eq
olcDbIndex: memberUid eq
olcDbIndex: mail eq
olcDbIndex: cn eq,sub
olcDbIndex: displayName eq,sub
olcDbIndex: uid eq,sub
olcDbIndex: sn eq,sub
olcDbIndex: givenName eq,sub
olcDbIndex: entryCSN eq
olcDbIndex: entryUUID eq
olcLimits: {0}dn.exact="cn=admin,o=o" time.soft=unlimited
time.hard=unlimited size.soft=unlimited size.hard=unlimited
olcMirrorMode: TRUE
olcRootDN: cn=admin,o=o
olcRootPW: {SSHA}PWD
olcSuffix: o=o
olcSyncrepl: {0}rid=1
   provider=ldap://LDAP1:389
   binddn="cn=admin,o=o"
   bindmethod=simple
   credentials=PWD
   searchbase="ou=USERS,o=o"
   logbase="cn=accesslog"
   logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
   schemachecking=off
   type=refreshAndPersist
   retry="60 +"
   syncdata=accesslog

dn: olcOverlay={0}ppolicy,olcDatabase={1}hdb,cn=config
objectClass: olcPPolicyConfig
olcOverlay: {0}ppolicy
olcPPolicyDefault: cn=Default Policy,o=o
olcPPolicyHashCleartext: TRUE
olcPPolicyUseLockout: TRUE

dn: olcOverlay={1}syncprov,olcDatabase={1}hdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {1}syncprov
olcSpCheckpoint: 100 1
olcSpSessionlog: 100

dn: olcOverlay={2}accesslog,olcDatabase={1}hdb,cn=config
objectClass: olcAccessLogConfig
objectClass: olcOverlayConfig
objectClass: olcConfig
objectClass: top
olcAccessLogDB: cn=accesslog
olcOverlay: {2}accesslog
olcAccessLogOps: writes
olcAccessLogPurge: 03+00:00 01+00:00
olcAccessLogSuccess: TRUE

dn: olcDatabase={2}accesslog,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {2}hdb
olcDbDirectory: /var/lib/ldap_accesslog
olcAccess:: ezB9dG8gKiBieSBkbi5iYXNlPSJjbj1hZG1pbixvPWNvIiBtYW5hZ2Ug
olcDbCheckpoint: 1024 5
olcDbConfig: {0}set_cachesize 0 9948000 1
olcDbConfig: {1}set_lg_regionmax 262144
olcDbConfig: {2}set_lg_bsize 2097152
olcDbConfig: {3}set_flags DB_LOG_AUTOREMOVE
olcDbIDLcacheSize: 9948000
olcDbIndex: entryCSN eq
olcDbIndex: objectClass eq
olcDbIndex: reqEnd eq
olcDbIndex: reqResult eq
olcDbIndex: reqStart eq
olcDbIndex: default eq
olcLimits: {0}dn.exact="cn=admin,o=o" time.soft=unlimited
time.hard=unlimited size.soft=unlimited size.hard=unlimited
olcSuffix: cn=accesslog

dn: olcOverlay={0}syncprov,olcDatabase={2}accesslog,cn=config
objectClass: olcOverlayConfig
objectClass: olcSyncProvConfig
olcOverlay: {0}syncprov
olcSpCheckpoint: 100 1
olcSpNoPresent: TRUE
olcSpReloadHint: TRUE
olcSpSessionlog: 100

dn: olcDatabase={3}monitor,cn=config
objectClass: olcMonitorConfig
olcDatabase: {3}monitor
olcAccess: {0}to * by dn.base="cn=admin,o=o" manage
olcAccess: {1}to * by * none
olcMonitoring: true
--------------------------
Bad accesslog entry & slapd output
version: 1
####From Transaction recorded in accesslog on LDAP1
dn: reqStart=20111212033402.000002Z,cn=accesslog
objectClass: auditModify
reqAuthzID: cn=admin,o=co
reqDN: cn=bob,ou=users,o=co
reqEnd: 20111212033402.000003Z
reqMod: userPassword:= {SSHA}PWD
reqMod: pwdChangedTime:= 20111212033402Z
reqMod: entryCSN:= 2011 12 12 03 34 02.374746Z#000000#001#000000
reqMod: modifiersName:= cn=admin,o=o
reqMod: modifyTimestamp:= 20111212033402Z
reqResult: 0
reqSession: 3296
reqStart: 20111212033402.000002Z
reqType: modify

#####Log Of LDAP2 after 2nd restart (No Related Log info on LDAP1
during that timeframe)
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b639100 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_queue_csn: queing
0x7f093b746ea5 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: slap_graduate_commit_csn: removing
0x7f093b568240 20111212033402.374746Z#000000#001#000000
Dec 12 07:54:43 LDAP2 slapd[2732]: do_syncrep2: rid=001
cookie=rid=001,sid=001,csn=20111212033402.374746Z#000000#001#000000