
Locker killed to resolve a deadlock



openldap-2.4.36
db-4.8.30.NC
Solaris-10u11


Hello list,

We've been running these versions for years and have been pretty
pleased with them.

But then "they" went and added an automated test script that adds some
records every 10 minutes, confirms they are replicated all the way to the
frontends, then deletes the records.

This has started to cause trouble in the last month or so. The errors are:

Sep 22 06:08:26 vmx02 slapd[14039]: [ID 960831 local4.debug] =>
bdb_idl_delete_key: c_get failed: DB_LOCK_DEADLOCK: Locker killed to
resolve a deadlock (-30994)

Sep 22 06:13:37 vmx02 last message repeated 36 times

Sep 22 06:13:44 vmx02 slapd[14039]: [ID 960831 local4.debug] =>
bdb_idl_delete_key: c_get failed: DB_LOCK_DEADLOCK: Locker killed to
resolve a deadlock (-30994)

Sep 22 06:20:17 vmx02 last message repeated 39 times

The errors keep repeating until we restart slapd. The setup is:

prov (1 writer) -> ldapmaster -> ldapslave(s) -> frontend(s)
                              -> ldapslave(s) -> frontend(s)
                                              -> frontend(s)

The problem happens most often on the frontend servers, but occasionally on
the ldapslaves, and twice now on ldapmaster. So we don't think it is
syncrepl-related.
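
To compare how often each host is hit, the syslog files can be tallied with
something like the sketch below. It assumes each syslog message sits on a
single line in the real log file (the excerpt above is wrapped by the mail
client) and that a "last message repeated N times" line refers to the
previous message logged for the same host, which is only an approximation.
The function name count_deadlocks is made up for this example.

```python
import re
from collections import Counter

DEADLOCK = "DB_LOCK_DEADLOCK"
# syslog collapses duplicates into "last message repeated N times".
REPEAT = re.compile(r"last message repeated (\d+) times?")
# "Sep 22 06:08:26 vmx02 ..." -> capture the host name after the timestamp.
HEADER = re.compile(r"^[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d\s+(\S+)\s")


def count_deadlocks(lines):
    """Return a Counter mapping host -> number of deadlock messages."""
    counts = Counter()
    last_was_deadlock = {}  # host -> was its previous message a deadlock?
    for line in lines:
        header = HEADER.match(line)
        if not header:
            continue  # not a syslog line we understand
        host = header.group(1)
        repeat = REPEAT.search(line)
        if repeat:
            # Only count repeats that follow a deadlock message.
            if last_was_deadlock.get(host, False):
                counts[host] += int(repeat.group(1))
        elif DEADLOCK in line:
            counts[host] += 1
            last_was_deadlock[host] = True
        else:
            last_was_deadlock[host] = False
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_deadlocks.py < /var/adm/messages
    for host, total in count_deadlocks(sys.stdin).most_common():
        print(host, total)
```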

One answer would be to stop the 10min "ldap sync test" script of course,
but it would be nice to solve the actual problem.


slapd.conf on the slaves/frontends looks like:


include         /usr/local/etc/openldap/schema/core.schema
include         /usr/local/etc/openldap/schema/cosine.schema
include         /usr/local/etc/openldap/schema/nis.schema
include         /usr/local/etc/openldap/company-schema/qmail.schema
include         /usr/local/etc/openldap/company-schema/local.schema
include         /usr/local/etc/openldap/company-schema/RADIUS-LDAPv3.schema
include         /usr/local/etc/openldap/company-schema/AmavisdLDAP.schema
include         /usr/local/etc/openldap/company-schema/analyse.schema
include         /usr/local/etc/openldap/company-schema/formail.schema
include         /usr/local/etc/openldap/company-schema/filter.schema
pidfile         /usr/local/var/run/slapd.pid
argsfile        /usr/local/var/run/slapd.args
loglevel        sync
database monitor
access to dn.subtree="cn=Monitor"
       by dn="cn=admin,dc=company,dc=jp" read
       by * none
database    bdb
suffix      "dc=company,dc=jp"
rootdn      "cn=admin,dc=company,dc=jp"
lastmod         on
checkpoint 128 15
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 100
cachesize 10000
idlcachesize 15000
directory       /usr/local/var/openldap-data
index   objectClass     eq
index   uid                     eq
index   uidNumber               eq
index   mail                    eq
index   mailAlternateAddress    pres,eq
index   deliveryMode            eq
index   accountStatus           eq
index   gecos                   eq
index   radiusGroupName         eq
index   o                       pres,eq
index   entryCSN,entryUUID      eq

dbconfig set_lk_detect DB_LOCK_DEFAULT
dbconfig set_lg_max 52428800
dbconfig set_cachesize 0 67108864 1
dbconfig set_flags db_log_autoremove
dbconfig set_lk_max_objects 1500
dbconfig set_lk_max_locks 1500
dbconfig set_lk_max_lockers 1500

syncrepl   rid=345
                provider=ldap://ldapslave01
                type=refreshAndPersist
                interval=00:00:00:30
                searchbase="dc=company,dc=jp"
                filter="(objectClass=*)"
                attrs="*,+"
                scope=sub
                schemachecking=off
                bindmethod=simple
                binddn="uid=replicate,o=companyserver.jp,ou=admin,dc=company,dc=jp"
                credentials="<secret>"
                retry="60 10 300 +"

updateref       ldap://ldapmaster01


-- 
Jorgen Lundman       | <lundman@lundman.net>
Unix Administrator   | +81 (0)90-5578-8500
Shibuya-ku, Tokyo    | Japan