
(ITS#6953) pcache deadlocks when refreshing cached queries



Full_Name: Ralf Haferkamp
Version: 2.4.25/HEAD
OS: 
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (89.166.172.234)
Submitted by: ralf


Under certain circumstances slapo-pcache causes a deadlock in slapd when
refreshing a cached query. The reason seems to be that slapo-pcache considers
the refresh search, issued from the consistency checker task, to be answerable
from the local cache database (this only seems to happen when the pcacheAttrset
contains the "objectclass" attribute). slapo-pcache then searches the cache db
with the response callback pointing to be_modify() (indirectly through
refresh_merge()) of the same database. When using bdb/hdb as the cache database,
this causes the task to request a write lock on the cached entry within one bdb
transaction while already holding a read lock (from the internal search) through
another TXN. slapd is deadlocked after that. See below for a gdb backtrace taken
from that situation.
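
To illustrate why a single task can block itself here, the following is a
minimal toy model in C, not the Berkeley DB API. It assumes only that lock
ownership is tracked per locker (transaction) rather than per thread. All
names (toy_lock, toy_try_read, toy_try_write, toy_unlock_read, MAX_LOCKERS)
are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock on one cached entry. Ownership is recorded per locker ID
 * (think "transaction"), not per thread. Hypothetical names throughout. */
#define MAX_LOCKERS 8

typedef struct {
    bool reader[MAX_LOCKERS];  /* reader[i]: locker i holds a read lock */
    int  writer;               /* locker holding the write lock, or -1  */
} toy_lock;

void toy_lock_init(toy_lock *l)
{
    for (int i = 0; i < MAX_LOCKERS; i++)
        l->reader[i] = false;
    l->writer = -1;
}

/* Grant a read lock unless a different locker holds the write lock. */
bool toy_try_read(toy_lock *l, int locker)
{
    assert(locker >= 0 && locker < MAX_LOCKERS);
    if (l->writer != -1 && l->writer != locker)
        return false;
    l->reader[locker] = true;
    return true;
}

/* Grant a write lock only if no *other* locker holds any lock.
 * A blocking variant would wait at the points where this returns false. */
bool toy_try_write(toy_lock *l, int locker)
{
    assert(locker >= 0 && locker < MAX_LOCKERS);
    if (l->writer != -1 && l->writer != locker)
        return false;
    for (int i = 0; i < MAX_LOCKERS; i++)
        if (i != locker && l->reader[i])
            return false;
    l->writer = locker;
    return true;
}

/* Release the read lock held by the given locker. */
void toy_unlock_read(toy_lock *l, int locker)
{
    assert(locker >= 0 && locker < MAX_LOCKERS);
    l->reader[locker] = false;
}
```

In this model, if locker 1 (standing in for the internal search) holds a read
lock, toy_try_write() for locker 2 (standing in for the modify transaction)
fails even when both calls come from the same thread. A blocking variant
would wait forever, because the read lock is only released once the search
returns, and the search cannot return until the callback completes.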

The fix seems to be fairly easy: just do not look up the cache db when
refreshing a cached query. Answering a refresh from the cache doesn't make
sense anyway. I'll commit something for that shortly.
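
The idea behind the fix can be sketched as a small decision function. This is
only an illustration under assumed names, not the actual change to pcache.c:
toy_search, toy_target and toy_pick_target are hypothetical, and the two flags
stand in for whatever state the overlay really keeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a search as seen by a caching layer. */
typedef struct {
    bool is_refresh;  /* issued internally to refresh a cached query */
    bool answerable;  /* the cache believes it can answer the query  */
} toy_search;

typedef enum { TOY_USE_CACHE, TOY_USE_REMOTE } toy_target;

/* Decide where a search is sent. Refresh searches always bypass the
 * cache, so the merge callback never runs underneath a cache-db search. */
toy_target toy_pick_target(const toy_search *s)
{
    assert(s != NULL);
    if (s->is_refresh)
        return TOY_USE_REMOTE;
    return s->answerable ? TOY_USE_CACHE : TOY_USE_REMOTE;
}
```

With this guard, a refresh search goes to the remote side even when the cache
considers the query answerable, while ordinary searches are still served from
the cache whenever possible.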

#0  pthread_cond_wait@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007f2b88eaf519 in __db_pthread_mutex_lock (env=0xa5ebb0, mutex=<value
optimized out>)
    at ../mutex/mut_pthread.c:318
#2  0x00007f2b88f4252e in __lock_get_internal (lt=0xa5efc0, sh_locker=<value
optimized out>, 
    flags=0, obj=0x14060, lock_mode=<value optimized out>, timeout=2169475480,
lock=0x7f2b814f93a8)
    at ../lock/lock.c:953
#3  0x00007f2b88f4350c in __lock_vec (env=0xa5ebb0, sh_locker=0x7f2b858acf38,
flags=0, 
    list=0x7f2b814f9360, nlist=2, elistp=0x0) at ../lock/lock.c:136
#4  0x00007f2b88f43c48 in __lock_vec_api (dbenv=<value optimized out>,
lid=2147483672, flags=0, 
    list=0x7f2b814f9360, nlist=2, elistp=0x0) at ../lock/lock.c:84
#5  __lock_vec_pp (dbenv=<value optimized out>, lid=2147483672, flags=0,
list=0x7f2b814f9360, 
    nlist=2, elistp=0x0) at ../lock/lock.c:66
#6  0x000000000052a5d3 in bdb_cache_entry_db_relock (bdb=0x9c7a20, txn=0xa66670,
ei=0xd71720, rw=1, 
    tryOnly=0, lock=0x7f2b814f94d0) at
../../../../servers/slapd/back-bdb/cache.c:198
#7  0x000000000052c1e7 in bdb_cache_modify (bdb=0x9c7a20, e=0x7f2b85445068,
newAttrs=0x7f2b8267e258, 
    txn=0xa66670, lock=0x7f2b814f94d0) at
../../../../servers/slapd/back-bdb/cache.c:1231
#8  0x00000000004f3f2d in bdb_modify (op=0x7f2b8167a430, rs=0x7f2b814f9820)
    at ../../../../servers/slapd/back-bdb/modify.c:711
#9  0x000000000059e458 in refresh_merge (op=0x7f2b8167a430, rs=0x7f2b8167a370)
    at ../../../../servers/slapd/overlays/pcache.c:3275
#10 0x000000000045c207 in slap_response_play (op=0x7f2b8167a430,
rs=0x7f2b8167a370)
    at ../../../servers/slapd/result.c:505
#11 0x000000000045dd28 in slap_send_search_entry (op=0x7f2b8167a430,
rs=0x7f2b8167a370)
    at ../../../servers/slapd/result.c:997
#12 0x00000000004fbfd0 in bdb_search (op=0x7f2b8167a430, rs=0x7f2b8167a370)
    at ../../../../servers/slapd/back-bdb/search.c:962
#13 0x000000000059d86e in pcache_op_search (op=0x7f2b8167a430,
rs=0x7f2b8167a370)
    at ../../../../servers/slapd/overlays/pcache.c:3037
#14 0x00000000004d9ba5 in overlay_op_walk (op=0x7f2b8167a430, rs=0x7f2b8167a370,
which=op_search, 
    oi=0x9c7430, on=0x9c7610) at ../../../servers/slapd/backover.c:661
#15 0x00000000004d9e52 in over_op_func (op=0x7f2b8167a430, rs=0x7f2b8167a370,
which=op_search)
    at ../../../servers/slapd/backover.c:723
#16 0x00000000004d9f3a in over_op_search (op=0x7f2b8167a430, rs=0x7f2b8167a370)
    at ../../../servers/slapd/backover.c:750
#17 0x000000000059e9bb in refresh_query (op=0x7f2b8167a430, query=0xa66580,
on=0x9c7610)
    at ../../../../servers/slapd/overlays/pcache.c:3371
#18 0x000000000059f03d in consistency_check (ctx=0x7f2b8167ab60, arg=0xa61910)
    at ../../../../servers/slapd/overlays/pcache.c:3492