
Re: (ITS#6090) slapd locks up; all slapd worker threads blocking on mutex acquisition in bdb_cache_lru_link()



On Fri, May 01, 2009 at 03:00:26PM -0700, Howard Chu wrote:
> jwm@horde.net wrote:
>> Full_Name: John Morrissey
>> Version: 2.4.16
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (2001:4978:194:0:21f:5bff:fee9:da92)
>>
>> After a couple days of uptime, slapd no longer responds to incoming
>> connections (the connection would be accepted, but all LDAP operations
>> would block). All worker threads seem to be blocking on mutex acquisition
>> in bdb_cache_lru_link(). One thread was chewing lots of CPU.
>>
>> Backtrace is below. I also have a ~1.7GB core if it's deemed useful; I'll
>> keep it around for a week or two. This is with BDB 4.7.25+all three
>> patches.
>
> Interesting trace, it looks like all the active threads are waiting for 
> the mutex but apparently none of them owns it. Can you please provide the 
> contents of the mutex? e.g.
> 	thread 14
> 	frame 3
> 	print *mutex

(gdb) fra 3
#3  0xb7eec1cd in ldap_pvt_thread_mutex_lock (mutex=0x940a2cc)
    at /tmp/buildd/openldap-2.4.16/libraries/libldap_r/thr_posix.c:296
296             return ERRVAL( pthread_mutex_lock( mutex ) );
(gdb) print *mutex
$1 = {__data = {__lock = 2, __count = 0, __owner = 6372, __kind = 0, 
    __nusers = 1, {__spins = 0, __list = {__next = 0x0}}}, 
  __size = "\002\000\000\000\000\000\000\000\344\030\000\000\000\000\000\000\001\000\000\000\000\000\000", __align = 2}

The mutex does have an owner: __owner is 6372, and LWP 6372 is the thread
trying to do BDB lock promotion.
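
For anyone checking the dump by hand, here is a minimal sketch of how the
named fields map onto the raw __size bytes. It assumes the 32-bit x86 glibc
layout that the gdb output suggests: five 4-byte little-endian integers in
the order printed, followed by the 4-byte __spins/__list union. The
decode_mutex() helper and the rebuilt byte string are illustrative only;
nothing here is read from the actual core.

```python
import struct

# Field order as printed by gdb above. Assumed: each field is a 4-byte
# little-endian integer (32-bit x86 glibc), with a 4-byte union after them.
FIELDS = ("__lock", "__count", "__owner", "__kind", "__nusers")


def decode_mutex(raw: bytes) -> dict:
    """Decode the leading five integer fields of a raw mutex image."""
    values = struct.unpack_from("<5i", raw)
    return dict(zip(FIELDS, values))


# Bytes rebuilt from the printed field values, not copied from the core.
raw = struct.pack("<5i", 2, 0, 6372, 0, 1) + b"\x00" * 4

mutex = decode_mutex(raw)
print(mutex["__owner"])   # thread ID (LWP) recorded as the holder
print(raw[8:12].hex())    # the four bytes that encode __owner
```

Under that layout, bytes 8 through 11 hold __owner, and 6372 is 0x18e4, which
is stored as the bytes 0xe4 0x18 0x00 0x00.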

john
-- 
John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__