
Re: (ITS#7378) Slapd hangs on bdb write lock



Nikolai Schupbach wrote:
> Hi Howard,
> 
> Thank you very much for the explanation. What BDB version would you
> recommend? Obviously I have quite a few options and would like to use a
> version that is known to be very solid.

I believe 4.7.25 + all 4 of its official patches was pretty stable.
http://www.oracle.com/technetwork/products/berkeleydb/patch-088170.html

I've done limited testing with 4.8.30, 5.1.19, and 5.3.21. At this point I'm
no longer tracking BDB revisions since MDB has superior performance while
using 1/4 as much RAM and requiring no tuning.

> Sincerely,
> Nikolai Schupbach
> 
> On 3/09/2012, at 9:45 PM, Howard Chu wrote:
> 
>> nikolai@net24.co.nz wrote:
>>> Full_Name: Nikolai Schupbach
>>> Version: 2.4.31
>>> OS: FreeBSD
>>> URL: ftp://ftp.openldap.org/incoming/
>>> Submission from: (NULL) (202.78.158.60)
>>>
>>>
>>> We are experiencing frequent hangs in slapd. Once hung we can continue to
>>> connect, but all searches will just hang indefinitely until we kill -9 the slapd
>>> process and restart it. The directory is used for mail routing and we have been
>>> migrating to it from an existing directory server over the last 3 weeks - we
>>> have noted the busier the directory becomes the more often it hangs (now once
>>> every 2 days).
>>>
>>> We have one master and 10 syncrepl read only replicas - the master is used
>>> mainly for writes and has not hung yet, but most of the replicas have hung at
>>> least once. The replicas receive anywhere between 50 to 300 searches/sec, while
>>> the master would only get 1/sec. There are 45k entries in the directory.
>>>
>>> We are running:
>>>
>>> FreeBSD 8.3/9.0 x64
>>> OpenLDAP 2.4.31
>>> Berkeley DB 4.6.21
>>>
>>> The old directory we are migrating from has the same load and is also running
>>> OpenLDAP, but has been rock solid for 5 years. It is running Berkeley DB 4.3.29
>>> and OpenLDAP 2.3.27.
>>>
>>> We have managed to collect db_stat lock information, which indicates the same
>>> issue each time - a write lock on dn2id.bdb.
>>
>> It's more than that. Your db_stat shows that a single thread has 3 active
>> transactions. This should never happen:
>>
>> 8000a85e dd= 0 locks held 2    write locks 0    pid/thread 88000/34386526336
>> 8000a85e READ          1 HELD    0xb19a8 len:   9 data: 40xa800000000000000
>> 8000a85e READ          1 HELD    0xb26c8 len:   9 data: 60xa800000000000000
>> 8000a85f dd= 0 locks held 8    write locks 4    pid/thread 88000/34386526336
>> 8000a85f READ          1 WAIT    dn2id.bdb                 page        559
>> 8000a85f READ          1 HELD    dn2id.bdb                 page        768
>> 8000a85f WRITE         2 HELD    dn2id.bdb                 page       1362
>> 8000a85f READ          2 HELD    dn2id.bdb                 page       1362
>> 8000a85f WRITE         2 HELD    dn2id.bdb                 page       1353
>> 8000a85f READ          2 HELD    dn2id.bdb                 page       1353
>> 8000a85f WRITE         2 HELD    dn2id.bdb                 page        933
>> 8000a85f READ          1 HELD    dn2id.bdb                 page        933
>> 8000a85f WRITE         4 HELD    dn2id.bdb                 page        219
>> 80001047 dd=28 locks held 1    write locks 1    pid/thread 88000/34386526336
>> 80001047 WRITE         1 HELD    dn2id.bdb                 page        559
>>
>> I would first recommend changing from BDB 4.6.21 to some other version. There
>> are no code paths in back-bdb where we would ever return without either
>> committing or aborting the current transactions, so this appears to be a BDB
>> bug, not an OpenLDAP bug.
>>
>>> We have also collected the backtrace for all the threads which I have uploaded
>>> to:
>>>
>>> ftp://ftp.openldap.org/incoming/nikolai-gdb-120902.txt
>>>
>>> The full db_stat output is located at:
>>>
>>> ftp://ftp.openldap.org/incoming/nikolai-dbstat-120902.txt
>>
>> -- 
>>  -- Howard Chu
>>  CTO, Symas Corp.           http://www.symas.com
>>  Director, Highland Sun     http://highlandsun.com/hyc/
>>  Chief Architect, OpenLDAP  http://www.openldap.org/project/
> 
> 
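
The check described in the quoted analysis (several lockers owned by one
pid/thread) can be automated against saved db_stat lock output. The following
is a minimal sketch, not part of OpenLDAP or Berkeley DB. It assumes the line
layout seen in the excerpt quoted above, which may differ between BDB
versions; the names parse, threads_with_multiple_lockers, and same_thread_waits
are made up for illustration. Treating a wait on a page that is write-locked
by another locker of the same thread as a sign of a self-block is likewise an
interpretation of the excerpt, not something db_stat reports directly.

```python
import re
from collections import defaultdict

# Locker header line, layout assumed from the excerpt in this thread.
LOCKER_RE = re.compile(
    r"^(?P<id>[0-9a-f]+)\s+dd=\s*\d+\s+locks held\s+\d+\s+"
    r"write locks\s+\d+\s+pid/thread\s+(?P<thread>\d+/\d+)"
)
# Page lock line; non-page locks (the "len: ... data: ..." lines) are skipped.
LOCK_RE = re.compile(
    r"^(?P<id>[0-9a-f]+)\s+(?P<mode>READ|WRITE)\s+\d+\s+"
    r"(?P<status>HELD|WAIT)\s+(?P<file>\S+)\s+page\s+(?P<page>\d+)"
)

SAMPLE = """\
8000a85e dd= 0 locks held 2    write locks 0    pid/thread 88000/34386526336
8000a85f dd= 0 locks held 8    write locks 4    pid/thread 88000/34386526336
8000a85f READ          1 WAIT    dn2id.bdb                 page        559
8000a85f WRITE         2 HELD    dn2id.bdb                 page       1362
80001047 dd=28 locks held 1    write locks 1    pid/thread 88000/34386526336
80001047 WRITE         1 HELD    dn2id.bdb                 page        559
"""


def parse(text):
    """Return ({locker_id: pid/thread}, [(locker, mode, status, file, page)])."""
    thread_of, locks = {}, []
    for raw in text.splitlines():
        line = raw.lstrip("> \t")  # tolerate mail quoting in pasted output
        m = LOCKER_RE.match(line)
        if m:
            thread_of[m["id"]] = m["thread"]
            continue
        m = LOCK_RE.match(line)
        if m:
            locks.append(
                (m["id"], m["mode"], m["status"], m["file"], int(m["page"]))
            )
    return thread_of, locks


def threads_with_multiple_lockers(thread_of):
    """Map each pid/thread that owns more than one locker to its locker ids."""
    by_thread = defaultdict(list)
    for locker, thread in thread_of.items():
        by_thread[thread].append(locker)
    return {t: ids for t, ids in by_thread.items() if len(ids) > 1}


def same_thread_waits(thread_of, locks):
    """List (waiter, holder, file, page) where both lockers share a thread."""
    write_holders = defaultdict(set)
    for locker, mode, status, file, page in locks:
        if status == "HELD" and mode == "WRITE":
            write_holders[(file, page)].add(locker)
    found = []
    for locker, mode, status, file, page in locks:
        if status != "WAIT":
            continue
        for holder in sorted(write_holders.get((file, page), ())):
            if holder != locker and thread_of.get(holder) == thread_of.get(locker):
                found.append((locker, holder, file, page))
    return found


if __name__ == "__main__":
    thread_of, locks = parse(SAMPLE)
    print(threads_with_multiple_lockers(thread_of))
    print(same_thread_waits(thread_of, locks))
```

Run on the abbreviated sample, the first function lists all three lockers
under the single pid/thread, and the second pairs the waiting READ on
dn2id.bdb page 559 with the locker holding the WRITE lock on that page. To
check a full capture, read the saved db_stat text from a file and pass it to
parse in place of SAMPLE.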


-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/