
Re: back-bdb deadlocks?

Jong-Hyuk wrote:

I tested this on two of my test Linux boxes (RH9 and 7.3) and the problem
seems to be solved now.
It'll be great if anybody confirms this on different Linux distros /
different OSes.
- Jong-Hyuk

There is still a problem remaining, though it happens very rarely. I've triggered it in test008 by adding to testdata/do_search.0
which causes a search that includes entries that are the target of a Delete. There doesn't appear to be any other way to trigger this problem.

A read operation can cause an undetected deadlock when loading an entry into the cache. This is because the lockerID used to lock the cache entry is not the same as the lockerID BDB generates when accessing the database, so BDB cannot tell that both locks belong to the same operation. If a writer has already locked the id2entry pages and is waiting on a write lock for the cache entry, then the reader, which holds the cache entry lock, will be stuck waiting for the id2entry pages and neither will make any progress.

This implies that we must use transactions for readers as well as writers, to ensure that all locks associated with a read operation get the same lockerID, so that deadlocks can be detected by BDB. Either that, or we need to tweak the lock ordering again so that a reader cannot hold any other locks while it is accessing the database.
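
To illustrate why the lockerID matters, here is a toy model in Python. It is not BDB's API or back-bdb code; the LockTable class, the lock names and the locker names are all made up for the example, and it assumes a detector that looks for cycles in a wait-for graph keyed by locker ID. With two separate locker IDs for the reader, the graph has no cycle even though both threads are stuck; with a single locker ID the cycle appears.

```python
from collections import defaultdict


class LockTable:
    """Toy lock table (hypothetical, not BDB): one holder per lock."""

    def __init__(self):
        self.holder = {}                    # lock name -> locker ID holding it
        self.waits_for = defaultdict(set)   # locker ID -> locker IDs it waits on

    def acquire(self, locker, lock):
        """Grant the lock if free, else record a wait-for edge. True if granted."""
        owner = self.holder.get(lock)
        if owner is None or owner == locker:
            self.holder[lock] = locker
            return True
        self.waits_for[locker].add(owner)
        return False

    def has_deadlock(self):
        """Return True if the wait-for graph contains a cycle."""
        visiting, done = set(), set()

        def visit(node):
            if node in done:
                return False
            if node in visiting:
                return True
            visiting.add(node)
            found = any(visit(n) for n in self.waits_for.get(node, ()))
            visiting.discard(node)
            done.add(node)
            return found

        return any(visit(n) for n in list(self.waits_for))


def run_scenario(reader_cache_locker, reader_db_locker):
    """Replay the reader/writer interleaving described above."""
    table = LockTable()
    table.acquire(reader_cache_locker, "cache_entry")   # reader locks the cache entry
    table.acquire("writer", "id2entry_page")            # writer locks the id2entry page
    table.acquire("writer", "cache_entry")              # writer blocks on the cache entry
    table.acquire(reader_db_locker, "id2entry_page")    # reader blocks on the id2entry page
    return table.has_deadlock()


# Reader uses two locker IDs: both sides are stuck, but no cycle is visible.
split = run_scenario("reader-cache", "reader-bdb")
# Reader uses one locker ID (as a transaction would give it): cycle is visible.
shared = run_scenario("reader-txn", "reader-txn")
print(split, shared)
```

In the first call the edges are writer -> reader-cache and reader-bdb -> writer, which never close into a loop. In the second call they are writer -> reader-txn and reader-txn -> writer, which is the cycle a detector needs to see before it can abort one of the two.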

Still thinking about how to fix this...
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support