Re: back-bdb deadlocks?
I tested this on two of my test Linux boxes (RH9 and 7.3) and the problem
seems to be solved now.
It'll be great if anybody confirms this on different Linux distros /
There is still a problem remaining, though it happens very rarely. I've
triggered it in test008 by adding to testdata/do_search.0,
which causes a search that includes entries that are the target of a
Delete. There doesn't appear to be any other way to trigger it.
A read operation can cause an undetected deadlock when loading an entry
into the cache. This is because the lockerID used to lock the cache
entry is not the same as the lockerID BDB generates when accessing the
database. If a writer has already locked the id2entry pages and is
waiting on a write lock for the cache entry, then the reader will be
stuck waiting for the id2entry pages and neither will make any progress.
This implies that we must use transactions for readers as well as
writers, to ensure that all locks associated with a read operation get
the same lockerID, so that deadlocks can be detected by BDB. Either that
or we need to tweak the lock ordering again so that a reader cannot hold
any other locks while it is accessing the database.
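To make the lockerID point concrete, here is a rough toy model of a wait-for graph deadlock detector. It is only a sketch: it is not back-bdb code and does not call the BDB API, and the function name, resource names and locker names are all made up for illustration. It assumes the detector can only see edges between locker IDs, so a reader that uses one locker for the cache entry and a different one for the database looks like two unrelated parties.

```python
def find_deadlock(holds, waits):
    """Return True if the wait-for graph between locker IDs has a cycle.

    holds: dict mapping resource name -> locker ID currently holding it
    waits: list of (locker ID, resource name) pairs for blocked requests
    (Hypothetical model; names are illustrative, not BDB identifiers.)
    """
    # Edge a -> b means locker a is blocked on a resource held by locker b.
    graph = {}
    for locker, resource in waits:
        holder = holds.get(resource)
        if holder is not None and holder != locker:
            graph.setdefault(locker, set()).add(holder)

    def reaches(start, target, seen):
        # Depth-first search for a path from start back to target.
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(nxt, target, seen):
                    return True
        return False

    return any(reaches(locker, locker, set()) for locker in graph)


# Case 1: the reader locks the cache entry under one locker ID and reads
# id2entry under a different, BDB-generated one. The writer holds the
# id2entry page and waits for the cache entry.
split_lockers = find_deadlock(
    holds={"id2entry_page": "writer_txn", "cache_entry": "reader_cache_locker"},
    waits=[("writer_txn", "cache_entry"),
           ("reader_bdb_locker", "id2entry_page")],
)

# Case 2: the reader runs inside a transaction, so both of its locks
# carry the same locker ID.
single_locker = find_deadlock(
    holds={"id2entry_page": "writer_txn", "cache_entry": "reader_txn"},
    waits=[("writer_txn", "cache_entry"),
           ("reader_txn", "id2entry_page")],
)

print(split_lockers, single_locker)
```

In the first case the graph is just reader_bdb_locker -> writer_txn -> reader_cache_locker, which has no cycle, so nothing is reported even though both threads are stuck. In the second case the two edges close a loop between writer_txn and reader_txn, and the detector can pick a victim.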
Still thinking about how to fix this...
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support