Re: back-bdb deadlocks?
>> I've checked in a fix for this. Creating a new transaction for every
>> search operation would add a fair amount of overhead, so I've allocated
>> long-lived per-thread transactions instead (similar to the way locker
>> IDs are being re-used). We can't just replace the existing lockerID
>> mechanism with these transactions, because the intended lifetimes of
>> their locks are not the same. Anyway, I think this should finally
>> resolve the deadlock issues.
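The long-lived per-thread transaction idea can be sketched with POSIX thread-local storage. This is an illustrative stand-in, not the actual back-bdb code: thread_txn and get_thread_txn are hypothetical names, and a real implementation would store a DB_TXN * obtained from Berkeley DB's txn_begin rather than the toy struct used here.

```c
/* Sketch: one long-lived transaction handle per thread, created lazily
 * on first use and reused thereafter (similar to locker-ID reuse).
 * thread_txn is a hypothetical stand-in for a real DB_TXN handle. */
#include <pthread.h>
#include <stdlib.h>

typedef struct thread_txn {
    int id;                         /* stand-in for a real DB_TXN */
} thread_txn;

static pthread_key_t txn_key;
static pthread_once_t txn_once = PTHREAD_ONCE_INIT;
static int next_id;

static void txn_destroy(void *p) { free(p); }   /* run at thread exit */
static void txn_key_init(void) { pthread_key_create(&txn_key, txn_destroy); }

/* Return this thread's transaction, creating it on first call. */
thread_txn *get_thread_txn(void)
{
    thread_txn *t;
    pthread_once(&txn_once, txn_key_init);
    t = pthread_getspecific(txn_key);
    if (t == NULL) {
        t = malloc(sizeof(*t));
        t->id = __sync_fetch_and_add(&next_id, 1);
        pthread_setspecific(txn_key, t);
    }
    return t;   /* same handle for every later call on this thread */
}
```

Because the handle is cached per thread, the per-search cost after the first call is a single TLS lookup rather than a full transaction begin/commit.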
> Looks like we are heading in the right direction. One thing to note is
> that there is currently no retry mechanism for the added transaction.
> bdb_cache_db_relock() itself also lacks a retry mechanism. If it is
> selected as the victim to resolve a deadlock, it needs to restart the
> op if it's a write op, or restart the locking transaction if it's a
> read op.
OK, I've added retries to the two remaining callers of
bdb_cache_find_id() and made sure the lock return code is propagated
back to the caller. That should be OK now.
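The retry pattern for a deadlock victim can be illustrated as follows. find_id and find_id_retry are hypothetical stand-ins for bdb_cache_find_id() and its callers; the stub deliberately fails with a deadlock code twice so the loop is exercised, and the DB_LOCK_DEADLOCK value here is illustrative only; real code should use the constant from <db.h>.

```c
/* Sketch: retry a lookup whenever this thread is chosen as the
 * deadlock victim.  On DB_LOCK_DEADLOCK the lock subsystem has
 * already released the victim's locks, so it is safe to retry. */

#define DB_LOCK_DEADLOCK (-30995)   /* illustrative; take the real value from <db.h> */

static int fail_count = 2;          /* simulate two deadlock abortions */

static int find_id(int id, int *out)
{
    if (fail_count-- > 0)
        return DB_LOCK_DEADLOCK;    /* victim: caller must retry */
    *out = id * 2;                  /* stand-in for the found entry */
    return 0;
}

/* Retry until the lookup succeeds or fails with a non-deadlock error,
 * propagating that error code back to the caller. */
int find_id_retry(int id, int *out)
{
    int rc;
    do {
        rc = find_id(id, out);
    } while (rc == DB_LOCK_DEADLOCK);
    return rc;
}
```

Any other error code falls out of the loop unchanged, which is the "lock return code is propagated back to the caller" part of the fix.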
> Also, I still suspect that we need to protect db ops in a search op in
> order to ensure deadlock freedom
> (http://www.sleepycat.com/docs/ref/lock/notxn.html), although it might be
I'm not sure that doc applies; this application is transactional, and
all of our writes are protected. With this patch, most of the reads are
protected as well, and in general a read operation never holds onto any
page locks. Deadlock requires each thread to attempt to control multiple
resources at once; with the exception of the brief transaction in
bdb_cache_find_id(), our read operations don't do this. They grab one
entry at a time and release it before moving on to the next.
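The one-lock-at-a-time argument above can be demonstrated with a small stand-in for the lock subsystem (all names here are hypothetical): a scan that releases each entry lock before taking the next never holds two locks at once, so it cannot participate in a hold-and-wait cycle.

```c
/* Sketch: read scan that locks one entry at a time.  The lock "table"
 * is a toy stand-in; max_held records the most locks ever held
 * simultaneously, which stays at 1 for this access pattern. */

#define NENTRIES 4
static int held[NENTRIES];          /* 1 = lock currently held */
static int max_held;                /* high-water mark of held locks */

static void lock_entry(int i)
{
    int n = 0, j;
    held[i] = 1;
    for (j = 0; j < NENTRIES; j++)  /* track the high-water mark */
        n += held[j];
    if (n > max_held)
        max_held = n;
}

static void unlock_entry(int i) { held[i] = 0; }

/* Walk every entry, releasing each lock before acquiring the next. */
int scan_entries(void)
{
    int i;
    for (i = 0; i < NENTRIES; i++) {
        lock_entry(i);
        /* ... read the entry ... */
        unlock_entry(i);
    }
    return max_held;    /* 1: never more than one lock at a time */
}
```

Since deadlock requires every thread in the cycle to hold one resource while waiting for another, a reader whose high-water mark is 1 can be blocked but never be part of a deadlock cycle.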
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support