Re: back-bdb deadlocks?
>> I've checked in a fix for this. Creating a new transaction for every
>> search operation would add a fair amount of overhead, so I've allocated
>> long-lived per-thread transactions instead (similar to the way locker
>> IDs are being re-used). We can't just replace the existing lockerID
>> mechanism with these transactions, because the intended lifetimes of
>> their locks are not the same. Anyway, I think this should finally
>> resolve the deadlock issues.
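The long-lived per-thread transaction idea can be sketched with POSIX thread-local storage. This is an illustrative stand-in, not the actual back-bdb code: thread_txn and get_thread_txn are hypothetical names, and a real implementation would store a DB_TXN * obtained from Berkeley DB's txn_begin rather than the toy struct used here.

```c
/* Sketch: one long-lived transaction handle per thread, created lazily
 * on first use and reused thereafter (similar to locker-ID reuse).
 * thread_txn is a hypothetical stand-in for a real DB_TXN handle. */
#include <pthread.h>
#include <stdlib.h>

typedef struct thread_txn {
    int id;                         /* stand-in for a real DB_TXN */
} thread_txn;

static pthread_key_t txn_key;
static pthread_once_t txn_once = PTHREAD_ONCE_INIT;
static int next_id;

static void txn_destroy(void *p) { free(p); }   /* run at thread exit */
static void txn_key_init(void) { pthread_key_create(&txn_key, txn_destroy); }

/* Return this thread's transaction, creating it on first call. */
thread_txn *get_thread_txn(void)
{
    thread_txn *t;
    pthread_once(&txn_once, txn_key_init);
    t = pthread_getspecific(txn_key);
    if (t == NULL) {
        t = malloc(sizeof(*t));
        t->id = __sync_fetch_and_add(&next_id, 1);
        pthread_setspecific(txn_key, t);
    }
    return t;   /* same handle for every later call on this thread */
}
```

Because the handle is cached per thread, the per-search cost after the first call is a single TLS lookup rather than a full transaction begin/commit.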
> Looks like we are heading in the right direction. One thing to note is
> that there is currently no retry mechanism for the added transaction.
> bdb_cache_db_relock() itself also lacks a retry mechanism. If it is
> selected as the victim to resolve a deadlock, it needs to restart the
> op if it's a write op, or restart the locking transaction if it's a
> read op.
OK, I've added retries to the two remaining callers of
bdb_cache_find_id() and made sure the lock return code is propagated
back to the caller. That should be OK now.
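The retry pattern for a deadlock victim can be illustrated as follows. find_id and find_id_retry are hypothetical stand-ins for bdb_cache_find_id() and its callers; the stub deliberately fails with a deadlock code twice so the loop is exercised, and the DB_LOCK_DEADLOCK value here is illustrative only; real code should use the constant from <db.h>.

```c
/* Sketch: retry a lookup whenever this thread is chosen as the
 * deadlock victim.  On DB_LOCK_DEADLOCK the lock subsystem has
 * already released the victim's locks, so it is safe to retry. */

#define DB_LOCK_DEADLOCK (-30995)   /* illustrative; take the real value from <db.h> */

static int fail_count = 2;          /* simulate two deadlock abortions */

static int find_id(int id, int *out)
{
    if (fail_count-- > 0)
        return DB_LOCK_DEADLOCK;    /* victim: caller must retry */
    *out = id * 2;                  /* stand-in for the found entry */
    return 0;
}

/* Retry until the lookup succeeds or fails with a non-deadlock error,
 * propagating that error code back to the caller. */
int find_id_retry(int id, int *out)
{
    int rc;
    do {
        rc = find_id(id, out);
    } while (rc == DB_LOCK_DEADLOCK);
    return rc;
}
```

Any other error code falls out of the loop unchanged, which is the "lock return code is propagated back to the caller" part of the fix.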
> Also, I still suspect that we need to protect db ops in a search op in
> order to ensure deadlock freedom
> (http://www.sleepycat.com/docs/ref/lock/notxn.html), although it might be
I'm not sure that doc applies; this application is transactional, and
all of our writes are protected. With this patch, most of the reads are
protected as well, and in general a read operation never holds onto any
page locks. Deadlock requires each thread to attempt to control multiple
resources at once; with the exception of the brief transaction in
bdb_cache_find_id(), our read operations don't do this. They grab one
entry at a time and release it before moving on to the next.
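The one-lock-at-a-time argument above can be demonstrated with a small stand-in for the lock subsystem (all names here are hypothetical): a scan that releases each entry lock before taking the next never holds two locks at once, so it cannot participate in a hold-and-wait cycle.

```c
/* Sketch: read scan that locks one entry at a time.  The lock "table"
 * is a toy stand-in; max_held records the most locks ever held
 * simultaneously, which stays at 1 for this access pattern. */

#define NENTRIES 4
static int held[NENTRIES];          /* 1 = lock currently held */
static int max_held;                /* high-water mark of held locks */

static void lock_entry(int i)
{
    int n = 0, j;
    held[i] = 1;
    for (j = 0; j < NENTRIES; j++)  /* track the high-water mark */
        n += held[j];
    if (n > max_held)
        max_held = n;
}

static void unlock_entry(int i) { held[i] = 0; }

/* Walk every entry, releasing each lock before acquiring the next. */
int scan_entries(void)
{
    int i;
    for (i = 0; i < NENTRIES; i++) {
        lock_entry(i);
        /* ... read the entry ... */
        unlock_entry(i);
    }
    return max_held;    /* 1: never more than one lock at a time */
}
```

Since deadlock requires every thread in the cycle to hold one resource while waiting for another, a reader whose high-water mark is 1 can be blocked but never be part of a deadlock cycle.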
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support