Re: back-bdb deadlocks?

To: "Howard Chu" <hyc@symas.com>
Subject: Re: back-bdb deadlocks?
From: "Jong-Hyuk" <jongchoi@OpenLDAP.org>
Date: Sat, 10 Jul 2004 11:34:58 -0400
Cc: <openldap-devel@OpenLDAP.org>
References: <40E4E0E0.20508@symas.com> <002801c4605e$3f008ed0$3ddb0209@watson.ibm.com> <40E75FB2.6070901@symas.com> <002201c464f0$9ef6a770$3ddb0209@watson.ibm.com> <40EE6526.2090006@symas.com> <40EEBFD5.7060804@symas.com>

> > A read operation can cause an undetected deadlock when loading an entry
> > into the cache. This is because the lockerID used to lock the cache
> > entry is not the same as the lockerID BDB generates when accessing the
> > database. If a writer has already locked the id2entry pages and is
> > waiting on a write lock for the cache entry, then the reader will be
> > stuck waiting for the id2entry pages and neither will make any progress.
> >
> > This implies that we must use transactions for readers as well as
> > writers, to insure that all locks associated with a read operation get
> > the same lockerID, so that deadlocks can be detected by BDB. Either that
> > or we need to tweak the lock ordering again so that a reader cannot hold
> > any other locks while it is accessing the database.
>
> I've checked in a fix for this. Creating a new transaction for every
> search operation would add a fair amount of overhead, so I've allocated
> long-lived per-thread transactions instead (similar to the way locker
> IDs are being re-used). We can't just replace the existing lockerID
> mechanism with these transactions, because the intended lifetimes of
> their locks are not the same. Anyway, I think this should finally
> resolve the deadlock issues.
>
> Please test...

Looks like we are heading in the right direction. One thing to note is that
there is currently no retry mechanism for the added transaction.
bdb_cache_db_relock() itself also lacks a retry mechanism. If it is
selected as the victim to resolve a deadlock, it needs to restart the op if
it's a write op, or restart the locking transaction if it's a read op.
Also, I still suspect that we need to protect db ops in a search op in
order to ensure deadlock freedom
(http://www.sleepycat.com/docs/ref/lock/notxn.html), although it might be
costly.
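
To make the retry idea concrete, here is a minimal sketch of the control
flow only. It is not back-bdb code: relock_entry() merely stands in for
bdb_cache_db_relock(), and the status codes, struct op_ctx, and the two
restart helpers are hypothetical names made up for illustration. In real
Berkeley DB code the victim would see DB_LOCK_DEADLOCK rather than the
OP_DEADLOCK placeholder used here.

```c
#include <assert.h>

/* Hypothetical status codes for this sketch. Real Berkeley DB code
 * would test for DB_LOCK_DEADLOCK from <db.h> instead of OP_DEADLOCK. */
enum { OP_OK = 0, OP_DEADLOCK = -1, OP_GAVE_UP = -2 };

enum op_kind { OP_READ, OP_WRITE };

/* Hypothetical per-operation state, used only to simulate the flow. */
struct op_ctx {
    enum op_kind kind;
    int deadlocks_left;  /* simulated: times the relock is chosen as victim */
    int op_restarts;     /* whole write operation restarted */
    int txn_restarts;    /* only the reader's locking transaction restarted */
};

/* Stand-in for the cache relock step. It reports OP_DEADLOCK while the
 * simulated counter is positive, then succeeds. */
static int relock_entry(struct op_ctx *ctx)
{
    if (ctx->deadlocks_left > 0) {
        ctx->deadlocks_left--;
        return OP_DEADLOCK;
    }
    return OP_OK;
}

/* Hypothetical recovery hooks. A real writer would abort its transaction
 * and rerun the operation; a real reader would abort and re-begin only
 * the transaction that owns its locks. Here they just count. */
static void restart_write_op(struct op_ctx *ctx)   { ctx->op_restarts++; }
static void restart_locker_txn(struct op_ctx *ctx) { ctx->txn_restarts++; }

/* Retry the relock up to max_retries times, choosing the recovery path
 * by operation type whenever we are picked as the deadlock victim. */
static int relock_with_retry(struct op_ctx *ctx, int max_retries)
{
    assert(max_retries > 0);
    for (int attempt = 0; attempt < max_retries; attempt++) {
        int rc = relock_entry(ctx);
        if (rc != OP_DEADLOCK)
            return rc;              /* success, or an unrelated error */
        if (ctx->kind == OP_WRITE)
            restart_write_op(ctx);
        else
            restart_locker_txn(ctx);
    }
    return OP_GAVE_UP;              /* bounded, so we cannot spin forever */
}
```

The bound on retries is a choice made for this sketch; whether to cap the
retries or loop until success is a separate decision.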
- Jong-Hyuk

