Re: slapd lock deadlock



At 10:14 AM 2/3/99 -0600, Robert Rothlisberger wrote:
>	I found a deadlock in the ldbm cache.
>
>	When a writer activity is running it creates a writer lock
>		id2entry -> entry_rdwr_lock()
>
>	However the cache is only locked when it is accessed.
>
>	The problem arises when a call to cache_find_entry_id()
>		tries to acquire a reader lock.
>		entry_rdwr_lock() calls pthread_rdwr_rlock_np()
>		this routine loops until the writer lock clears.
>	The problem is that the writer can't complete because the search
>	locked the cache. So here we sit.

Yes, this is definitely not correct.   cache_find_entry_id()
should do entry_rdwr_trylock() and back off if busy.

cache_find_entry_dn2id() has a similar problem.  However, as
it only accesses fields protected by the cache lock (which it
has acquired), acquiring the entry lock is not necessary.

In fact, cache_find_entry_id() can be greatly simplified as
well in this regard.
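
As a sketch of the DN-to-ID case, assuming the DN and ID fields
really are covered by the cache lock alone, the lookup could look
roughly like this.  The types, the linear scan, the NOID value, and
the name find_entry_dn2id() are all hypothetical simplifications, not
the actual cache code.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NOID (-1L)              /* hypothetical "no such entry" ID */

/* Simplified stand-ins; both fields are assumed to be protected by
 * the cache mutex, so no per-entry lock appears here. */
typedef struct entry {
    long        e_id;
    const char *e_dn;
} Entry;

typedef struct cache {
    pthread_mutex_t c_mutex;
    Entry          *c_entries;  /* flat array for illustration */
    int             c_count;
} Cache;

/* Map a DN to an entry ID while holding only the cache mutex. */
long find_entry_dn2id(Cache *c, const char *dn)
{
    long id = NOID;
    int  i;

    assert(c != NULL && dn != NULL);

    pthread_mutex_lock(&c->c_mutex);
    for (i = 0; i < c->c_count; i++) {
        if (strcmp(c->c_entries[i].e_dn, dn) == 0) {
            id = c->c_entries[i].e_id;
            break;
        }
    }
    pthread_mutex_unlock(&c->c_mutex);
    return id;
}
```

Since no entry lock is ever requested, this path cannot wait on a
writer while holding the cache.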

I also found that our cache_return_entry_rw() code is incorrect.
It should acquire the cache lock before unlocking the entry
to prevent a race condition on the reference counter.

Of course, our reader/writer lock code needs to be updated
to provide trylock() calls such that entry_rdwr_trylock()
can be implemented.   I actually have everything coded but
need to wait for some pending commits to be made first.
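
For the curious, a trylock on a mutex-and-condition-variable style
reader/writer lock might look something like the sketch below.  This
is not the pending code: rdwr_t and every rdwr_*() name here are
hypothetical, and the lock is deliberately minimal (no writer
preference, no fairness).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Minimal reader/writer lock (hypothetical, for illustration). */
typedef struct rdwr {
    pthread_mutex_t mutex;      /* protects the two counters */
    pthread_cond_t  lock_free;  /* signalled when the lock frees up */
    int             readers;    /* number of active readers */
    int             writer;     /* 1 while a writer holds the lock */
} rdwr_t;

void rdwr_init(rdwr_t *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->lock_free, NULL);
    l->readers = 0;
    l->writer = 0;
}

/* Blocking read lock: waits until no writer holds the lock. */
void rdwr_rlock(rdwr_t *l)
{
    pthread_mutex_lock(&l->mutex);
    while (l->writer)
        pthread_cond_wait(&l->lock_free, &l->mutex);
    l->readers++;
    pthread_mutex_unlock(&l->mutex);
}

/* Non-blocking read lock: 0 on success, EBUSY if a writer holds it. */
int rdwr_rtrylock(rdwr_t *l)
{
    int busy;

    pthread_mutex_lock(&l->mutex);
    busy = l->writer;
    if (!busy)
        l->readers++;
    pthread_mutex_unlock(&l->mutex);
    return busy ? EBUSY : 0;
}

/* Non-blocking write lock: 0 on success, EBUSY if anyone holds it. */
int rdwr_wtrylock(rdwr_t *l)
{
    int busy;

    pthread_mutex_lock(&l->mutex);
    busy = l->writer || l->readers > 0;
    if (!busy)
        l->writer = 1;
    pthread_mutex_unlock(&l->mutex);
    return busy ? EBUSY : 0;
}

void rdwr_runlock(rdwr_t *l)
{
    pthread_mutex_lock(&l->mutex);
    assert(l->readers > 0);
    if (--l->readers == 0)
        pthread_cond_broadcast(&l->lock_free);
    pthread_mutex_unlock(&l->mutex);
}

void rdwr_wunlock(rdwr_t *l)
{
    pthread_mutex_lock(&l->mutex);
    l->writer = 0;
    pthread_cond_broadcast(&l->lock_free);
    pthread_mutex_unlock(&l->mutex);
}
```

The trylock variants only ever hold the internal mutex briefly and
never wait on the condition variable, which is what lets a caller
back off instead of spinning with the cache locked.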

Kurt