Re: back-bdb deadlocks?
I could also produce a deadlock case on both a Red Hat 7.3 uniprocessor box
and a Red Hat 9 SMP box, but I could get the full stack trace only on the
Red Hat 7.3 box. gdb behaves oddly on Red Hat 9...
Two files were uploaded to the incoming ftp directory:
ftp://ftp.openldap.org/incoming/test008-backtrace.txt for stack trace
ftp://ftp.openldap.org/incoming/test008-db_stat.txt for db_stat -CA result
I tracked down the problem further with gdb, and found the following:
thread 13: modrdn, locker=8000002b, waiting for an entryinfo mutex for
ei->bei_id=4 (ou=alumni association, ou=people, dc=example, dc=com) while
holding a read lock of page 1 and a db lock for the entry corresponding to
the entryinfo (ei->bei_id=4)
thread 5: add, locker=80000028, waiting for a db lock for the ei->bei_id=4
entry, while holding an entryinfo mutex for ei->bei_id=4
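
The two descriptions above form a cycle: each thread holds what the other is waiting for. As a rough illustration only, the sketch below models that state as a wait-for graph and checks for the cycle. Everything in it (the lock names, `struct thread_state`, `in_deadlock`, `case_deadlocks`) is hypothetical and not taken from back-bdb, and it assumes each lock has a single holder, which ignores that the page 1 read lock is shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the reported state; none of this is OpenLDAP code. */
enum { LOCK_EI_MUTEX_4, LOCK_DB_ENTRY_4, LOCK_DB_PAGE_1, NUM_LOCKS };
#define NO_LOCK (-1)

struct thread_state {
    const char *name;
    bool holds[NUM_LOCKS];  /* locks this thread currently holds */
    int waits_for;          /* lock it is blocked on, or NO_LOCK */
};

/* Return the index of the thread holding `lock`, or -1 if nobody holds it. */
static int holder_of(const struct thread_state *t, size_t n, int lock)
{
    for (size_t i = 0; i < n; i++)
        if (t[i].holds[lock])
            return (int)i;
    return -1;
}

/* Follow "waits for a lock -> thread holding it" edges starting at `start`.
 * The thread is deadlocked if the chain leads back to itself. */
static bool in_deadlock(const struct thread_state *t, size_t n, size_t start)
{
    assert(start < n);
    size_t cur = start;
    for (size_t steps = 0; steps < n; steps++) {
        if (t[cur].waits_for == NO_LOCK)
            return false;
        int h = holder_of(t, n, t[cur].waits_for);
        if (h < 0)
            return false;          /* lock is free, so the wait will end */
        if ((size_t)h == start)
            return true;           /* cycle closed */
        cur = (size_t)h;
    }
    return false;
}

/* Build the state from the gdb session. If `add_keeps_mutex` is false,
 * thread 5 is assumed to drop the entryinfo mutex before waiting for the
 * db lock, which breaks the cycle. */
static bool case_deadlocks(bool add_keeps_mutex)
{
    struct thread_state t[2] = {
        { "thread 13 (modrdn)",
          { [LOCK_DB_ENTRY_4] = true, [LOCK_DB_PAGE_1] = true },
          LOCK_EI_MUTEX_4 },
        { "thread 5 (add)",
          { [LOCK_EI_MUTEX_4] = add_keeps_mutex },
          LOCK_DB_ENTRY_4 },
    };
    return in_deadlock(t, 2, 0);
}
```

Calling `case_deadlocks(true)` reproduces the reported cycle, while `case_deadlocks(false)` shows that the cycle disappears once one side stops holding its lock while waiting for the other.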
Why should the entryinfo lock be unlocked at the end of bdb_cache_find_id()?
As thread 13 shows, it should not be unlocked, at least when it is needed
again.
The idea was to make tree navigation separate from retrieving the actual
entries, and to release the mutex as quickly as possible. At the
moment I'm testing with a version that uses a rdwr lock instead of a
mutex, but that's failed as well. Your point about not unlocking makes
sense; I'll give that a try.
As I suggested some time back, the entryinfo mutex needs to be replaced with
the bdb lock, as we did in the original bdb entry cache design.
You may be right. I've been trying to avoid that because bdb locks carry
heavier overhead, but that's the only way to do lock-coupling here. Also, there
are some cleanup actions that have to occur under a lock, but are done
after Commit. That would be difficult to manage...
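
For readers unfamiliar with the term, lock-coupling (hand-over-hand locking) means taking the lock on the next node before releasing the lock on the current one, so a traversal never has a window with nothing locked. The sketch below shows only the general pattern, using plain pthread mutexes rather than bdb locks. `struct node`, `find_locked`, `release_node` and `demo_find` are hypothetical names, and the single-child chain is a simplification of a real tree.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached tree node; not the back-bdb struct. */
struct node {
    unsigned long id;
    pthread_mutex_t lock;
    struct node *child;     /* one child keeps the sketch short */
};

static void node_init(struct node *n, unsigned long id, struct node *child)
{
    n->id = id;
    n->child = child;
    pthread_mutex_init(&n->lock, NULL);
}

/* Walk down from `root` looking for `id`, hand over hand.
 * Returns the matching node with its lock still held, so the caller can
 * use it without re-acquiring; returns NULL with nothing locked. */
static struct node *find_locked(struct node *root, unsigned long id)
{
    struct node *cur = root;
    if (cur == NULL)
        return NULL;
    pthread_mutex_lock(&cur->lock);
    while (cur->id != id) {
        struct node *next = cur->child;
        if (next == NULL) {
            pthread_mutex_unlock(&cur->lock);
            return NULL;
        }
        pthread_mutex_lock(&next->lock);    /* take the child first */
        pthread_mutex_unlock(&cur->lock);   /* then let go of the parent */
        cur = next;
    }
    return cur;
}

static void release_node(struct node *n)
{
    pthread_mutex_unlock(&n->lock);
}

/* Build the chain 1 -> 2 -> 4, look up `id`, and return the id found
 * (0 if absent). */
static unsigned long demo_find(unsigned long id)
{
    struct node leaf, mid, root;
    node_init(&leaf, 4, NULL);
    node_init(&mid, 2, &leaf);
    node_init(&root, 1, &mid);

    struct node *hit = find_locked(&root, id);
    unsigned long found = 0;
    if (hit != NULL) {
        found = hit->id;
        release_node(hit);
    }

    pthread_mutex_destroy(&leaf.lock);
    pthread_mutex_destroy(&mid.lock);
    pthread_mutex_destroy(&root.lock);
    return found;
}
```

Because every traversal locks top-down in the same order, two walkers cannot end up waiting on each other the way threads 5 and 13 did. The cost is that the lookup hands back a held lock, and the caller is responsible for releasing it.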
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support