
RE: back-bdb deadlocks



I didn't build my libdb with debug symbols, but I have this trace:

(gdb) info thr
  36 Thread 12225  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  35 Thread 12224  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  34 Thread 12223  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  33 Thread 12222  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  32 Thread 12221  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  31 Thread 12220  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  30 Thread 12219  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  29 Thread 12218  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  28 Thread 12217  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  27 Thread 12216  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  26 Thread 12215  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  25 Thread 12214  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  24 Thread 12213  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  23 Thread 12212  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  22 Thread 12211  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  21 Thread 12210  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  20 Thread 12209  0x40239007 in __sched_yield () at soinit.c:27
  19 Thread 12196  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  18 Thread 12195  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  17 Thread 12192  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  16 Thread 12191  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  15 Thread 12190  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  14 Thread 12189  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  13 Thread 12188  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  12 Thread 12187  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
* 11 Thread 12186  0x40239007 in __sched_yield () at soinit.c:27
  10 Thread 12185  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  9 Thread 12184  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  8 Thread 12183  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  7 Thread 12182  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  6 Thread 12181  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  5 Thread 12180  0x40239007 in __sched_yield () at soinit.c:27
  4 Thread 12175  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  3 Thread 12174  0x4022d511 in __libc_nanosleep () at soinit.c:27
  2 Thread 12171  0x402440e4 in __syscall_sigsuspend () at soinit.c:27
  1 Thread 12173  0x4023f0de in __select () at soinit.c:27
(gdb) thr 5
[Switching to Thread 12180]
#0  0x40239007 in __sched_yield () at soinit.c:27
soinit.c:27: No such file or directory.
(gdb) where
#0  0x40239007 in __sched_yield () at soinit.c:27
#1  0x80ce9af in ldap_pvt_thread_yield ()
    at /home/hyc/OD/head/libraries/libldap_r/thr_posix.c:168
#2  0x40079821 in __os_yield ()
#3  0x4001a26e in __db_tas_mutex_lock ()
#4  0x40066a3d in __lock_get_internal ()
#5  0x400660c3 in __lock_vec ()
#6  0x40087ac4 in __txn_abort ()
#7  0x4008700d in txn_abort ()
#8  0x80a0d13 in bdb_add (be=0x815bfb8, conn=0x407c1b78, op=0x40802ef8,
    e=0x816c778) at /home/hyc/OD/head/servers/slapd/back-bdb/add.c:292
#9  0x8056c67 in do_add (conn=0x407c1b78, op=0x40802ef8)
    at /home/hyc/OD/head/servers/slapd/add.c:289
#10 0x8052401 in connection_operation (arg_v=0x40801ab8)
    at /home/hyc/OD/head/servers/slapd/connection.c:932
#11 0x80ce71f in ldap_int_thread_pool_wrapper (pool=0x812a2f8)
    at /home/hyc/OD/head/libraries/libldap_r/tpool.c:402
#12 0x401b7587 in pthread_start_thread (arg=0xbf5ffea4) at manager.c:192
(gdb) thr 11
[Switching to Thread 12186]
#0  0x40239007 in __sched_yield () at soinit.c:27
soinit.c:27: No such file or directory.
(gdb) where
#0  0x40239007 in __sched_yield () at soinit.c:27
#1  0x80ce9af in ldap_pvt_thread_yield ()
    at /home/hyc/OD/head/libraries/libldap_r/thr_posix.c:168
#2  0x40079821 in __os_yield ()
#3  0x4001a26e in __db_tas_mutex_lock ()
#4  0x40066a3d in __lock_get_internal ()
#5  0x40065b5b in __lock_vec ()
#6  0x40043e85 in __db_lget ()
#7  0x40027748 in __bam_search ()
#8  0x4001ec60 in __bam_c_search ()
#9  0x4001c808 in __bam_c_get ()
#10 0x4003c653 in __db_c_get ()
#11 0x40037949 in __db_delete ()
#12 0x80a771b in bdb_id2entry_delete (be=0x815bfb8, tid=0x816b9e8, id=563)
    at /home/hyc/OD/head/servers/slapd/back-bdb/id2entry.c:132
#13 0x80a3695 in bdb_delete (be=0x815bfb8, conn=0x407c2224, op=0x40801a38,
    dn=0xbe9ffcfc, ndn=0xbe9ffcf4)
    at /home/hyc/OD/head/servers/slapd/back-bdb/delete.c:272
#14 0x8067eee in do_delete (conn=0x407c2224, op=0x40801a38)
    at /home/hyc/OD/head/servers/slapd/delete.c:172
#15 0x8052421 in connection_operation (arg_v=0x40802fc8)
    at /home/hyc/OD/head/servers/slapd/connection.c:936
#16 0x80ce71f in ldap_int_thread_pool_wrapper (pool=0x812a2f8)
    at /home/hyc/OD/head/libraries/libldap_r/tpool.c:402
#17 0x401b7587 in pthread_start_thread (arg=0xbe9ffea4) at manager.c:192
(gdb) thr 20
[Switching to Thread 12209]
#0  0x40239007 in __sched_yield () at soinit.c:27
soinit.c:27: No such file or directory.
(gdb) where
#0  0x40239007 in __sched_yield () at soinit.c:27
#1  0x80ce9af in ldap_pvt_thread_yield ()
    at /home/hyc/OD/head/libraries/libldap_r/thr_posix.c:168
#2  0x40079821 in __os_yield ()
#3  0x4001a26e in __db_tas_mutex_lock ()
#4  0x40066a3d in __lock_get_internal ()
#5  0x40065b5b in __lock_vec ()
#6  0x40043e85 in __db_lget ()
#7  0x40027748 in __bam_search ()
#8  0x4001ec60 in __bam_c_search ()
#9  0x4001c808 in __bam_c_get ()
#10 0x4003c653 in __db_c_get ()
#11 0x4003770c in __db_put ()
#12 0x80a74b5 in bdb_id2entry_put (be=0x815bfb8, tid=0x8182f28, e=0x81817e8,
    flag=25) at /home/hyc/OD/head/servers/slapd/back-bdb/id2entry.c:50
#13 0x80a751a in bdb_id2entry_add (be=0x815bfb8, tid=0x8182f28, e=0x81817e8)
    at /home/hyc/OD/head/servers/slapd/back-bdb/id2entry.c:61
#14 0x80a0ad4 in bdb_add (be=0x815bfb8, conn=0x407c1c6c, op=0x40801b88,
    e=0x81817e8) at /home/hyc/OD/head/servers/slapd/back-bdb/add.c:252
#15 0x8056c67 in do_add (conn=0x407c1c6c, op=0x40801b88)
    at /home/hyc/OD/head/servers/slapd/add.c:289
#16 0x8052401 in connection_operation (arg_v=0x40802770)
    at /home/hyc/OD/head/servers/slapd/connection.c:932
#17 0x80ce71f in ldap_int_thread_pool_wrapper (pool=0x812a2f8)
    at /home/hyc/OD/head/libraries/libldap_r/tpool.c:402
#18 0x401b7587 in pthread_start_thread (arg=0xbd7ffea4) at manager.c:192
(gdb)

Here's the output from db_stat -c:
# db_stat -c
54 Last allocated locker ID.
9       Number of lock modes.
1000    Maximum number of locks possible.
1000    Maximum number of lockers possible.
1000    Maximum number of objects possible.
99899   Current locks.
99901   Maximum number of locks so far.
44      Current number of lockers.
44      Maximum number  lockers so far.
49      Current number lock objects.
64      Maximum number of lock objects so far.
181846  Number of lock requests.
215845  Number of lock releases.
1       Number of lock requests that would have waited.
80      Number of lock conflicts.
0       Number of deadlocks.
0       Number of transaction timeouts.
0       Number of lock timeouts.
352KB   Lock region size (360448 bytes).
11      The number of region locks granted after waiting.
273545  The number of region locks granted without waiting.

Note that libdb doesn't think any deadlocks have occurred yet, even though
I had deadlock detection configured to kill the transaction with the fewest
locks.
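
As a rough cross-check of the db_stat -c counters above, here is a small
Python sketch that parses a few of the lines and flags values that look
inconsistent with each other. It is not part of slapd or Berkeley DB; the
names parse_db_stat and find_anomalies are made up for illustration, and
the two checks are only assumptions about what "consistent" should mean
(current locks no higher than the configured maximum, releases no higher
than requests). It cannot tell whether a flagged value reflects a damaged
lock region or just a reporting quirk.

```python
import re

# Excerpt of the `db_stat -c` output quoted above (values copied verbatim).
DB_STAT_OUTPUT = """\
1000    Maximum number of locks possible.
99899   Current locks.
181846  Number of lock requests.
215845  Number of lock releases.
0       Number of deadlocks.
"""


def parse_db_stat(text):
    """Map each description (without its trailing period) to its integer value."""
    stats = {}
    for line in text.splitlines():
        # A leading integer, whitespace, then the description.
        m = re.match(r"\s*(\d+)\s+(.+?)\.?\s*$", line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats


def find_anomalies(stats):
    """Return descriptions of counters that contradict each other.

    Both checks are illustrative assumptions, not documented invariants.
    """
    findings = []
    if stats["Current locks"] > stats["Maximum number of locks possible"]:
        findings.append("current locks exceed configured maximum")
    if stats["Number of lock releases"] > stats["Number of lock requests"]:
        findings.append("more lock releases than lock requests")
    return findings


if __name__ == "__main__":
    for finding in find_anomalies(parse_db_stat(DB_STAT_OUTPUT)):
        print(finding)
```

Run against the numbers above, both checks fire: 99899 current locks
against a maximum of 1000, and 215845 releases against 181846 requests.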

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

> -----Original Message-----
> From: Marijn Meijles [mailto:marijn@bitpit.net]
> Sent: Thursday, January 17, 2002 4:14 AM
> To: Howard Chu
> Cc: openldap-devel@OpenLDAP.org
> Subject: Re: back-bdb deadlocks
>
>
> You wrote:
> > I've cleaned things up and recompiled/linked with BDB 4.0.14 (was using
> > 3.3.11) and the same deadlock occurs in txn_abort. You're right, this
> > sounds like a bug for Sleepycat to address. I cannot find any code path
> > that bdb_add executes without a transaction, so it appears we're doing
> > the right things.
> >
> Where in __txn_abort does it lock up exactly?
>
> --
> Marijn@bitpit.net
> ---
> When everything comes your way, you're in the wrong lane.
>