
(ITS#5391) hdb deadlock



Full_Name: Aaron Richton
Version: 2.3.40
OS: Solaris 9
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (128.6.31.135)


One hdb backend on one slave died at ~21:58 yesterday. The two threads below were both sitting in __os_sleep() under __txn_checkpoint():

current thread: t@5
  [1] _libc_poll(0xffffffff4f3ff430, 0x0, 0x3e8, 0x0, 0x0, 0x0), at 0xffffffff7f0a741c
  [2] _select(0x3e8, 0xffffffff7f1bc728, 0xffffffff7f1bc728, 0x0, 0xffffffff7f1bc728, 0x0), at 0xffffffff7f05a74c
  [3] select(0x0, 0x0, 0x0, 0x0, 0xffffffff4f3ff5b0, 0x0), at 0xffffffff7e0108e8
=>[4] __os_sleep(dbenv = 0x1005b2610, secs = 1U, usecs = 0), line 84 in "os_sleep.c"
  [5] __memp_sync_int(dbenv = 0x1005b2610, dbmfp = (nil), trickle_max = 0, op = DB_SYNC_CACHE, wrotep = (nil)), line 362 in "mp_sync.c"
  [6] __memp_sync(dbenv = 0x1005b2610, lsnp = (nil)), line 99 in "mp_sync.c"
  [7] __txn_checkpoint(dbenv = 0x1005b2610, kbytes = 100000U, minutes = 10U, flags = 0), line 1389 in "txn.c"
  [8] __txn_checkpoint_pp(dbenv = 0x1005b2610, kbytes = 100000U, minutes = 10U, flags = 0), line 1288 in "txn.c"
  [9] hdb_checkpoint(ctx = 0xffffffff4f3ffc30, arg = 0x1004b4c60), line 165 in "config.c"
  [10] ldap_int_thread_pool_wrapper(xpool = 0x10041e500), line 478 in "tpool.c"

(dbx) where
current thread: t@16
  [1] _libc_poll(0xffffffff46ffe3e0, 0x0, 0x3e8, 0x0, 0x0, 0x0), at 0xffffffff7f0a741c
  [2] _select(0x3e8, 0xffffffff7f1bc728, 0xffffffff7f1bc728, 0x0, 0xffffffff7f1bc728, 0x0), at 0xffffffff7f05a74c
  [3] select(0x0, 0x0, 0x0, 0x0, 0xffffffff46ffe560, 0x0), at 0xffffffff7e0108e8
=>[4] __os_sleep(dbenv = 0x1005b2610, secs = 1U, usecs = 0), line 84 in "os_sleep.c"
  [5] __memp_sync_int(dbenv = 0x1005b2610, dbmfp = (nil), trickle_max = 0, op = DB_SYNC_CACHE, wrotep = (nil)), line 439 in "mp_sync.c"
  [6] __memp_sync(dbenv = 0x1005b2610, lsnp = (nil)), line 99 in "mp_sync.c"
  [7] __txn_checkpoint(dbenv = 0x1005b2610, kbytes = 100000U, minutes = 10U, flags = 0), line 1389 in "txn.c"
  [8] __txn_checkpoint_pp(dbenv = 0x1005b2610, kbytes = 100000U, minutes = 10U, flags = 0), line 1288 in "txn.c"
  [9] hdb_delete(op = 0xffffffff46fff618, rs = 0xffffffff46fff088), line 537 in "delete.c"
  [10] syncrepl_entry(si = 0x1004b4e50, op = 0xffffffff46fff618, entry = (nil), modlist = 0xffffffff46fff320, syncstate = 3, syncUUID = 0xffffffff46fff3c0, syncCookie_req = 0xffffffff46fff360, syncCSN = 0xffffffff46fff390), line 2006 in "syncrepl.c"
  [11] do_syncrep2(op = 0xffffffff46fff618, si = 0x1004b4e50), line 731 in "syncrepl.c"
  [12] do_syncrepl(ctx = 0xffffffff46fffc30, arg = 0x1004b5030), line 1095 in "syncrepl.c"
  [13] ldap_int_thread_pool_wrapper(xpool = 0x10041e500), line 478 in "tpool.c"


I can't get db_stat to join the environment. If there's anything else that can
be gleaned from slapd itself, I'd be glad to poke around the core; otherwise,
I'm off to rm/slapadd...

"This makes sense and shouldn't happen in 2.3.41" would be a fine answer too, but
to my eye none of the changes in 2.3.41 looked locking-related.