
Re: slapd hangs after being up for over a day under load (ITS#2952)



Hi,
   Here is a new backtrace, taken while the server was hung after a day of
uptime under load.

Brian


(gdb) info threads
  18 Thread 1088744400 (LWP 23778)  0x4041b5d7 in select ()
   from /lib/tls/libc.so.6
  17 Thread 1097133008 (LWP 23779)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  16 Thread 1105521616 (LWP 23902)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  15 Thread 1115683792 (LWP 23994)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  14 Thread 1124072400 (LWP 24081)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  13 Thread 1133509584 (LWP 24104)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  12 Thread 1141898192 (LWP 24132)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  11 Thread 1150286800 (LWP 24133)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  10 Thread 1158675408 (LWP 24742)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  9 Thread 1167064016 (LWP 24743)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  8 Thread 1175452624 (LWP 25506)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  7 Thread 1183841232 (LWP 28037)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  6 Thread 1193278416 (LWP 7557)  0x4041b5d7 in select ()
   from /lib/tls/libc.so.6
  5 Thread 1201667024 (LWP 7558)  0x4041b5d7 in select ()
   from /lib/tls/libc.so.6
  4 Thread 1210055632 (LWP 7559)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  3 Thread 1218444240 (LWP 4658)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  2 Thread 1226832848 (LWP 4660)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
    () from /lib/tls/libpthread.so.0
  1 Thread 1078463360 (LWP 23777)  0x40174aad in pthread_join ()
   from /lib/tls/libpthread.so.0

(gdb) attach 23777
Attaching to program: /usr/sbin/slapd, process 23777
Reading symbols from /usr/lib/libldap_r.so.2...done.
Loaded symbols for /usr/lib/libldap_r.so.2
Reading symbols from /usr/lib/liblber.so.2...done.
Loaded symbols for /usr/lib/liblber.so.2
Reading symbols from /usr/lib/libdb-4.1.so...done.
Loaded symbols for /usr/lib/libdb-4.1.so
Reading symbols from /usr/lib/libiodbc.so.2...done.
Loaded symbols for /usr/lib/libiodbc.so.2
Reading symbols from /usr/lib/libiodbcinst.so.2...done.
Loaded symbols for /usr/lib/libiodbcinst.so.2
Reading symbols from /lib/tls/libpthread.so.0...done.
[New Thread 1078463360 (LWP 23777)]
[New Thread 1226832848 (LWP 4660)]
[New Thread 1218444240 (LWP 4658)]
[New Thread 1210055632 (LWP 7559)]
[New Thread 1201667024 (LWP 7558)]
[New Thread 1193278416 (LWP 7557)]
[New Thread 1183841232 (LWP 28037)]
[New Thread 1175452624 (LWP 25506)]
[New Thread 1167064016 (LWP 24743)]
[New Thread 1158675408 (LWP 24742)]
[New Thread 1150286800 (LWP 24133)]
[New Thread 1141898192 (LWP 24132)]
[New Thread 1133509584 (LWP 24104)]
[New Thread 1124072400 (LWP 24081)]
[New Thread 1115683792 (LWP 23994)]
[New Thread 1105521616 (LWP 23902)]
[New Thread 1097133008 (LWP 23779)]
[New Thread 1088744400 (LWP 23778)]
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /usr/lib/libslp.so.1...done.
Loaded symbols for /usr/lib/libslp.so.1
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/tls/libnsl.so.1...done.
Loaded symbols for /lib/tls/libnsl.so.1
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/lib/i686/cmov/libssl.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libssl.so.0.9.7
Reading symbols from /usr/lib/i686/cmov/libcrypto.so.0.9.7...done.
Loaded symbols for /usr/lib/i686/cmov/libcrypto.so.0.9.7
Reading symbols from /lib/tls/libcrypt.so.1...done.
Loaded symbols for /lib/tls/libcrypt.so.1
Reading symbols from /lib/tls/libresolv.so.2...done.
Loaded symbols for /lib/tls/libresolv.so.2
Reading symbols from /usr/lib/libltdl.so.3...done.
Loaded symbols for /usr/lib/libltdl.so.3
Reading symbols from /lib/tls/libdl.so.2...done.
Loaded symbols for /lib/tls/libdl.so.2
Reading symbols from /lib/libwrap.so.0...done.
Loaded symbols for /lib/libwrap.so.0
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/tls/libnss_files.so.2...done.
Loaded symbols for /lib/tls/libnss_files.so.2
Reading symbols from /usr/lib/sasl2/libsasldb.so.2...done.
Loaded symbols for /usr/lib/sasl2/libsasldb.so.2
Reading symbols from /usr/lib/libdb3.so.3...done.
Loaded symbols for /usr/lib/libdb3.so.3
Reading symbols from /usr/lib/sasl2/libcrammd5.so.2...done.
Loaded symbols for /usr/lib/sasl2/libcrammd5.so.2
Reading symbols from /usr/lib/sasl2/libdigestmd5.so.2...done.
Loaded symbols for /usr/lib/sasl2/libdigestmd5.so.2
Reading symbols from /usr/lib/sasl2/libotp.so.2...done.
Loaded symbols for /usr/lib/sasl2/libotp.so.2
Reading symbols from /usr/lib/sasl2/libanonymous.so.2...done.
Loaded symbols for /usr/lib/sasl2/libanonymous.so.2
Reading symbols from /usr/lib/sasl2/libplain.so.2...done.
Loaded symbols for /usr/lib/sasl2/libplain.so.2
Reading symbols from /usr/lib/sasl2/liblogin.so.2...done.
Loaded symbols for /usr/lib/sasl2/liblogin.so.2
Reading symbols from /usr/lib/sasl2/libntlm.so.2...done.
Loaded symbols for /usr/lib/sasl2/libntlm.so.2
Reading symbols from /usr/lib/ldap/back_ldbm.so...done.
Loaded symbols for /usr/lib/ldap/back_ldbm.so
0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
(gdb) thread apply all bt

Thread 18 (Thread 1088744400 (LWP 23778)):
#0  0x4041b5d7 in select () from /lib/tls/libc.so.6
#1  0x4017bb74 in __JCR_LIST__ () from /lib/tls/libpthread.so.0
#2  0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#3  0x0812145c in ?? ()

Thread 17 (Thread 1097133008 (LWP 23779)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 16 (Thread 1105521616 (LWP 23902)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 15 (Thread 1115683792 (LWP 23994)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 14 (Thread 1124072400 (LWP 24081)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 13 (Thread 1133509584 (LWP 24104)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 12 (Thread 1141898192 (LWP 24132)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 11 (Thread 1150286800 (LWP 24133)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 10 (Thread 1158675408 (LWP 24742)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 9 (Thread 1167064016 (LWP 24743)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 8 (Thread 1175452624 (LWP 25506)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 7 (Thread 1183841232 (LWP 28037)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 6 (Thread 1193278416 (LWP 7557)):
#0  0x4041b5d7 in select () from /lib/tls/libc.so.6
#1  0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
#2  0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
#3  0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
#4  0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
#5  0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
#6  0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
#7  0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
#8  0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
#9  0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
      {data = 0x471fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
#10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41867, rw=0)
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
#11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fda28, 
    op=0x81aba70, base=0x471ff8a8, nbase=0x471ff8a0, scope=2, deref=2, 
    slimit=1746, tlimit=3600, filter=0x41f62d68, filterstr=0x471ff898, 
    attrs=0x0, attrsonly=0)
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
#12 0x08059cd4 in do_search (conn=0x405fda28, op=0x81aba70)
    at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
#13 0x08057cdf in connection_operation (ctx=0x46972a98, arg_v=0x81aba70)
    at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
#14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
    at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
#15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#16 0x41f37ed4 in ?? ()

Thread 5 (Thread 1201667024 (LWP 7558)):
#0  0x4041b5d7 in select () from /lib/tls/libc.so.6
#1  0x40124448 in db_xa_switch_4001 () from /usr/lib/libdb-4.1.so
#2  0x400f2c93 in __memp_alloc_4001 () from /usr/lib/libdb-4.1.so
#3  0x400f40c1 in __memp_fget_4001 () from /usr/lib/libdb-4.1.so
#4  0x40095f3a in __bam_search_4001 () from /usr/lib/libdb-4.1.so
#5  0x4008c150 in __bam_c_rget_4001 () from /usr/lib/libdb-4.1.so
#6  0x400895d5 in __bam_c_dup_4001 () from /usr/lib/libdb-4.1.so
#7  0x400aae99 in __db_c_get_4001 () from /usr/lib/libdb-4.1.so
#8  0x400a4d84 in __db_get_4001 () from /usr/lib/libdb-4.1.so
#9  0x40577b55 in ldbm_fetch (ldbm=0x8125548, key=
      {data = 0x479fe6b8, size = 4, ulen = 0, dlen = 0, doff = 0, flags = 0})
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/ldbm.c:443
#10 0x4056c526 in id2entry_rw (be=0x8113d18, id=41847, rw=0)
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/id2entry.c:220
#11 0x40567753 in ldbm_back_search (be=0x8113d18, conn=0x405fe448, 
    op=0x81b2648, base=0x479ff8a8, nbase=0x479ff8a0, scope=2, deref=2, 
    slimit=1747, tlimit=3600, filter=0x49635270, filterstr=0x479ff898, 
    attrs=0x0, attrsonly=0)
    at /home/masneyb/openldap-2.1.25/servers/slapd/back-ldbm/search.c:316
#12 0x08059cd4 in do_search (conn=0x405fe448, op=0x81b2648)
    at /home/masneyb/openldap-2.1.25/servers/slapd/search.c:401
#13 0x08057cdf in connection_operation (ctx=0x41f89ba0, arg_v=0x81b2648)
    at /home/masneyb/openldap-2.1.25/servers/slapd/connection.c:943
#14 0x40027258 in ldap_int_thread_pool_wrapper (xpool=0x80be690)
    at /home/masneyb/openldap-2.1.25/libraries/libldap_r/tpool.c:432
#15 0x40173964 in start_thread () from /lib/tls/libpthread.so.0
#16 0x41f6db6c in ?? ()

Thread 4 (Thread 1210055632 (LWP 7559)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 3 (Thread 1218444240 (LWP 4658)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 2 (Thread 1226832848 (LWP 4660)):
#0  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib/tls/libpthread.so.0
#1  0x00000000 in ?? ()

Thread 1 (Thread 1078463360 (LWP 23777)):
#0  0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
#0  0x40174aad in pthread_join () from /lib/tls/libpthread.so.0
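
The `info threads` listing is easier to read when grouped by the function each
thread is stopped in: here that is 14 threads in pthread_cond_wait, 3 in
select, and 1 in pthread_join. Below is a minimal Python sketch of that
grouping. The name `summarize_threads` and the regular expression are
illustrative assumptions, and the pattern only expects the row format that gdb
printed in this session; other gdb versions may format the rows differently.

```python
import re
from collections import Counter

# Matches one row of gdb's "info threads" output as it appears above, e.g.
#   18 Thread 1088744400 (LWP 23778)  0x4041b5d7 in select ()
# An optional leading "*" (gdb's current-thread marker) is tolerated.
THREAD_ROW = re.compile(
    r"^\s*\*?\s*(\d+)\s+Thread\s+\d+\s+\(LWP\s+(\d+)\)"
    r"\s+0x[0-9a-f]+\s+in\s+([\w@.]+)"
)


def summarize_threads(gdb_output):
    """Count threads by the function they are currently stopped in."""
    counts = Counter()
    for line in gdb_output.splitlines():
        match = THREAD_ROW.match(line)
        if match:
            # Drop the symbol-version suffix, e.g. "@@GLIBC_2.3.2".
            counts[match.group(3).split("@")[0]] += 1
    return counts


# A three-row excerpt of the listing above, including the wrapped lines.
sample = """\
  18 Thread 1088744400 (LWP 23778)  0x4041b5d7 in select ()
   from /lib/tls/libc.so.6
  17 Thread 1097133008 (LWP 23779)  0x401760d5 in pthread_cond_wait@@GLIBC_2.3.2
() from /lib/tls/libpthread.so.0
  1 Thread 1078463360 (LWP 23777)  0x40174aad in pthread_join ()
   from /lib/tls/libpthread.so.0
"""

if __name__ == "__main__":
    for func, n in summarize_threads(sample).most_common():
        print(f"{n:3d}  {func}")
```

Continuation lines such as "from /lib/tls/libc.so.6" do not match the pattern
and are skipped, so the wrapped rows are each counted once.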


On Thu, Feb 05, 2004 at 09:59:13PM -0800, Howard Chu wrote:
> There is not enough information here to draw any useful conclusions. Try
> getting more info out of gdb, e.g.
>   info threads
>   thread apply all bt
> 
> Also, it appears that your symbol information is not intact in this
> backtrace, there's no way "ldap_pvt_thread_pool_destroy" would be an ancestor
> of any of these calls. Please make sure you use the correct binary when
> attaching the debugger. I don't believe the trace you provided bears any
> relation to reality.
> 
>   -- Howard Chu
>   Chief Architect, Symas Corp.       Director, Highland Sun
>   http://www.symas.com               http://highlandsun.com/hyc
>   Symas: Premier OpenSource Development and Support
> 
> > -----Original Message-----
> > From: owner-openldap-bugs@OpenLDAP.org
> > [mailto:owner-openldap-bugs@OpenLDAP.org]On Behalf Of masneyb@gftp.org
> 
> > Full_Name: Brian Masney
> > Version: 2.1.25 (20031217)
> > OS: Debian GNU/Linux
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (216.12.23.12)
> >
> >
> > There is a bug in slapd that it will hang whenever it's up for more
> > than a day. It will accept a TCP connection but it will not perform
> > any kind of reads and writes.
> > On our main LDAP master server, after slapd hung on the slave, the
> > entry it hung on was a delete request. I initially did a strace on
> > the hung slapd process and it showed this:
> >
> > futex(0x40e3ec18, FUTEX_WAIT, 14751, NULL <unfinished ...>
> >
> > Here is a gdb backtrace:
> >
> > (gdb) bt
> > #0  0x080501a1 in ber_memcalloc ()
> > #1  0x080696f5 in ch_calloc ()
> > #2  0x40556b5e in idl_alloc () from /usr/lib/ldap/back_ldbm.so
> > #3  0x40556b8b in idl_allids () from /usr/lib/ldap/back_ldbm.so
> > #4  0x40556c9a in idl_free () from /usr/lib/ldap/back_ldbm.so
> > #5  0x40557d7a in idl_delete_key () from /usr/lib/ldap/back_ldbm.so
> > #6  0x4055ce4f in dn2id_delete () from /usr/lib/ldap/back_ldbm.so
> > #7  0x40561853 in ldbm_back_delete () from /usr/lib/ldap/back_ldbm.so
> > #8  0x080684bb in do_delete ()
> > #9  0x08056517 in connection_done ()
> > #10 0x40026c58 in ldap_pvt_thread_pool_destroy () from /usr/lib/libldap_r.so.2
> > #11 0x40166964 in start_thread () from /lib/tls/libpthread.so.0
> > #12 0x42857854 in ?? ()
> >
> > Also, even though I'm using this on Debian, the source that I am
> > using is the official 2.1.25 (20031217) source without any of the
> > Debian GNU/TLS patches applied.
> >