OpenLDAP
Up to top level
Build   Contrib   Development   Documentation   Historical   Incoming   Software Bugs   Software Enhancements   Web  

Logged in as guest

Viewing Software Bugs/6089
Full headers

From: jwm@horde.net
Subject: connection_destroy() c->c_writers == 0 assertion failure
Compose comment
Download message
State:
0 replies:
5 followups: 1 2 3 4 5

Major security issue: yes  no

Notes:

Notification:


Date: Fri, 01 May 2009 21:48:43 +0000
From: jwm@horde.net
To: openldap-its@OpenLDAP.org
Subject: connection_destroy() c->c_writers == 0 assertion failure
Full_Name: John Morrissey
Version: 2.4.16
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (2001:4978:194:0:21f:5bff:fee9:da92)


After about 7 days of uptime, slapd catches a SIGABRT and dies. The failed
assertion is in connection_close():

    assert( c->c_writers == 0);

Backtrace is below. I also have a ~1.7GB core if it's deemed useful; I'll keep
it around for a week or two. This is with BDB 4.7+all three patches.

GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Attaching to program: /usr/sbin/slapd, process 9365
Reading symbols from /usr/lib/libtcmalloc.so.0...done.
Loaded symbols for /usr/lib/libtcmalloc.so.0
Reading symbols from /usr/lib/libprofiler.so.0...done.
Loaded symbols for /usr/lib/libprofiler.so.0
Reading symbols from /usr/lib/libldap_r-2.4.so.2...Reading symbols from
/usr/lib/debug/usr/lib/libldap_r-2.4.so.2.4.2...done.
done.
Loaded symbols for /usr/lib/libldap_r-2.4.so.2
Reading symbols from /usr/lib/liblber-2.4.so.2...Reading symbols from
/usr/lib/debug/usr/lib/liblber-2.4.so.2.4.2...done.
done.
Loaded symbols for /usr/lib/liblber-2.4.so.2
Reading symbols from /usr/lib/libdb-4.7.so...done.
Loaded symbols for /usr/lib/libdb-4.7.so
Reading symbols from /usr/lib/libodbc.so.1...done.
Loaded symbols for /usr/lib/libodbc.so.1
Reading symbols from /usr/lib/libslp.so.1...done.
Loaded symbols for /usr/lib/libslp.so.1
Reading symbols from /usr/lib/libsasl2.so.2...done.
Loaded symbols for /usr/lib/libsasl2.so.2
Reading symbols from /usr/lib/libgnutls.so.26...Reading symbols from
/usr/lib/debug/usr/lib/libgnutls.so.26.4.6...done.
done.
Loaded symbols for /usr/lib/libgnutls.so.26
Reading symbols from /usr/lib/debug/libcrypt.so.1...done.
Loaded symbols for /usr/lib/debug/libcrypt.so.1
Reading symbols from /usr/lib/debug/libresolv.so.2...done.
Loaded symbols for /usr/lib/debug/libresolv.so.2
Reading symbols from /usr/lib/libltdl.so.3...done.
Loaded symbols for /usr/lib/libltdl.so.3
Reading symbols from /usr/lib/debug/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0xb78648e0 (LWP 9365)]
[New Thread 0x60e2eb90 (LWP 9371)]
[New Thread 0x6122eb90 (LWP 9370)]
[New Thread 0x61644b90 (LWP 9369)]
[New Thread 0x61a54b90 (LWP 9368)]
[New Thread 0x61e54b90 (LWP 9367)]
Loaded symbols for /usr/lib/debug/libpthread.so.0
Reading symbols from /usr/lib/debug/libc.so.6...done.
Loaded symbols for /usr/lib/debug/libc.so.6
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /usr/lib/debug/libm.so.6...done.
Loaded symbols for /usr/lib/debug/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /usr/lib/debug/libnsl.so.1...done.
Loaded symbols for /usr/lib/debug/libnsl.so.1
Reading symbols from /usr/lib/debug/libdl.so.2...done.
Loaded symbols for /usr/lib/debug/libdl.so.2
Reading symbols from /usr/lib/libtasn1.so.3...done.
Loaded symbols for /usr/lib/libtasn1.so.3
Reading symbols from /usr/lib/libgpg-error.so.0...done.
Loaded symbols for /usr/lib/libgpg-error.so.0
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /usr/lib/libgcrypt.so.11...done.
Loaded symbols for /usr/lib/libgcrypt.so.11
Reading symbols from /lib/ld-linux.so.2...Reading symbols from
/usr/lib/debug/lib/ld-2.7.so...done.
done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /usr/lib/debug/libnss_files.so.2...done.
Loaded symbols for /usr/lib/debug/libnss_files.so.2
Reading symbols from /usr/lib/debug/libnss_compat.so.2...done.
Loaded symbols for /usr/lib/debug/libnss_compat.so.2
Reading symbols from /usr/lib/debug/libnss_nis.so.2...done.
Loaded symbols for /usr/lib/debug/libnss_nis.so.2
Reading symbols from /lib/libnss_ldap.so.2...done.
Loaded symbols for /lib/libnss_ldap.so.2
Reading symbols from /usr/lib/libkrb5.so.3...done.
Loaded symbols for /usr/lib/libkrb5.so.3
Reading symbols from /lib/libcom_err.so.2...done.
Loaded symbols for /lib/libcom_err.so.2
Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
Loaded symbols for /usr/lib/libgssapi_krb5.so.2
Reading symbols from /usr/lib/libk5crypto.so.3...done.
Loaded symbols for /usr/lib/libk5crypto.so.3
Reading symbols from /usr/lib/libkrb5support.so.0...done.
Loaded symbols for /usr/lib/libkrb5support.so.0
Reading symbols from /lib/libkeyutils.so.1...done.
Loaded symbols for /lib/libkeyutils.so.1
Reading symbols from /usr/lib/sasl2/libsasldb.so.2...done.
Loaded symbols for /usr/lib/sasl2/libsasldb.so.2
Reading symbols from /usr/lib/libdb-

Message of length 39073 truncated

Followup 1

Download message
Date: Tue, 5 May 2009 12:22:58 -0400
From: John Morrissey <jwm@horde.net>
To: openldap-its@OpenLDAP.org
Subject: Re: (ITS#6089) connection_destroy() c->c_writers == 0 assertion
	failure
Poked around a bit in the core:

(gdb) print c->c_writers
$1 = -1
(gdb) print c->c_pending_ops->stqh_last[0]
$5 = (struct Operation *) 0x0
(gdb) print c->c_n_ops_pending
$6 = 0

So there are no pending ops on this connection, but c_writers == -1
indicates that one blocked writer remains to notice that the connection has
been closed.

I looked at send_ldap_ber() and can't immediately find fault with its
manipulation of c_writers. connection_closing() wakes up all blocked
writers, so by the time connection_close() calls connection_destroy() and
this assertion is checked, c_writers should be 0. Furthermore,
c->c_conn_state is SLAP_C_CLOSING (0x4), which can only happen in
connection_closing().

This is about as far as I can get; any other ideas?

john


(gdb) print *c
$2 = {c_struct_state = 2, c_conn_state = 4, c_conn_idx = 1878, c_sd = 1878, 
  c_close_reason = 0x0, c_mutex = {__data = {__lock = 1, __count = 0, 
      __owner = 9540, __kind = 0, __nusers = 1, {__spins = 0, __list = {
          __next = 0x0}}}, 
    __size = "\001\000\000\000\000\000\000\000D%\000\000\000\000\000\000\001\000\000\000\000\000\000",
__align = 1}, c_sb = 0x175c7240, 
  c_starttime = 1240942246, c_activitytime = 1240942264, c_connid = 73097614, 
  c_peer_domain = {bv_len = 7, bv_val = 0x137e02c8 "unknown"}, c_peer_name = {
    bv_len = 23, bv_val = 0x191c0be0 "IP=172.16.182.103:50023"}, 
  c_listener = 0x9bb7370, c_sasl_bind_mech = {bv_len = 0, bv_val = 0x0}, 
  c_sasl_dn = {bv_len = 0, bv_val = 0x0}, c_sasl_authz_dn = {bv_len = 0, 
    bv_val = 0x0}, c_authz_backend = 0x9bba500, c_authz_cookie = 0x0, 
  c_authz = {sai_method = 128, sai_mech = {bv_len = 0, bv_val = 0x0}, 
    sai_dn = {bv_len = 27, bv_val = 0x1876bd20 "cn=nsspluspasswd,o=example2"}, 
    sai_ndn = {bv_len = 27, 
      bv_val = 0x189f5780 "cn=nsspluspasswd,o=example2"}, sai_ssf = 128, 
    sai_transport_ssf = 0, sai_tls_ssf = 128, sai_sasl_ssf = 0}, 
  c_protocol = 3, c_ops = {stqh_first = 0x0, stqh_last = 0x9d7df18}, 
  c_pending_ops = {stqh_first = 0x0, stqh_last = 0x9d7df20}, c_write1_mutex = {
    __data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {
        __spins = 0, __list = {__next = 0x0}}}, 
    __size = '\0' <repeats 23 times>, __align = 0}, c_write1_cv = {__data
= {
      __lock = 0, __futex = 2, __total_seq = 1, __wakeup_seq = 1, 
      __woken_seq = 1, __mutex = 0x9d7df28, __nwaiters = 0, 
      __broadcast_seq = 1}, 
    __size = "\000\000\000\000\002\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000(##\t\000\000\000\000\001\000\000\000\000\000\000",
__align = 8589934592}, c_write2_mutex = {
    __data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 0, {
        __spins = 0, __list = {__next = 0x0}}}, 
    __size = '\0' <repeats 23 times>, __align = 0}, c_write2_cv = {__data
= {
      __lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0, 
      __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, 
    __size = '\0' <repeats 47 times>, __align = 0}, c_currentber = 0x0, 
  c_writers = -1, c_sasl_bind_in_progress = 0 '\0', c_writewaiter = 0 '\0', 
  c_is_tls = 1 '\001', c_needs_tls_accept = 0 '\0', c_sasl_layers = 0 '\0', 
  c_sasl_done = 0 '\0', c_sasl_authctx = 0x13497200, c_sasl_sockctx = 0x0, 
  c_sasl_extra = 0x14d956c0, c_sasl_bindop = 0x0, c_pagedresults_state = {
    ps_be = 0x0, ps_size = 0, ps_count = 0, ps_cookie = 0, ps_cookieval = {
      bv_len = 0, bv_val = 0x0}}, c_n_ops_received = 6, c_n_ops_executing = 0, 
  c_n_ops_pending = 0, c_n_ops_completed = 6, c_n_get = 4, c_n_read = 4, 
  c_n_write = 0, c_extensions = 0x0, c_clientfunc = 0, c_clientarg = 0x0, 
  c_send_ldap_result = 0x808aea0 <slap_send_ldap_result>, 
  c_send_search_entry = 0x80885e0 <slap_send_search_entry>, 
  c_send_search_reference = 0x8087da0 <slap_send_search_reference>, 
  c_send_ldap_extended = 0x808a910 <slap_send_ldap_extended>, 
  c_send_ldap_intermediate = 0x808a760 <slap_send_ldap_intermediate>}

-- 
John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__



Followup 2

Download message
Date: Wed, 06 May 2009 02:18:30 -0700
From: Howard Chu <hyc@symas.com>
To: jwm@horde.net
CC: openldap-its@openldap.org
Subject: Re: (ITS#6089) connection_destroy() c->c_writers == 0 assertion	failure
jwm@horde.net wrote:
> Poked around a bit in the core:
>
> (gdb) print c->c_writers
> $1 = -1
> (gdb) print c->c_pending_ops->stqh_last[0]
> $5 = (struct Operation *) 0x0
> (gdb) print c->c_n_ops_pending
> $6 = 0
>
> So there are no pending ops on this connection, but c_writers == -1
> indicates that one blocked writer remains to notice that the connection has
> been closed.
>
> I looked at send_ldap_ber() and can't immediately find fault with its
> manipulation of c_writers. connection_closing() wakes up all blocked
> writers, so by the time connection_close() calls connection_destroy() and
> this assertion is checked, c_writers should be 0. Furthermore,
> c->c_conn_state is SLAP_C_CLOSING (0x4), which can only happen in
> connection_closing().
>
> This is about as far as I can get; any other ideas?

I'm stumped. The only way out of connection_closing() is for c_writers to go 
to zero. And from your core file, there are no other threads blocked on any of 
the write mutexes or condition variables. You can reproduce this easily?

> john
>
>
> (gdb) print *c
> $2 = {c_struct_state = 2, c_conn_state = 4, c_conn_idx = 1878, c_sd = 1878,
>    c_close_reason = 0x0, c_mutex = {__data = {__lock = 1, __count = 0,
>        __owner = 9540, __kind = 0, __nusers = 1, {__spins = 0, __list = {
>            __next = 0x0}}},
>      __size = "\001\000\000\000\000\000\000\000D%\000\000\000\000\000\000\001\000\000\000\000\000\000",
__align = 1}, c_sb = 0x175c7240,
>    c_starttime = 1240942246, c_activitytime = 1240942264, c_connid =
73097614,
>    c_peer_domain = {bv_len = 7, bv_val = 0x137e02c8 "unknown"}, c_peer_name
= {
>      bv_len = 23, bv_val = 0x191c0be0 "IP=172.16.182.103:50023"},
>    c_listener = 0x9bb7370, c_sasl_bind_mech = {bv_len = 0, bv_val = 0x0},
>    c_sasl_dn = {bv_len = 0, bv_val = 0x0}, c_sasl_authz_dn = {bv_len = 0,
>      bv_val = 0x0}, c_authz_backend = 0x9bba500, c_authz_cookie = 0x0,
>    c_authz = {sai_method = 128, sai_mech = {bv_len = 0, bv_val = 0x0},
>      sai_dn = {bv_len = 27, bv_val = 0x1876bd20
"cn=nsspluspasswd,o=example2"},
>      sai_ndn = {bv_len = 27,
>        bv_val = 0x189f5780 "cn=nsspluspasswd,o=example2"}, sai_ssf = 128,
>      sai_transport_ssf = 0, sai_tls_ssf = 128, sai_sasl_ssf = 0},
>    c_protocol = 3, c_ops = {stqh_first = 0x0, stqh_last = 0x9d7df18},
>    c_pending_ops = {stqh_first = 0x0, stqh_last = 0x9d7df20},
c_write1_mutex = {
>      __data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers =
0, {
>          __spins = 0, __list = {__next = 0x0}}},
>      __size = '\0'<repeats 23 times>, __align = 0}, c_write1_cv =
{__data = {
>        __lock = 0, __futex = 2, __total_seq = 1, __wakeup_seq = 1,
>        __woken_seq = 1, __mutex = 0x9d7df28, __nwaiters = 0,
>        __broadcast_seq = 1},
>      __size = "\000\000\000\000\002\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000(##\t\000\000\000\000\001\000\000\000\000\000\000",
__align = 8589934592}, c_write2_mutex = {
>      __data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers =
0, {
>          __spins = 0, __list = {__next = 0x0}}},
>      __size = '\0'<repeats 23 times>, __align = 0}, c_write2_cv =
{__data = {
>        __lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 0,
>        __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq =
0},
>      __size = '\0'<repeats 47 times>, __align = 0}, c_currentber =
0x0,
>    c_writers = -1, c_sasl_bind_in_progress = 0 '\0', c_writewaiter = 0
'\0',
>    c_is_tls = 1 '\001', c_needs_tls_accept = 0 '\0', c_sasl_layers = 0
'\0',
>    c_sasl_done = 0 '\0', c_sasl_authctx = 0x13497200, c_sasl_sockctx = 0x0,
>    c_sasl_extra = 0x14d956c0, c_sasl_bindop = 0x0, c_pagedresults_state = {
>      ps_be = 0x0, ps_size = 0, ps_count = 0, ps_cookie = 0, ps_cookieval =
{
>        bv_len = 0, bv_val = 0x0}}, c_n_ops_received = 6, c_n_ops_executing
= 0,
>    c_n_ops_pending = 0, c_n_ops_completed = 6, c_n_get = 4, c_n_read = 4,
>    c_n_write = 0, c_extensions = 0x0, c_clientfunc = 0, c_clientarg = 0x0,
>    c_send_ldap_result = 0x808aea0<slap_send_ldap_result>,
>    c_send_search_entry = 0x80885e0<slap_send_search_entry>,
>    c_send_search_reference = 0x8087da0<slap_send_search_reference>,
>    c_send_ldap_extended = 0x808a910<slap_send_ldap_extended>,
>    c_send_ldap_intermediate = 0x808a760<slap_send_ldap_intermediate>}
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/



Followup 3

Download message
Date: Thu, 7 May 2009 20:12:25 -0400
From: John Morrissey <jwm@horde.net>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@openldap.org
Subject: Re: (ITS#6089) connection_destroy() c->c_writers == 0 assertion
	failure
On Wed, May 06, 2009 at 02:18:30AM -0700, Howard Chu wrote:
> jwm@horde.net wrote:
>> Poked around a bit in the core:
>>
>> (gdb) print c->c_writers
>> $1 = -1
>> (gdb) print c->c_pending_ops->stqh_last[0]
>> $5 = (struct Operation *) 0x0
>> (gdb) print c->c_n_ops_pending
>> $6 = 0
>>
>> So there are no pending ops on this connection, but c_writers == -1
>> indicates that one blocked writer remains to notice that the connection
has
>> been closed.
>>
>> I looked at send_ldap_ber() and can't immediately find fault with its
>> manipulation of c_writers. connection_closing() wakes up all blocked
>> writers, so by the time connection_close() calls connection_destroy()
and
>> this assertion is checked, c_writers should be 0. Furthermore,
>> c->c_conn_state is SLAP_C_CLOSING (0x4), which can only happen in
>> connection_closing().
>>
>> This is about as far as I can get; any other ideas?
>
> I'm stumped. The only way out of connection_closing() is for c_writers to 
> go to zero. And from your core file, there are no other threads blocked 
> on any of the write mutexes or condition variables. You can reproduce 
> this easily?

It's happened once or twice, but it hasn't had much time since this is the
same machine I just got done chasing the i386 heap fragmentation on. Our
backtrace/core/BDB harness is still on this slapd, so we should have
information if it happens again.

john
-- 
John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__



Followup 4

Download message
Date: Tue, 30 Jun 2009 18:05:09 -0700
From: Howard Chu <hyc@symas.com>
To: John Morrissey <jwm@horde.net>
CC: openldap-its@openldap.org
Subject: Re: (ITS#6089) connection_destroy() c->c_writers == 0 assertion	failure
John Morrissey wrote:
> It's happened once or twice, but it hasn't had much time since this is the
> same machine I just got done chasing the i386 heap fragmentation on. Our
> backtrace/core/BDB harness is still on this slapd, so we should have
> information if it happens again.

I was able to reproduce this today, a fix is now in HEAD. Please test, thanks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/



Followup 5

Download message
Date: Tue, 30 Jun 2009 22:27:38 -0400
From: John Morrissey <jwm@horde.net>
To: Howard Chu <hyc@symas.com>
Cc: openldap-its@openldap.org
Subject: Re: (ITS#6089) connection_destroy() c->c_writers == 0 assertion
	failure
On Tue, Jun 30, 2009 at 06:05:09PM -0700, Howard Chu wrote:
> John Morrissey wrote:
> > It's happened once or twice, but it hasn't had much time since this is
> > the same machine I just got done chasing the i386 heap fragmentation
on.
> > Our backtrace/core/BDB harness is still on this slapd, so we should
have
> > information if it happens again.
>
> I was able to reproduce this today, a fix is now in HEAD. Please test,
> thanks.

Sorry Howard, we've retired our i386 machines and haven't seen this with
RE24 as of 20090511 on amd64.

john
-- 
John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__


Up to top level
Build   Contrib   Development   Documentation   Historical   Incoming   Software Bugs   Software Enhancements   Web  

Logged in as guest


The OpenLDAP Issue Tracking System uses a hacked version of JitterBug

______________
© Copyright 2013, OpenLDAP Foundation, info@OpenLDAP.org