7113 – broken read-only replica results in assertion failure and crash on master slapd process

Issue 7113 - broken read-only replica results in assertion failure and crash on master slapd process

Summary: broken read-only replica results in assertion failure and crash on master sla...

Status:	VERIFIED FIXED

Alias:	None

Product:	OpenLDAP
Classification:	Unclassified
Component:	slapd (show other issues)
Version:	2.4.28
Hardware:	All All

Importance:	--- normal
Target Milestone:	---
Assignee:	OpenLDAP project

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-12-19 19:18 UTC by kacarstensen@csupomona.edu
Modified:	2014-08-01 21:04 UTC (History)
CC List:	0 users

See Also:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.

Description kacarstensen@csupomona.edu 2011-12-19 19:18:50 UTC

Full_Name: Kevan Carstensen
Version: 2.4.28
OS: gentoo linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (134.71.248.23)


During a heavy update load, replication from our master to one of our read-only
ldap replicas started failing due to locking errors on the replica.

Dec 14 04:01:08 pip-dev slapd[12645]: bdb(dc=csupomona,dc=edu): Lock table is
out of available lock entries
Dec 14 04:01:08 pip-dev slapd[12645]: => bdb_idl_delete_key: c_get failed:
Cannot allocate memory (12)
Dec 14 04:01:08 pip-dev slapd[12645]: conn=-1 op=0: attribute "memberUid" index
delete failure
Dec 14 04:01:08 pip-dev slapd[12645]: null_callback : error code 0x50
Dec 14 04:01:08 pip-dev slapd[12645]: syncrepl_entry: rid=001 be_modify failed
(80)
Dec 14 04:01:08 pip-dev slapd[12645]: do_syncrepl: rid=001 rc 80 retrying

(these are due to an incorrect DB_CONFIG on the replica and are not related to
this issue; I include them in case their impact on how/when the replica connects
to the master are important)

After about ten minutes of the replica trying and failing to start its syncrepl
again, the master slapd crashed. Nothing in the master slapd logs specifically
references the crash or indicates why it happened. Shortly after the replica
broke, some unusual messages started showing up in the logs on the master:

Dec 14 03:50:11 fosse slapd[22596]: connection_read(65): no connection!
Dec 14 03:50:22 fosse slapd[22596]: send_search_entry: conn 7535053  ber write
failed.
Dec 14 03:50:32 fosse slapd[22596]: send_search_entry: conn 7535061  ber write
failed.
Dec 14 03:50:42 fosse slapd[22596]: send_search_entry: conn 7535065  ber write
failed.
Dec 14 03:50:56 fosse slapd[22596]: send_search_entry: conn 7535068  ber write
failed.
Dec 14 03:51:06 fosse slapd[22596]: connection_read(65): no connection!
Dec 14 03:51:06 fosse slapd[22596]: connection_read(65): no connection!

These continued until the crash, persisted after the master slapd was
successfully restarted, and went away when I turned the broken replica off.

I found similar behavior in a development setup similar to our production setup.
There is one master, and there are two read-only replicas configured to
replicate from the master with the following syncrepl stanza:

syncrepl        rid=1
                provider=ldaps://master-2.ldap.csupomona.edu/
                type=refreshAndPersist
                retry="10 10 60 +"
                searchbase="dc=csupomona,dc=edu"
                bindmethod=simple
                binddn=cn=slapd-syncrepl,ou=user,ou=service,dc=csupomona,dc=edu
                credentials=__PASSWORD__

(there are two replicas because I couldn't reliably produce a crash with one
replica)

The replicas have a DB_CONFIG that causes their replication attempts to fail
with the same error as we saw in our production setup. The master database is
initialized so that the replicas have a lot of changes to catch up on, as was
the case when our production setup failed. Even with loglevel -1, there's
nothing specifically referencing the crash in the master slapd's logs. There's
an assertion failure in the backtrace:

slapd: daemon.c:942: slapd_clr_write: Assertion
`((slap_daemon[id].sd_index[(s)]) != -1)' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7fffea863700 (LWP 27623)]
0x00007ffff6a95175 in raise () from /lib64/libc.so.6
(gdb) bt full
#0  0x00007ffff6a95175 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007ffff6a96590 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007ffff6a8e201 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#3  0x000000000042ec9b in slapd_clr_write (s=<value optimized out>, wake=0) at
daemon.c:942
        id = 0
        __PRETTY_FUNCTION__ = "slapd_clr_write"
#4  0x000000000043419b in connection_write (s=17) at connection.c:1896
        c = <value optimized out>
        op = <value optimized out>
        __PRETTY_FUNCTION__ = "connection_write"
#5  0x000000000043006a in slapd_daemon_task (ptr=<value optimized out>) at
daemon.c:2782
        rc = 1
        fd = 17
        w = 6
        ns = 1
        at = 0
        revents = <value optimized out>
        tvp = <value optimized out>
        cat = {tv_sec = 1324320980, tv_usec = 0}
        i = 0
        now = <value optimized out>
        tv = {tv_sec = 1135, tv_usec = 0}
        rtask = <value optimized out>
        l = <value optimized out>
        last_idle_check = 1324317380
        ebadf = 0
        tid = <value optimized out>
#6  0x00007ffff71ed914 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#7  0x00007ffff6b347dd in clone () from /lib64/libc.so.6
No symbol table info available.

The master and replicas in my development setup all run 2.4.28. The production
master runs 2.4.26. I successfully reproduced the issue on both 2.4.26 and
2.4.28, but only have a backtrace for 2.4.28. If you want a backtrace for
2.4.26, I can get one.

Thanks.

Comment 1 kacarstensen@csupomona.edu 2012-01-17 23:49:30 UTC

I've investigated this issue a little bit more since my initial bug
report.

With loglevel conns sync in slapd.conf, the following are the last
messages written to the log before slapd crashes:

  Jan 12 09:33:19 shelley slapd[2482]:  
  Jan 12 09:33:19 shelley slapd[2482]: daemon: write active on 15 
  Jan 12 09:33:19 shelley slapd[2482]: daemon: epoll: listen=9 active_threads=0 tvp=zero 
  Jan 12 09:33:19 shelley slapd[2482]: daemon: epoll: listen=10 active_threads=0 tvp=zero 
  Jan 12 09:33:19 shelley slapd[2482]: daemon: activity on 1 descriptor 
  Jan 12 09:33:19 shelley slapd[2482]: daemon: activity on: 
  Jan 12 09:33:19 shelley slapd[2482]:  15rw 
 
  Jan 12 09:33:19 shelley slapd[2482]: ber_flush2 failed errno=32 reason="Broken pipe" 
  Jan 12 09:33:19 shelley slapd[2482]: connection_closing: readying conn=1377 sd=15 for close 
  Jan 12 09:33:19 shelley slapd[2482]: send_search_entry: conn 1377  ber write failed. 
  Jan 12 09:33:19 shelley slapd[2482]: connection_resched: attempting closing conn=1377 sd=15 
  Jan 12 09:33:19 shelley slapd[2482]: daemon: removing 15 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]: connection_get(15): connection not used 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15): no connection! 
  Jan 12 09:33:19 shelley slapd[2482]: connection_read(15) error 
  Jan 12 09:33:19 shelley slapd[2482]:  
  Jan 12 09:33:19 shelley slapd[2482]: daemon: write active on 15

"daemon: write active on 15" comes from line 2660 or so of daemon.c.
Here's that with some context:

    for ( i = 0; nwfds > 0; i++ ) {
        ber_socket_t wd;
        if ( ! SLAP_EVENT_IS_WRITE( i ) ) continue;
        wd = i;

        SLAP_EVENT_CLR_WRITE( wd );
        nwfds--;

        Debug(LDAP_DEBUG_CONNS,
            "daemon: write active on %d\n",
            wd, 0, 0);

        /*
         * NOTE: it is possible that the connection was closed and that
         * the stream is now inactive.  connection_write() must validate
         * the stream is still active.
         *
         * ITS#4338: if the stream is invalid, there is no need to close it
         * here.  It has already been closed in connection.c.
         */
        if ( connection_write( wd ) < 0 ) {
            if ( SLAP_EVENT_IS_READ( wd)) {
                SLAP_EVENT_CLR_READ( (unsigned) wd);
                nrfds--;
            }
        }
    }

We do some housekeeping on the descriptor, then call connection_write to
take care of the write. Here's connection_write from connection.c:

  int connection_write(ber_socket_t s)
  {
      Connection *c;
      Operation *op;
  
      assert( connections != NULL );
  
      slapd_clr_write( s, 0 );

      c = connection_get( s );
      if( c == NULL ) {
          Debug( LDAP_DEBUG_ANY,
              "connection_write(%ld): no connection!\n",
              (long)s, 0, 0 );
          return -1;
      }
      [...]

connection_get seems to perform the validation that is mentioned by the
comment in daemon.c, but connection_get is called after slapd_clr_write,
which is the source of the assertion implicated by the backtrace in the
initial bug report:

  void
  slapd_clr_write( ber_socket_t s, int wake )
  {
    int id = DAEMON_ID(s);
    ldap_pvt_thread_mutex_lock( &slap_daemon[id].sd_mutex );
  
    if ( SLAP_SOCK_IS_WRITE( id, s )) {
        assert( SLAP_SOCK_IS_ACTIVE( id, s ));
  
        SLAP_SOCK_CLR_WRITE( id, s );
        slap_daemon[id].sd_nwriters--;
    }
  
    ldap_pvt_thread_mutex_unlock( &slap_daemon[id].sd_mutex );
    WAKE_LISTENER(id,wake);
  }

I'm not sure if connection_write is supposed to validate that a stream
is active before or after calling slapd_clr_write, but it seems like the
assertion wouldn't be an issue if that validation were performed before
calling slapd_clr_write. To test this thought, I rebuilt openldap 2.4.28
with the following patch:

  --- openldap-2.4.28/servers/slapd/connection.c    2011-11-25 10:52:29.000000000 -0800
  +++ openldap-2.4.28-new/servers/slapd/connection.c    2012-01-12 13:35:45.000000000 -0800
  @@ -1893,8 +1893,6 @@
   
    assert( connections != NULL );
   
  - slapd_clr_write( s, 0 );
  -
    c = connection_get( s );
    if( c == NULL ) {
        Debug( LDAP_DEBUG_ANY,
  @@ -1903,6 +1901,8 @@
        return -1;
    }
   
  + slapd_clr_write( s, 0 );
  +
   #ifdef HAVE_TLS
    if ( c->c_is_tls && c->c_needs_tls_accept ) {
        connection_return( c );

and tried to reproduce the problem under the same circumstances as
reported in my initial bug report. The master slapd tolerated the
misconfigured replicas for 5 days without crashing; before, it would
crash reliably within a half hour or so. I didn't notice any regressions
due to the patch, though the master slapd wasn't exposed to a typical
workload during the experiment.

Any thoughts on this patch?
-- 
Kevan Carstensen                        <kacarstensen@csupomona.edu>
Operating Systems Analyst, I&IT Systems, Cal Poly Pomona

Comment 2 Howard Chu 2012-01-22 01:28:43 UTC

kacarstensen@csupomona.edu wrote:
> I've investigated this issue a little bit more since my initial bug
> report.

> I'm not sure if connection_write is supposed to validate that a stream
> is active before or after calling slapd_clr_write, but it seems like the
> assertion wouldn't be an issue if that validation were performed before
> calling slapd_clr_write. To test this thought, I rebuilt openldap 2.4.28
> with the following patch:
>
>    --- openldap-2.4.28/servers/slapd/connection.c    2011-11-25 10:52:29.000000000 -0800
>    +++ openldap-2.4.28-new/servers/slapd/connection.c    2012-01-12 13:35:45.000000000 -0800
>    @@ -1893,8 +1893,6 @@
>
>      assert( connections != NULL );
>
>    - slapd_clr_write( s, 0 );
>    -
>      c = connection_get( s );
>      if( c == NULL ) {
>          Debug( LDAP_DEBUG_ANY,
>    @@ -1903,6 +1901,8 @@
>          return -1;
>      }
>
>    + slapd_clr_write( s, 0 );
>    +
>     #ifdef HAVE_TLS
>      if ( c->c_is_tls&&  c->c_needs_tls_accept ) {
>          connection_return( c );
>
> and tried to reproduce the problem under the same circumstances as
> reported in my initial bug report. The master slapd tolerated the
> misconfigured replicas for 5 days without crashing; before, it would
> crash reliably within a half hour or so. I didn't notice any regressions
> due to the patch, though the master slapd wasn't exposed to a typical
> workload during the experiment.
>
> Any thoughts on this patch?

Sounds OK to me, committed to git master. Thanks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 3 Howard Chu 2012-01-22 01:39:58 UTC

changed notes
changed state Open to Test
moved from Incoming to Software Bugs

Comment 4 Quanah Gibson-Mount 2012-01-23 19:29:08 UTC

changed notes
changed state Test to Release

Comment 5 Quanah Gibson-Mount 2012-02-13 19:06:04 UTC

changed notes
changed state Release to Closed

Comment 6 OpenLDAP project 2014-08-01 21:04:40 UTC

fixed in master
fixed in RE24