
Re: (ITS#7037) slapd crashes with syncrepl when replica cleared out DB



ml+openldap@esmtp.org wrote:
> Full_Name: Claus Assmann
> Version: 2.4.26
> OS: Red Hat Enterprise Linux Server release 5.5
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (63.211.143.38)
>
>
> While trying to determine how syncrepl behaves in various error
> conditions, I encountered a crash of slapd.  A MASTER is set up to
> use push replication to a REPLICA as follows:
>
> database        ldap
> hidden          on
> suffix          ""
> rootdn          "cn=slapd-ldap"
> uri             ldap://REPLICA/
> lastmod         on
> restrict        all
> sync_use_subentry       true
>
> acl-bind        bindmethod=simple
>                  binddn="cn=Monitor"
>                  credentials=password
>
> syncrepl        rid=001
>                  provider=ldapi://%2Fvar%2Frun%2Fldapi
>                  binddn="cn=Manager"
>                  bindmethod=simple
>                  credentials=passwd
>                  searchbase=""
>                  type=refreshAndPersist
>                  retry="5 5 60 +"
>
> Reproduce:
> On REPLICA: stop slapd, clear out directory for DB, start
> slapd.

> On MASTER:
> add an entry that has to be "synced" to REPLICA;
> run ldapsearch to look up that entry on MASTER.
> slapd dumps core, backtrace:

Thanks for the report. There were two bugs here:
  1) syncrepl_add_glue_ancestors() was not constructing a valid DN for the
empty-suffix case, so a NULL DN reached ldap_add_ext() and tripped its
assert (frames #7 and #3 in the backtrace below).
  2) It shouldn't have tried to add glue entries in this case anyway, since
the underlying DB was pulled out from under it.

Both are fixed now in the master branch; please test.
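
To illustrate bug 1, here is a minimal standalone sketch of walking an
entry's ancestors up to a suffix without ever producing a NULL DN, including
when the suffix is "". It is not the actual patch: parent_dn() and
glue_ancestors() are hypothetical names, and the sketch assumes DNs are
already normalized (plain byte comparison, no case folding).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the parent of dn (the text after the first
 * unescaped comma), or "" when dn has only one RDN. Never returns NULL. */
static const char *parent_dn(const char *dn)
{
    for (const char *p = dn; *p; p++) {
        if (*p == '\\' && p[1]) { p++; continue; }  /* skip escaped char */
        if (*p == ',') return p + 1;
    }
    return "";
}

/* Hypothetical helper: store the ancestors of entry_dn that lie strictly
 * below suffix into out[], nearest ancestor first. Returns the number
 * stored (at most max), or -1 if entry_dn is not under suffix.
 * An empty suffix ("") is valid and means "walk up to the root". */
static int glue_ancestors(const char *entry_dn, const char *suffix,
                          const char **out, int max)
{
    assert(entry_dn != NULL && suffix != NULL);

    size_t elen = strlen(entry_dn), slen = strlen(suffix);

    /* entry_dn must end with suffix, on an RDN boundary. */
    if (slen > elen || strcmp(entry_dn + elen - slen, suffix) != 0)
        return -1;
    if (slen > 0 && elen > slen && entry_dn[elen - slen - 1] != ',')
        return -1;

    int n = 0;
    const char *dn = parent_dn(entry_dn);
    /* Stop at the suffix, or at the root when the suffix is empty. */
    while (*dn != '\0' && strcmp(dn, suffix) != 0 && n < max) {
        out[n++] = dn;
        dn = parent_dn(dn);
    }
    return n;
}
```

With entry DN "cn=a,ou=b,dc=c" and suffix "", the walk yields "ou=b,dc=c"
and then "dc=c", and stops at the root instead of handing back an empty or
NULL DN for an add.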

> #0  0x00002b7e51344265 in raise () from /lib64/libc.so.6
> #1  0x00002b7e51345d10 in abort () from /lib64/libc.so.6
> #2  0x00002b7e5133d6e6 in __assert_fail () from /lib64/libc.so.6
> #3  0x0000000000549546 in ldap_add_ext (ld=0xbba1bc0, dn=0x0, attrs=0xbeb7ed0,
>      sctrls=0x0, cctrls=0x0, msgidp=0x435224c4) at add.c:126
> #4  0x00000000004dbe6d in ldap_back_add (op=0x43523150, rs=0x43522770)
>      at add.c:102
> #5  0x0000000000481152 in overlay_op_walk (op=0x43523150, rs=0x43522770,
>      which=op_add, oi=0xb9eb6c0, on=0x0) at backover.c:671
> #6  0x00000000004816a7 in over_op_func (op=0x43523150, rs=0x43522770,
>      which=op_add) at backover.c:723
> #7  0x0000000000473706 in syncrepl_add_glue_ancestors (op=0x43523150,
>      e=0x2b7e56628fb8) at syncrepl.c:3149
> #8  0x000000000047384e in syncrepl_add_glue (op=0x43523150, e=0x2b7e56628fb8)
>      at syncrepl.c:3193
> #9  0x0000000000474795 in syncrepl_entry (si=0xb9eb1c0, op=0x43523150,
>      entry=0x0, modlist=0x43523ca0, syncstate=<value optimized out>,
>      syncUUID=<value optimized out>, syncCSN=0xbeb86e0) at syncrepl.c:2448
> #10 0x000000000047c39b in do_syncrep2 (ctx=<value optimized out>,
>      arg=<value optimized out>) at syncrepl.c:982
> #11 do_syncrepl (ctx=<value optimized out>, arg=<value optimized out>)
>      at syncrepl.c:1489
> #12 0x000000000041eee3 in connection_read_thread (ctx=0x43523da0,
>      argv=<value optimized out>) at connection.c:1276
> #13 0x00000000005412dc in ldap_int_thread_pool_wrapper (xpool=0xb8fa1c0)
>      at tpool.c:685
> #14 0x00002b7e510ff73d in start_thread () from /lib64/libpthread.so.0
> #15 0x00002b7e513e84bd in clone () from /lib64/libc.so.6
>
> Note: this also happens with 2.4.23. On a VM instance, it causes
> a different error (see mail on list: "killed after 120 seconds")
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/