
Re: (ITS#7723) slapd crashes on multi core machines if a search request is *immediately* followed by an unbind



On 10/15/2013 02:03 PM, hyc@symas.com wrote:
> Yes, your test config shows that the problem is that rwm_conn_destroy frees 
> the session context while rwm_op_search is using it. Seems like the rwm 
> overlay is not doing reference counting properly.
> 

With the current master (fe49824f83bb0f2dd2f543c26f186b516c1d75bd), this is what
the trace looks like:

(gdb) t apply all bt

Thread 5 (Thread 0x7f53f9009700 (LWP 28210)):
#0  0x000000309360dded in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003093609ba1 in _L_lock_790 () from /lib64/libpthread.so.0
#2  0x0000003093609aa7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f5401b530a7 in ldap_pvt_thread_mutex_lock (mutex=0x7f53f980c768) at
thr_posix.c:296
#4  0x0000000000445287 in connection_operation (ctx=0x7f53f9008bb0,
arg_v=0x7f53e0001230) at connection.c:1147
#5  0x00007f5401b51993 in ldap_int_thread_pool_wrapper (xpool=0x937380) at
tpool.c:945
#6  0x0000003093607c53 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003092ef5e1d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f53f3fff700 (LWP 28215)):
#0  0x000000309360b575 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00007f5401b53054 in ldap_pvt_thread_cond_wait (cond=0x9373b8,
mutex=0x937390) at thr_posix.c:277
#2  0x00007f5401b518be in ldap_int_thread_pool_wrapper (xpool=0x937380) at
tpool.c:927
#3  0x0000003093607c53 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003092ef5e1d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f54016dd740 (LWP 28178)):
#0  0x0000003093608d97 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f5401b52f95 in ldap_pvt_thread_join (thread=139998644971264,
thread_return=0x0) at thr_posix.c:197
#2  0x0000000000441bb4 in slapd_daemon () at daemon.c:2907
#3  0x000000000041b28a in main (argc=10, argv=0x7fff84c32388) at main.c:1013

Thread 2 (Thread 0x7f53f980a700 (LWP 28197)):
#0  0x0000003092ef63f3 in epoll_wait () from /lib64/libc.so.6
#1  0x00000000004406b4 in slapd_daemon_task (ptr=0x9c02d0) at daemon.c:2514
#2  0x0000003093607c53 in start_thread () from /lib64/libpthread.so.0
#3  0x0000003092ef5e1d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f53f8808700 (LWP 28214)):
#0  0x00000000004c121a in slap_sl_free (ptr=0x7f53e8000b40, ctx=0x7f53e80009d0)
at sl_malloc.c:513
#1  0x0000000000449520 in do_search (op=0x7f53e0103c60, rs=0x7f53f8807ad0) at
search.c:257
#2  0x00000000004451d4 in connection_operation (ctx=0x7f53f8807bb0,
arg_v=0x7f53e0103c60) at connection.c:1134
#3  0x00007f5401b51993 in ldap_int_thread_pool_wrapper (xpool=0x937380) at
tpool.c:945
#4  0x0000003093607c53 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003092ef5e1d in clone () from /lib64/libc.so.6


(gdb) bt full
#0  0x00000000004c121a in slap_sl_free (ptr=0x7f53e8000b40, ctx=0x7f53e80009d0)
at sl_malloc.c:513
        sh = 0x7f53e80009d0
        size = 7286935691776455520
        p = 0x7f53e8000b38
        nextp = 0x6520e4bb49737e98
        tmpp = 0x7f53f8807ad0
#1  0x0000000000449520 in do_search (op=0x7f53e0103c60, rs=0x7f53f8807ad0) at
search.c:257
        base = {bv_len = 6, bv_val = 0x7f53e00009c7 "o=test"}
        siz = 0
        off = 0
        i = 0
#2  0x00000000004451d4 in connection_operation (ctx=0x7f53f8807bb0,
arg_v=0x7f53e0103c60) at connection.c:1134
        rc = 80
        cancel = 0
        op = 0x7f53e0103c60
        rs = {sr_type = REP_RESULT, sr_tag = 101, sr_msgid = 2, sr_err = 80,
sr_matched = 0x0, sr_text = 0x7f53f98d4fe6 "searchDN massage error", sr_ref =
0x0, sr_ctrls = 0x0, sr_un = {sru_search = {r_entry = 0x0, r_attr_flags = 0,
              r_operational_attrs = 0x0, r_attrs = 0x0, r_nentries = 0, r_v2ref
= 0x0}, sru_sasl = {r_sasldata = 0x0}, sru_extended = {r_rspoid = 0x0, r_rspdata
= 0x0}}, sr_flags = 0}
        tag = 99
        opidx = SLAP_OP_SEARCH
        conn = 0x7f53f980c750
        memctx = 0x7f53e80009d0
        memctx_null = 0x0
        memsiz = 1048576
        __PRETTY_FUNCTION__ = "connection_operation"
#3  0x00007f5401b51993 in ldap_int_thread_pool_wrapper (xpool=0x937380) at
tpool.c:945
        pq = 0x937380
        pool = 0x937280
        task = 0x7f53ec113d60
        work_list = 0x9373f0
        ctx = {ltu_pq = 0x937380, ltu_id = 139998628185856, ltu_key = {{ltk_key
= 0x444c86 <conn_counter_init>, ltk_data = 0x7f53e80008c0, ltk_free = 0x444a7d
<conn_counter_destroy>}, {ltk_key = 0x4c0182 <slap_sl_mem_init>,
              ltk_data = 0x7f53e80009d0, ltk_free = 0x4bffa7
<slap_sl_mem_destroy>}, {ltk_key = 0x461f28 <slap_op_free>, ltk_data = 0x0,
ltk_free = 0x461e7b <slap_op_q_destroy>}, {ltk_key = 0x0, ltk_data = 0x0,
              ltk_free = 0x0} <repeats 29 times>}}
        kctx = 0x0
        i = 32
        keyslot = 817
        hash = 1596318513
        pool_lock = 0
        freeme = 0
        __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper"


I used the reproducer provided by Michael. The core was generated by the server
running with slapd1.conf.
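
To illustrate the reference counting mentioned in the quoted diagnosis, here is a
minimal sketch in C. It is not the rwm overlay's code: session_ctx,
session_new, session_acquire, session_release and sessions_freed are
hypothetical names made up for this example. The idea is that the connection
owns one reference, each in-flight operation takes another, and the memory is
freed only by whoever drops the last one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-connection session context. */
typedef struct session_ctx {
    pthread_mutex_t mutex;   /* protects refcnt */
    int             refcnt;  /* the connection plus every in-flight operation */
} session_ctx;

/* Demo-only counter so the behaviour can be observed from outside. */
int sessions_freed = 0;

/* Create a session; the single initial reference belongs to the connection. */
session_ctx *session_new(void)
{
    session_ctx *s = calloc(1, sizeof(*s));
    if (s == NULL)
        return NULL;
    pthread_mutex_init(&s->mutex, NULL);
    s->refcnt = 1;
    return s;
}

/*
 * Take an extra reference before an operation starts using the session.
 * The caller must already hold a valid reference (for example, it runs
 * while the connection still owns the session), otherwise the acquire
 * itself would race with the final release.
 */
void session_acquire(session_ctx *s)
{
    pthread_mutex_lock(&s->mutex);
    assert(s->refcnt > 0);
    s->refcnt++;
    pthread_mutex_unlock(&s->mutex);
}

/*
 * Drop one reference. Returns 1 if this call freed the session,
 * 0 if other holders remain.
 */
int session_release(session_ctx *s)
{
    int remaining;

    pthread_mutex_lock(&s->mutex);
    assert(s->refcnt > 0);
    remaining = --s->refcnt;
    pthread_mutex_unlock(&s->mutex);

    if (remaining > 0)
        return 0;

    /* Last holder: nobody else can reach the session any more. */
    pthread_mutex_destroy(&s->mutex);
    free(s);
    sessions_freed++;
    return 1;
}
```

In this sketch a search handler would call session_acquire() on entry and
session_release() on exit, and the connection-destroy path would call
session_release() instead of freeing the context directly. An unbind arriving
while the search is still running would then only drop the connection's
reference, and the search would free the context when it finishes.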

-- 
Jan Synacek
Software Engineer, Red Hat