Issue 8541 - test062 periodically core dumps
Summary: test062 periodically core dumps
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: 2.5.1
Assignee: Howard Chu
Reported: 2016-12-08 23:57 UTC by Quanah Gibson-Mount
Modified: 2021-02-08 17:52 UTC
Description Quanah Gibson-Mount 2016-12-08 23:57:42 UTC
Full_Name: Quanah Gibson-Mount
Version: HEAD
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (47.208.148.26)


When running test062, slapd sometimes core dumps in the syncprov abandon
handler. I will email the backtrace directly, as the ITS software mangles
backtraces.
Comment 1 Quanah Gibson-Mount 2016-12-08 23:58:28 UTC
moved from Incoming to Software Bugs
Comment 2 Quanah Gibson-Mount 2016-12-09 00:01:26 UTC
build@c7build:~$ gdb ~/git/symas-packages/thirdparty/openldap/build/RHEL7_64/symas-openldap/rpm/BUILD/openldap-2.4.45/servers/slapd/.libs/lt-slapd /tmp/core-lt-slapd-11-503-503-4174-1481240987
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
This GDB was configured as "x86_64-redhat-linux-gnu".
Reading symbols from /home/build/git/symas-packages/thirdparty/openldap/build/RHEL7_64/symas-openldap/rpm/BUILD/openldap-2.4.45/servers/slapd/.libs/lt-slapd...done.
[New LWP 4264]
[New LWP 4189]
[New LWP 4174]
[New LWP 4204]
[New LWP 4265]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/build/git/symas-packages/thirdparty/openldap/build/RHEL7_64/symas-openlda'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f799db90b7b in syncprov_op_abandon (op=0x7f799db898c0, rs=0x7f799db89700) at syncprov.c:1154
1154                    if ( so->s_op->o_connid == op->o_connid &&
(gdb) thr apply all bt full
Thread 5 (Thread 0x7f799d389700 (LWP 4265)):
#0  0x00007f79a35dacb1 in clone () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007f79a47f9d00 in ?? () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00007f799d389700 in ?? ()
No symbol table info available.
#3  0x0000000000000000 in ?? ()
No symbol table info available.

Thread 4 (Thread 0x7f799e59b700 (LWP 4204)):
#0  0x00007f79a47fff4d in __lll_lock_wait () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f79a47fbd02 in _L_lock_791 () from /lib64/libpthread.so.0
No symbol table info available.
#2  0x00007f79a47fbc08 in pthread_mutex_lock () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f79a51e5873 in ldap_pvt_thread_mutex_lock (mutex=0x20ecbb0) at thr_posix.c:300
No locals.
#4  0x000000000043dbd8 in connection_operation (ctx=0x7f799e59abb0, arg_v=0x7f7990002670) at connection.c:1150
        rc = 0
        cancel = 32633
        op = 0x7f7990002670
        rs = {sr_type = REP_RESULT, sr_tag = 107, sr_msgid = 2, sr_err = 0, sr_matched = 0x0, sr_text = 0x0, sr_ref = 0x0, sr_ctrls = 0x0, sr_un = {sru_search = {r_entry = 0x0,
              r_attr_flags = 0, r_operational_attrs = 0x0, r_attrs = 0x0, r_nentries = 0, r_v2ref = 0x0}, sru_sasl = {r_sasldata = 0x0}, sru_extended = {r_rspoid = 0x0, r_rspdata = 0x0}},
          sr_flags = 0}
        tag = 74
        opidx = SLAP_OP_DELETE
        conn = 0x20ecb98
        memctx = 0x7f7990002ba0
        memctx_null = 0x0
        memsiz = 1048576
        __PRETTY_FUNCTION__ = "connection_operation"
#5  0x000000000043e0d0 in connection_read_thread (ctx=0x7f799e59abb0, argv=0x9) at connection.c:1283
        rc = 0
        cri = {op = 0x7f7990002670, func = 0x0, arg = 0x0, ctx = 0x7f799e59abb0, nullop = 0}
        s = 9
#6  0x00007f79a51e416a in ldap_int_thread_pool_wrapper (xpool=0x2083dc0) at tpool.c:956
        pq = 0x2083dc0
        pool = 0x2083cb0
        task = 0x7f79980008c0
        work_list = 0x2083e30
        ctx = {ltu_pq = 0x2083dc0, ltu_id = 140160324450048, ltu_key = {{ltk_key = 0x43d647 <conn_counter_init>, ltk_data = 0x7f7990002a90, ltk_free = 0x43d499 <conn_counter_destroy>}, {
              ltk_key = 0x4b206b <slap_sl_mem_init>, ltk_data = 0x7f7990002ba0, ltk_free = 0x4b1e90 <slap_sl_mem_destroy>}, {ltk_key = 0x4587e9 <slap_op_free>, ltk_data = 0x0,
              ltk_free = 0x45873c <slap_op_q_destroy>}, {ltk_key = 0x0, ltk_data = 0x7f79901088a0, ltk_free = 0x0}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0} <repeats 28 times>}}
        kctx = 0x0
        i = 32
        keyslot = 486
        hash = 3071177190
        pool_lock = 0
        freeme = 0
        __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper"
#7  0x00007f79a47f9dc5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8  0x00007f79a35daced in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x7f79a564c740 (LWP 4174)):
#0  0x00007f79a47faef7 in pthread_join () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00007f79a51e5761 in ldap_pvt_thread_join (thread=140160332842752, thread_return=0x0) at thr_posix.c:201
No locals.
#2  0x000000000043aa65 in slapd_daemon () at daemon.c:2910
        i = 0
        rc = 0
#3  0x0000000000415be4 in main (argc=8, argv=0x7fff32256928) at main.c:1018
        i = -1
        no_detach = 1
        rc = -12
        urls = 0x204e0b0 "ldap://localhost:9011/"
        username = 0x0
        groupname = 0x0
        sandbox = 0x0
        syslogUser = 160
        pid = 32633
        waitfds = {841312160, 32767}
        g_argc = 8
        g_argv = 0x7fff32256928
        configfile = 0x0
        configdir = 0x204e090 "./slapd.d"
        serverName = 0x7fff322584cb "lt-slapd"
        serverMode = 1
        scp = 0x0
        scp_entry = 0x0
        debug_unknowns = 0x0
        syslog_unknowns = 0x0
        serverNamePrefix = 0x4fd618 ""
        l = 140734034700584
        slapd_pid_file_unlink = 0
        slapd_args_file_unlink = 0
        firstopt = 0
        __PRETTY_FUNCTION__ = "main"

Thread 2 (Thread 0x7f799ed9c700 (LWP 4189)):
#0  0x00007f79a35db2c3 in epoll_wait () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000000004398a6 in slapd_daemon_task (ptr=0x22acd80) at daemon.c:2517
        ns = 1
        at = 0
        nfds = 4
        revents = 0x2057db0
        tvp = 0x0
        cat = {tv_sec = 0, tv_usec = 0}
        i = 1
        nwriters = 0
        now = 1481240987
        tv = {tv_sec = 0, tv_usec = 0}
        tdelta = 1
        rtask = 0x0
        l = 1
        last_idle_check = 1481240984
        ebadf = 0
        tid = 0
#2  0x00007f79a47f9dc5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3  0x00007f79a35daced in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 1 (Thread 0x7f799db8a700 (LWP 4264)):
#0  0x00007f799db90b7b in syncprov_op_abandon (op=0x7f799db898c0, rs=0x7f799db89700) at syncprov.c:1154
        on = 0x7f7990112790
        si = 0x7f7990111dd0
        so = 0x20
        sop = 0x7f7990002270
#1  0x000000000046765b in fe_op_abandon (op=0x7f799db898c0, rs=0x7f799db89700) at abandon.c:136
No locals.
#2  0x000000000043cb11 in connection_abandon (c=0x20ecb98) at connection.c:732
        rs = {sr_type = REP_RESULT, sr_tag = 0, sr_msgid = 0, sr_err = 0, sr_matched = 0x0, sr_text = 0x0, sr_ref = 0x0, sr_ctrls = 0x0, sr_un = {sru_search = {r_entry = 0x0,
              r_attr_flags = 0, r_operational_attrs = 0x0, r_attrs = 0x0, r_nentries = 0, r_v2ref = 0x0}, sru_sasl = {r_sasldata = 0x0}, sru_extended = {r_rspoid = 0x0, r_rspdata = 0x0}},
          sr_flags = 0}
        o = 0x7f7990002670
        next = 0x7f7994100930
        op = {o_hdr = 0x7f799db89770, o_tag = 80, o_time = 0, o_tincr = 0, o_bd = 0x20a7870, o_req_dn = {bv_len = 0, bv_val = 0x0}, o_req_ndn = {bv_len = 0, bv_val = 0x0}, o_request = {
            oq_add = {rs_modlist = 0x2, rs_e = 0x0}, oq_bind = {rb_method = 2, rb_cred = {bv_len = 0, bv_val = 0x0}, rb_edn = {bv_len = 0, bv_val = 0x0}, rb_ssf = 0, rb_mech = {
                bv_len = 0, bv_val = 0x0}}, oq_compare = {rs_ava = 0x2}, oq_modify = {rs_mods = {rs_modlist = 0x2, rs_no_opattrs = 0 '\000'}, rs_increment = 0}, oq_modrdn = {rs_mods = {
                rs_modlist = 0x2, rs_no_opattrs = 0 '\000'}, rs_deleteoldrdn = 0, rs_newrdn = {bv_len = 0, bv_val = 0x0}, rs_nnewrdn = {bv_len = 0, bv_val = 0x0}, rs_newSup = 0x0,
              rs_nnewSup = 0x0}, oq_search = {rs_scope = 2, rs_deref = 0, rs_slimit = 0, rs_tlimit = 0, rs_limit = 0x0, rs_attrsonly = 0, rs_attrs = 0x0, rs_filter = 0x0, rs_filterstr = {
                bv_len = 0, bv_val = 0x0}}, oq_abandon = {rs_msgid = 2}, oq_cancel = {rs_msgid = 2}, oq_extended = {rs_reqoid = {bv_len = 2, bv_val = 0x0}, rs_flags = 0,
              rs_reqdata = 0x0}, oq_pwdexop = {rs_extended = {rs_reqoid = {bv_len = 2, bv_val = 0x0}, rs_flags = 0, rs_reqdata = 0x0}, rs_old = {bv_len = 0, bv_val = 0x0}, rs_new = {
                bv_len = 0, bv_val = 0x0}, rs_mods = 0x0, rs_modtail = 0x0}}, o_abandon = 0, o_cancel = 0, o_groups = 0x0, o_do_not_cache = 0 '\000', o_is_auth_check = 0 '\000',
          o_dont_replicate = 0 '\000', o_acl_priv = ACL_NONE, o_nocaching = 0 '\000', o_delete_glue_parent = 0 '\000', o_no_schema_check = 0 '\000', o_no_subordinate_glue = 0 '\000',
          o_ctrlflag = '\000' <repeats 31 times>, o_controls = 0x0, o_authz = {sai_method = 0, sai_mech = {bv_len = 0, bv_val = 0x0}, sai_dn = {bv_len = 0, bv_val = 0x0}, sai_ndn = {
              bv_len = 0, bv_val = 0x0}, sai_ssf = 0, sai_transport_ssf = 0, sai_tls_ssf = 0, sai_sasl_ssf = 0}, o_ber = 0x0, o_res_ber = 0x0, o_callback = 0x0, o_ctrls = 0x0, o_csn = {
            bv_len = 0, bv_val = 0x0}, o_private = 0x0, o_extra = {slh_first = 0x0}, o_next = {stqe_next = 0x0}}
        ohdr = {oh_opid = 0, oh_connid = 1004, oh_conn = 0x20ecb98, oh_msgid = 0, oh_protocol = 0, oh_tid = 0, oh_threadctx = 0x0, oh_tmpmemctx = 0x0, oh_tmpmfuncs = 0x0,
          oh_counters = 0x0, oh_log_prefix = '\000' <repeats 255 times>}
#3  0x000000000043cebd in connection_closing (c=0x20ecb98, why=0x505940 <conn_lost_str> "connection lost") at connection.c:802
        __PRETTY_FUNCTION__ = "connection_closing"
#4  0x000000000043ea0f in connection_read (s=9, cri=0x7f799db89b60) at connection.c:1472
        rc = -2
        c = 0x20ecb98
        __PRETTY_FUNCTION__ = "connection_read"
#5  0x000000000043e031 in connection_read_thread (ctx=0x7f799db89bb0, argv=0x9) at connection.c:1276
        rc = 0
        cri = {op = 0x7f7994100930, func = 0x0, arg = 0x0, ctx = 0x7f799db89bb0, nullop = 0}
        s = 9
#6  0x00007f79a51e416a in ldap_int_thread_pool_wrapper (xpool=0x2083dc0) at tpool.c:956
        pq = 0x2083dc0
        pool = 0x2083cb0
        task = 0x7f7998000b40
        work_list = 0x2083e30
        ctx = {ltu_pq = 0x2083dc0, ltu_id = 140160313894656, ltu_key = {{ltk_key = 0x4b206b <slap_sl_mem_init>, ltk_data = 0x7f79940008c0, ltk_free = 0x4b1e90 <slap_sl_mem_destroy>}, {
              ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0} <repeats 31 times>}}
        kctx = 0x0
        i = 32
        keyslot = 84
        hash = 4081317972
        pool_lock = 0
        freeme = 0
        __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper"
#7  0x00007f79a47f9dc5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#8  0x00007f79a35daced in clone () from /lib64/libc.so.6
No symbol table info available.
Comment 3 Quanah Gibson-Mount 2020-03-22 23:43:48 UTC
Note: need to see if I can reproduce this
Comment 4 Quanah Gibson-Mount 2021-01-11 17:44:20 UTC
likely fixed by recent code changes, will confirm.
Comment 5 Quanah Gibson-Mount 2021-01-15 19:54:44 UTC
Reproduced in 36 runs:

Cleaning up test run directory from this run.
Running 36 of 10000 iterations
running defines.sh
Starting slapd on TCP/IP port 9011... /home/build/git/symas-packages/thirdparty/openldap/build/UBUNTU18_64/symas-openldap/tests
Using ldapsearch to check that slapd is running...
Inserting syncprov overlay ...
Starting a refreshAndPersist search in background
Removing syncprov overlay again ...
Waiting 2 seconds for RefreshAndPersist search to end ...
Checking return code of backgrounded RefreshAndPersist search ...
Exit code correct.
Running a refreshOnly search, should fail...
ldapsearch should have failed with Critical extension is unavailable (12)!
./scripts/test062-config-delete: 164: kill: No such process

Failed after 36 of 10000 iterations
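
For anyone trying to reproduce an intermittent failure like this one, the repeat-until-failure loop can be sketched as below. This is a hypothetical Python helper, not part of the OpenLDAP test suite: the name run_until_failure and the stand-in command are assumptions for illustration, and the real test invocation depends on your build tree.

```python
import subprocess
import sys


def run_until_failure(cmd, max_iterations):
    """Run cmd repeatedly.

    Returns the 1-based iteration on which cmd first exited non-zero,
    or None if every iteration succeeded.
    """
    for i in range(1, max_iterations + 1):
        print(f"Running {i} of {max_iterations} iterations")
        result = subprocess.run(cmd)
        if result.returncode != 0:
            print(f"Failed after {i} of {max_iterations} iterations")
            return i
    return None


if __name__ == "__main__":
    # Stand-in command (an assumption): a child Python process that always
    # exits with status 1, so the loop stops on the first iteration.
    # Replace it with the actual test command for your checkout.
    always_fails = [sys.executable, "-c", "import sys; sys.exit(1)"]
    run_until_failure(always_fails, 3)
```

Stopping at the first non-zero exit keeps the test run directory and any core file from the failing iteration available for inspection.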
Comment 6 Howard Chu 2021-01-31 15:23:57 UTC
fix in master
Comment 7 Quanah Gibson-Mount 2021-02-01 17:13:55 UTC
commit 0da38889e1c224e310d3039db7791fcc69fee6e5
Author: Howard Chu <hyc@openldap.org>
Date:   Sun Jan 31 15:21:55 2021 +0000

    ITS#8541 fix data race in syncprov removal
Comment 8 Quanah Gibson-Mount 2021-02-02 23:22:46 UTC
test051 is now crashing
Comment 9 Quanah Gibson-Mount 2021-02-02 23:23:26 UTC
err, test050. I'll see if I can reproduce

Using ldapsearch to read config from server 3...
ldapsearch failed at server 3 (255)!
./scripts/test050-syncrepl-multiprovider: 355: kill: No such process
>>>>> test050-syncrepl-multiprovider failed for mdb after 21 seconds
(exit 255)
make[2]: *** [Makefile:298: mdb-mod] Error 255
make[2]: Leaving directory '/builds/openldap/openldap/tests'
make[1]: *** [Makefile:284: test] Error 2
make[1]: Leaving directory '/builds/openldap/openldap/tests'
make: *** [Makefile:296: test] Error 2