Issue 8173 - back-ldap slapd crash segfault
Summary: back-ldap slapd crash segfault
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.40
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-06-18 16:51 UTC by adrian.raemy@vtg.admin.ch
Modified: 2015-11-30 18:20 UTC

Description adrian.raemy@vtg.admin.ch 2015-06-18 16:51:31 UTC
Full_Name: Adrian Raemy
Version: 2.4.40
OS: SLES 11 SP3
URL: 
Submission from: (NULL) (193.5.216.100)


Hi,

slapd with the OpenLDAP backend "ldap" (back-ldap) crashes under heavy load.
Tested with 2.4.26 and also 2.4.40.
We get a segfault:
Jun 18 00:00:36 serverxx1 kernel: [9079587.309374] slapd[6193]: segfault at 40
ip 00007f184cf6b2c6 sp 00007f1838ff85c0 error 4 in slapd[7f184cdda000+26c000]
Jun 18 09:40:12 serverxx1 kernel: [9114294.985888] slapd[10309]: segfault at 40
ip 00007f3b31f492c6 sp 00007f3b1f7f55c0 error 4 in slapd[7f3b31db8000+26c000]
Jun 18 12:38:29 serverxx1 kernel: [9124971.271300] slapd[15868]: segfault at 40
ip 00007fc1555512c6 sp 00007fc141ffa5c0 error 4 in slapd[7fc1553c0000+26c000]

OpenLdap Proxy Server has:
Type = VM Guest
mem = 4GB
CPU = 2
Cores = 2

We don't see any other errors or warnings beforehand; CPU load, memory, etc. are
all fine. The open-files limit is also set to unlimited.

Scenario: Linux/Unix clients and applications connect via StartTLS and SSL
(ldaps) to the OpenLDAP proxy, which uses the "ldap" backend to proxy to the
OpenLDAP master.
When the proxy comes under heavy load from client and application activity,
slapd crashes unexpectedly with the segfault above.
This happens only with back-ldap. The OpenLDAP master doesn't crash and is
completely stable, so the problem must be in the ldap backend.
We perform search, modify, and add/remove operations, so the entire
functionality is used.

With only a lot of searches sent through the proxy to the master, we couldn't
reproduce the segfault. I guess it really needs a heavy mixed load
(read/write/modify operations all together) to reproduce this. We don't have
the means to generate that in our very restricted environment.

We can't provide a core dump of slapd due to confidentiality restrictions.

We would test with the backend "meta", but that backend doesn't support
extended operations such as the "LDAP Password Modify Extended Operation". So
there are only two possibilities: either the ldap backend is fixed, or support
for extended operations is added to the meta backend.

Please, we need help / a fix asap; authentication as a whole is compromised and
slapd crashes often on the OpenLDAP proxy.

Thanks for your help
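
The three kernel records above each give an instruction pointer ("ip") and the base of the mapping it falls in ("in slapd[base+size]"). Subtracting the two tells whether the crashes hit the same instruction, even though the absolute addresses differ between runs. The following is a minimal sketch of that arithmetic in C; the names segv_record, parse_segv and fault_offset are made up for illustration, and it assumes each record has been joined onto one line.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* The numeric fields of one kernel "segfault" record. */
struct segv_record {
    uint64_t fault_addr; /* address whose access faulted ("segfault at") */
    uint64_t ip;         /* instruction pointer when the fault happened */
    uint64_t sp;         /* stack pointer when the fault happened */
    uint64_t map_base;   /* start of the mapping named after "in" */
    uint64_t map_size;   /* length of that mapping */
};

/* Parse one record (joined onto a single line). Returns 0 on success, -1 otherwise. */
int parse_segv(const char *line, struct segv_record *out)
{
    const char *p = strstr(line, "segfault at ");
    if (p == NULL)
        return -1;

    /* The error code and the mapping name are matched but not stored. */
    int n = sscanf(p,
                   "segfault at %" SCNx64 " ip %" SCNx64 " sp %" SCNx64
                   " error %*u in %*[^[][%" SCNx64 "+%" SCNx64 "]",
                   &out->fault_addr, &out->ip, &out->sp,
                   &out->map_base, &out->map_size);
    return n == 5 ? 0 : -1;
}

/* Offset of the faulting instruction from the start of the mapping. */
uint64_t fault_offset(const struct segv_record *r)
{
    return r->ip - r->map_base;
}
```

For all three records in this report the offset works out to 0x1912c6 and the faulting address is 0x40, which suggests one and the same fault each time rather than random memory corruption. With a binary that has debug information, such an offset can usually be mapped to a source line, provided the mapping starts at the beginning of the file.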
Comment 1 Howard Chu 2015-06-18 19:40:00 UTC
adrian.raemy@vtg.admin.ch wrote:
> Full_Name: Adrian Raemy
> Version: 2.4.40
> OS: SLES 11 SP3
> URL:
> Submission from: (NULL) (193.5.216.100)
>
>
> Hi,
>
> slapd with the OpenLDAP backend "ldap" (back-ldap) crashes under heavy load.
> Tested with 2.4.26 and also 2.4.40.
> We get a segfault:
> Jun 18 00:00:36 serverxx1 kernel: [9079587.309374] slapd[6193]: segfault at 40
> ip 00007f184cf6b2c6 sp 00007f1838ff85c0 error 4 in slapd[7f184cdda000+26c000]
> Jun 18 09:40:12 serverxx1 kernel: [9114294.985888] slapd[10309]: segfault at 40
> ip 00007f3b31f492c6 sp 00007f3b1f7f55c0 error 4 in slapd[7f3b31db8000+26c000]
> Jun 18 12:38:29 serverxx1 kernel: [9124971.271300] slapd[15868]: segfault at 40
> ip 00007fc1555512c6 sp 00007fc141ffa5c0 error 4 in slapd[7fc1553c0000+26c000]

> We can't provide a core dump of slapd due to confidentiality restrictions.

Then we cannot help you.

I suggest you test with the 2.4.41 release candidate and see if the same crash 
still occurs.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 adrian.raemy@vtg.admin.ch 2015-06-24 13:06:32 UTC
Dear Howard,

I captured a core dump when slapd (that is, back-ldap) crashed.
Maybe the gdb output can help you find out where the problem is.
Please let me know what else you need.

Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fc0627492c6 in ldap_back_bind ()
(gdb) where
#0  0x00007fc0627492c6 in ldap_back_bind ()
#1  0x00007fc062639fcb in fe_op_bind ()
#2  0x00007fc0626a86fd in overlay_op_walk ()
#3  0x00007fc0626a8956 in over_op_func ()
#4  0x00007fc0626a89de in over_op_bind ()
#5  0x00007fc062639697 in do_bind ()
#6  0x00007fc06260ff19 in connection_operation ()
#7  0x00007fc0626104e1 in connection_read_thread ()
#8  0x00007fc062145c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
#9  0x00007fc06155e806 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fc05fda902d in clone () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()

Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fc0627492c6 in ldap_back_bind ()
(gdb) where
#0  0x00007fc0627492c6 in ldap_back_bind ()
#1  0x00007fc062639fcb in fe_op_bind ()
#2  0x00007fc0626a86fd in overlay_op_walk ()
#3  0x00007fc0626a8956 in ?? ()
#4  0x00007fc0626a89de in ?? ()
#5  0x00007fc062639697 in do_bind ()
#6  0x00007fc06260ff19 in ?? ()
#7  0x00007fc0626104e1 in ?? ()
#8  0x00007fc062145c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
#9  0x00007fc06155e806 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fc05fda902d in gnu_dev_makedev () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()

Best regards
Adrian
Comment 3 Howard Chu 2015-06-24 23:19:03 UTC
Adrian.Raemy@vtg.admin.ch wrote:
>
> Dear Howard,
>
> I did a core dump when the crash occurred at slapd respectively when the back-ldap crashed.
> Maybe the output of the gdb can help you find out where the problem is.
> Please let me know what you need more.

This trace seems to lack debug symbols. Without source line numbers etc. 
there's nothing we can determine from this trace. Please read the FAQ.

http://www.openldap.org/faq/index.cgi?file=59

The trace shows that you probably have some overlays configured. What 
overlays? You need to provide your slapd config files if you want any help here.

Volunteers aren't going to go out of their way to help you if you aren't going 
to provide the needed info.

> Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007fc0627492c6 in ldap_back_bind ()
> (gdb) where
> #0  0x00007fc0627492c6 in ldap_back_bind ()
> #1  0x00007fc062639fcb in fe_op_bind ()
> #2  0x00007fc0626a86fd in overlay_op_walk ()
> #3  0x00007fc0626a8956 in over_op_func ()
> #4  0x00007fc0626a89de in over_op_bind ()
> #5  0x00007fc062639697 in do_bind ()
> #6  0x00007fc06260ff19 in connection_operation ()
> #7  0x00007fc0626104e1 in connection_read_thread ()
> #8  0x00007fc062145c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
> #9  0x00007fc06155e806 in start_thread () from /lib64/libpthread.so.0
> #10 0x00007fc05fda902d in clone () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00007fc0627492c6 in ldap_back_bind ()
> (gdb) where
> #0  0x00007fc0627492c6 in ldap_back_bind ()
> #1  0x00007fc062639fcb in fe_op_bind ()
> #2  0x00007fc0626a86fd in overlay_op_walk ()
> #3  0x00007fc0626a8956 in ?? ()
> #4  0x00007fc0626a89de in ?? ()
> #5  0x00007fc062639697 in do_bind ()
> #6  0x00007fc06260ff19 in ?? ()
> #7  0x00007fc0626104e1 in ?? ()
> #8  0x00007fc062145c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
> #9  0x00007fc06155e806 in start_thread () from /lib64/libpthread.so.0
> #10 0x00007fc05fda902d in gnu_dev_makedev () from /lib64/libc.so.6
> #11 0x0000000000000000 in ?? ()
>
> Best regards
> Adrian

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 4 adrian.raemy@vtg.admin.ch 2015-07-06 09:26:19 UTC
Dear Howard,

Below you will find the slapd.conf of the OpenLDAP proxy and the slapd.conf of the OpenLDAP master, where you can see which overlays we are using.
We will provide a core dump with debug symbols asap; we first need to install the debug packages on one host.

OpenLDAP Proxy slapd.conf:

include          /etc/openldap/schema/core.schema
include          /etc/openldap/schema/cosine.schema
include          /etc/openldap/schema/inetorgperson.schema
include          /etc/openldap/schema/openldap.schema
include          /etc/openldap/schema/rfc2307bis.schema
include          /etc/openldap/schema/ppolicy.schema
include          /etc/openldap/schema/sudo.schema
include          /etc/openldap/schema/guacConfigGroup.schema

pidfile          /var/run/slapd/slapd.pid
argsfile         /var/run/slapd/slapd.args

modulepath    /usr/lib/openldap
moduleload    back_ldap.la
moduleload      auditlog
overlay         auditlog
auditlog        /var/lib/ldap/auditlog/ldap.auditlog

TLSCertificateFile     /etc/openldap/ssl.crt/server.crt
TLSCertificateKeyFile  /etc/openldap/ssl.key/server.key
TLSCACertificatePath   /etc/openldap/ssl.crt/
TLSCipherSuite         HIGH:MEDIUM:-SSLv2
TLSVerifyClient        allow

security ssf=112 update_ssf=112 tls=56

loglevel        stats none

sizelimit       unlimited

database         ldap

protocol-version    3
tls                    start
suffix              "dc=xxxx.xx"
uri                   "ldap://xxxx.xx.xxx.xx.xx:389/"
idassert-authzFrom  "*"

idle-timeout        1500

idletimeout         2700


And here is the OpenLDAP master slapd.conf:

include          /etc/openldap/schema/core.schema
include          /etc/openldap/schema/cosine.schema
include          /etc/openldap/schema/inetorgperson.schema
include          /etc/openldap/schema/openldap.schema
include          /etc/openldap/schema/rfc2307bis.schema
include          /etc/openldap/schema/ppolicy.schema
include          /etc/openldap/schema/sudo.schema

pidfile          /var/run/slapd/slapd.pid
argsfile         /var/run/slapd/slapd.args

modulepath       /usr/lib/openldap/modules

TLSCertificateFile     /etc/openldap/ssl.crt/server.crt
TLSCertificateKeyFile  /etc/openldap/ssl.key/server.key
TLSCACertificatePath   /etc/openldap/ssl.crt/
TLSCipherSuite         HIGH:MEDIUM:-SSLv2
TLSVerifyClient        allow

security ssf=112 update_ssf=112 tls=56

password-hash {SHA}

loglevel        stats sync none

include         /etc/openldap/slapd.access

sizelimit       unlimited

database         hdb

readonly         off
suffix           "dc=xxx.xx"
rootdn           "cn=Manager,dc=xxx.xx"
rootpw           {SSHA}xxxxxxxxxx
directory        /var/lib/ldap/
checkpoint       1024 5
cachesize        100000
idlcachesize     100000

index objectClass           eq
index cn                    pres,sub,eq
index sn                    pres,sub,eq
index uid                   eq
index uidNumber             pres,eq
index gidNumber             pres,eq
index uniqueMember          pres,eq
index memberOf              pres,eq
index sudoUser              pres,eq,sub
index entryCSN,entryUUID    eq
index mail                  pres,eq,sub
index userClass             pres,eq
index ipHostNumber          eq

overlay unique
unique_uri ldap:///?uid?sub

overlay             ppolicy
ppolicy_default     "cn=xxxx,ou=xxxxx,dc=xxxx,dc=xxxx.xx"
ppolicy_use_lockout

overlay             memberof
memberof-group-oc   groupOfUniqueNames
memberof-member-ad  uniqueMember
memberof-refint     true
memberof-dn         cn=MemberOfOverlay,dc=xxx.xx

overlay             auditlog
auditlog            /var/lib/ldap/auditlog/ldap.auditlog

database            monitor

Best regards
Adrian
Comment 5 adrian.raemy@vtg.admin.ch 2015-07-09 11:49:08 UTC
Dear Howard,

I have now installed the missing debug packages; below you will find the gdb output.
I checked 6 dumps from slapd crashes and always got the same output.
Now you can see where it crashes.

Let me know if you need more. I hope you can localize the problem and give us a fix.

Regards
Adrian

Reading symbols from /usr/lib/openldap/slapd...Reading symbols from /usr/lib/debug/usr/lib/slapd.debug...done.
done.
[New LWP 11777]
[New LWP 11778]
[New LWP 11823]
[New LWP 11821]
[New LWP 11970]
[New LWP 11826]
[New LWP 11824]
[New LWP 11775]
[New LWP 11971]
[New LWP 11776]
[New LWP 11785]
[New LWP 11820]
[New LWP 11825]
[New LWP 11822]
[New LWP 11827]
[New LWP 11828]
[New LWP 11972]
[New LWP 11788]

warning: Could not load shared library symbols for stics.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fcbb60572c6 in ldap_back_bind ()
(gdb) where
#0  0x00007fcbb60572c6 in ldap_back_bind ()
#1  0x00007fcbb5f47fcb in fe_op_bind ()
#2  0x00007fcbb5fb66fd in overlay_op_walk ()
#3  0x00007fcbb5fb6956 in over_op_func ()
#4  0x00007fcbb5fb69de in over_op_bind ()
#5  0x00007fcbb5f47697 in do_bind ()
#6  0x00007fcbb5f1df19 in connection_operation ()
#7  0x00007fcbb5f1e4e1 in connection_read_thread ()
#8  0x00007fcbb5a53c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
#9  0x00007fcbb4e6c806 in start_thread (arg=<optimized out>) at pthread_create.c:301
#10 0x00007fcbb36b702d in gnu_dev_makedev (major=3066092544, minor=<optimized out>) at ../sysdeps/unix/sysv/linux/makedev.c:37
#11 0x0000000000000000 in ?? ()
(gdb)


Comment 6 adrian.raemy@vtg.admin.ch 2015-07-10 07:12:26 UTC
Here is the full backtrace:

#0  0x00007f756ab792c6 in ldap_back_bind ()
(gdb) bt full
#0  0x00007f756ab792c6 in ldap_back_bind ()
No symbol table info available.
#1  0x00007f756aa69fcb in fe_op_bind ()
No symbol table info available.
#2  0x00007f756aad86fd in overlay_op_walk ()
No symbol table info available.
#3  0x00007f756aad8956 in over_op_func ()
No symbol table info available.
#4  0x00007f756aad89de in over_op_bind ()
No symbol table info available.
#5  0x00007f756aa69697 in do_bind ()
No symbol table info available.
#6  0x00007f756aa3ff19 in connection_operation ()
No symbol table info available.
#7  0x00007f756aa404e1 in connection_read_thread ()
No symbol table info available.
#8  0x00007f756a575c3c in ldap_int_thread_pool_wrapper () from /usr/lib64/libldap_r-2.4.so.2
No symbol table info available.
#9  0x00007f756998e806 in start_thread (arg=<optimized out>) at pthread_create.c:301
        __res = <optimized out>
        pd = 0x7f75527fc700
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140141872006912, 3828700323369096852, 140142144269328, 140141872005120, 0, 8388608, -3762508621752658284, -3762593002521979244},
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        robust = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007f75681d902d in gnu_dev_makedev (major=1803453360, minor=<optimized out>) at ../sysdeps/unix/sysv/linux/makedev.c:37
No locals.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)

Comment 7 adrian.raemy@vtg.admin.ch 2015-07-15 15:52:26 UTC
Dear Howard,

Below is the bt full with OpenLDAP 2.4.40 built with the "-g" flag.

Let me know if you need more. I hope you have a solution, because this is really a problem for us:
slapd with the ldap backend crashes 4-5 times per day.

Regards
Adrian

warning: Could not load shared library symbols for stics.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/lib/openldap/slapd -h  ldap://0.0.0.0:389  ldaps://0.0.0.0:636  -f /etc/op'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fcbb60572c6 in ldap_back_bind (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30) at bind.c:319
319     bind.c: No such file or directory.
(gdb) bt full
#0  0x00007fcbb60572c6 in ldap_back_bind (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30) at bind.c:319
        li = 0x7fcbb6443e40
        lc = 0x0
        ctrls = 0x0
        save_o_dn = {bv_len = 0, bv_val = 0x0}
        save_o_do_not_cache = 0
        rc = 52
        msgid = 2
        retrying = LDAP_BACK_DONTSEND
        __PRETTY_FUNCTION__ = "ldap_back_bind"
#1  0x00007fcbb5f47fcb in fe_op_bind (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30) at bind.c:383
        bd = 0x7fcbad881750
#2  0x00007fcbb5fb66fd in overlay_op_walk (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30, which=op_bind, oi=0x7fcbb6443430, on=0x0) at backover.c:671
        func = 0x7fcbb6343ad8 <slap_frontendInfo+88>
        rc = 32768
#3  0x00007fcbb5fb6956 in over_op_func (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30, which=op_bind) at backover.c:723
        oi = 0x7fcbb6443430
        on = 0x7fcbb6443610
        be = 0x7fcbb6343c40 <slap_frontendDB>
        db = {bd_info = 0x7fcbb6343a80 <slap_frontendInfo>, bd_self = 0x7fcbb6343c40 <slap_frontendDB>, be_ctrls = "\000", '\001' <repeats 18 times>, '\000' <repeats 13 times>,
          be_flags = 768, be_restrictops = 0, be_requires = 0, be_ssf_set = {sss_ssf = 112, sss_transport = 0, sss_tls = 56, sss_sasl = 0, sss_update_ssf = 112, sss_update_transport = 0,
            sss_update_tls = 0, sss_update_sasl = 0, sss_simple_bind = 0}, be_suffix = 0x7fcbb643bd50, be_nsuffix = 0x7fcbb643bda0, be_schemadn = {bv_len = 12,
            bv_val = 0x7fcbb6474530 "cn=Subschema"}, be_schemandn = {bv_len = 12, bv_val = 0x7fcbb6474670 "cn=subschema"}, be_rootdn = {bv_len = 0, bv_val = 0x0}, be_rootndn = {bv_len = 0,
            bv_val = 0x0}, be_rootpw = {bv_len = 0, bv_val = 0x0}, be_max_deref_depth = 0, be_def_limit = {lms_t_soft = 3600, lms_t_hard = 0, lms_s_soft = -1, lms_s_hard = 0,
           lms_s_unchecked = -1, lms_s_pr = 0, lms_s_pr_hide = 0, lms_s_pr_total = 0}, be_limits = 0x0, be_acl = 0x0, be_dfltaccess = ACL_READ, be_extra_anlist = 0x0, be_update_ndn = {
            bv_len = 0, bv_val = 0x0}, be_update_refs = 0x0, be_pending_csn_list = 0x0, be_pcl_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0,
              __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, be_syncinfo = 0x0, be_pb = 0x0,
          be_cf_ocs = 0x7fcbb63355a0 <cf_ocs+448>, be_private = 0x0, be_next = {stqe_next = 0x7fcbb643fa00}}
        cb = {sc_next = 0x0, sc_response = 0x7fcbb5fb5629 <over_back_response>, sc_cleanup = 0x0, sc_writewait = 0x0, sc_private = 0x7fcbb6443430}
        sc = 0x7fcbb6c0dc38
        rc = 32768
        __PRETTY_FUNCTION__ = "over_op_func"
#4  0x00007fcbb5fb69de in over_op_bind (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30) at backover.c:738
No locals.
#5  0x00007fcbb5f47697 in do_bind (op=0x7fcbb6c0dc00, rs=0x7fcbad881a30) at bind.c:205
        ber = 0x7fcb8c1a9510
        version = 3
        method = 128
        mech = {bv_len = 0, bv_val = 0x0}
        dn = {bv_len = 41, bv_val = 0x7fcb88344a3a "cn=xxxxbnd,ou=bind,dc=xxxx,dc=xxxx.xx"}
        tag = 128
        be = 0x0
#6  0x00007fcbb5f1df19 in connection_operation (ctx=0x7fcbad881b80, arg_v=0x7fcbb6c0dc00) at connection.c:1155
        rc = 80
        cancel = 0
        op = 0x7fcbb6c0dc00
        rs = {sr_type = REP_RESULT, sr_tag = 97, sr_msgid = 2, sr_err = 52, sr_matched = 0x0, sr_text = 0x7fcbb60e698a "Start TLS failed", sr_ref = 0x0, sr_ctrls = 0x0, sr_un = {
            sru_search = {r_entry = 0x0, r_attr_flags = 0, r_operational_attrs = 0x0, r_attrs = 0x0, r_nentries = 0, r_v2ref = 0x0}, sru_sasl = {r_sasldata = 0x0}, sru_extended = {
              r_rspoid = 0x0, r_rspdata = 0x0}}, sr_flags = 0}
        tag = 96
        opidx = SLAP_OP_BIND
        conn = 0x7fcbae0f22d0
        memctx = 0x7fcbb64fada0
        memctx_null = 0x0
        memsiz = 1048576
        __PRETTY_FUNCTION__ = "connection_operation"
#7  0x00007fcbb5f1e4e1 in connection_read_thread (ctx=0x7fcbad881b80, argv=0x281) at connection.c:1291
        rc = 0
        cri = {op = 0x7fcbb6c0dc00, func = 0x0, arg = 0x0, ctx = 0x7fcbad881b80, nullop = 0}
        s = 641
#8  0x00007fcbb5a53c3c in ldap_int_thread_pool_wrapper (xpool=0x7fcbb640af70) at tpool.c:688
        pool = 0x7fcbb640af70
        task = 0x7fcba8787f80
        work_list = 0x7fcbb640b008
        ctx = {ltu_id = 140512766469888, ltu_key = {{ltk_key = 0x7fcbb5f1da15 <conn_counter_init>, ltk_data = 0x7fcbb64fac90, ltk_free = 0x7fcbb5f1d817 <conn_counter_destroy>}, {
              ltk_key = 0x7fcbb5f95b15 <slap_sl_mem_init>, ltk_data = 0x7fcbb64fada0, ltk_free = 0x7fcbb5f95944 <slap_sl_mem_destroy>}, {ltk_key = 0x7fcbb5f3991c <slap_op_free>,
              ltk_data = 0x7fcbb6c0ce30, ltk_free = 0x7fcbb5f39874 <slap_op_q_destroy>}, {ltk_key = 0x0, ltk_data = 0x0, ltk_free = 0x0} <repeats 29 times>}}
        kctx = 0x0
        i = 32
        keyslot = 838
        hash = 2545024838
        __PRETTY_FUNCTION__ = "ldap_int_thread_pool_wrapper"
#9  0x00007fcbb4e6c806 in start_thread (arg=<optimized out>) at pthread_create.c:301
        __res = <optimized out>
        pd = 0x7fcbad882700
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140512766469888, -3966806835471793915, 140512774851584, 140512766468096, 0, 8388608, 3991395584477806853, 3991451503570501893},
              mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        robust = <optimized out>
        freesize = <optimized out>
        __PRETTY_FUNCTION__ = "start_thread"
#10 0x00007fcbb36b702d in gnu_dev_makedev (major=3066092544, minor=<optimized out>) at ../sysdeps/unix/sysv/linux/makedev.c:37
No locals.
#11 0x0000000000000000 in ?? ()
No symbol table info available.
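
Read together with the kernel records from the description, frame #0 above fits a NULL pointer dereference: lc is 0x0 at bind.c:319, rc is 52 (the LDAP result code for "unavailable"), the result text is "Start TLS failed", and the kernel reported "segfault at 40 ... error 4". A fault address of 0x40 is what a read of a member located 0x40 bytes into a structure produces when the structure pointer is NULL. The sketch below shows both pieces of arithmetic. The struct layout and the member name are hypothetical and are not taken from the back-ldap sources; the error-code bits are the standard x86 page-fault bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection record; this is NOT the real back-ldap layout. */
struct demo_conn {
    char     earlier_members[0x40]; /* stand-in for whatever precedes the field */
    unsigned refcnt;                /* hypothetical member at offset 0x40 */
};

/* Address a load of conn->refcnt would touch, computed without dereferencing. */
uintptr_t refcnt_address(const struct demo_conn *conn)
{
    return (uintptr_t)conn + offsetof(struct demo_conn, refcnt);
}

/* The three low bits of an x86 page-fault error code. */
struct pf_error {
    int page_present; /* bit 0: 0 = page not mapped, 1 = protection violation */
    int is_write;     /* bit 1: 0 = read access, 1 = write access */
    int user_mode;    /* bit 2: 0 = kernel mode, 1 = user mode */
};

/* Split an error code such as the "error 4" in the kernel record into its bits. */
struct pf_error decode_pf_error(unsigned code)
{
    struct pf_error e;
    e.page_present = (int)(code & 1u);
    e.is_write     = (int)((code >> 1) & 1u);
    e.user_mode    = (int)((code >> 2) & 1u);
    return e;
}
```

With a NULL pointer, refcnt_address yields 0x40, and error code 4 decodes to a user-mode read of an unmapped page. Both match the kernel records, so the crash looks like a read through lc after the connection had already been given up.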
Comment 8 Howard Chu 2015-07-16 02:49:49 UTC
Adrian.Raemy@vtg.admin.ch wrote:
>
> Dear Howard,
>
> Below the bt full with Openldap 2.4.40 built with "-g" flag.

> Let me know if you need more...hope you have a solution for the problem because it is really a Problem for us..

If it was such a big problem, why did you not report it when you first 
encountered it in 2.4.26, which was released over 4 years ago?

You waited this long, clearly it's not that high a priority. That makes it 
even less of a priority for an open source project comprised of volunteers who 
have their own interests to pursue. Your repeating how big a problem you 
perceive it to be only comes across as whining, which makes folks even less 
interested in helping.

> slapd of backend ldap crashes 4-5 times per day.

Fixed now in git master. Please test and reply back with your results.

> Regards
> Adrian


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 9 Howard Chu 2015-07-16 02:50:47 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 10 adrian.raemy@vtg.admin.ch 2015-07-17 19:21:48 UTC
Dear Howard,

We didn't use back-ldap as a proxy backend in the past; we had a very old OpenLDAP and used syncrepl.
Then we changed our IT infrastructure and had to use an LDAP proxy. That was around 3 months ago.
So we tried 2.4.26 as shipped with SLES and hit the problem there, then tried the latest release, and it was the same with both.

Thanks a lot for your fix.
So far it looks like it is working fine; slapd on the LDAP proxy hasn't crashed in 24 hours.
No segfaults at the moment.
We will monitor it over the next few days and weeks, and if you like we can give you feedback again.


ITS#8173 fix SEGV after failed retry

diff --git a/servers/slapd/back-ldap/bind.c b/servers/slapd/back-ldap/bind.c
index 598dae3..20197f3 100644
--- a/servers/slapd/back-ldap/bind.c
+++ b/servers/slapd/back-ldap/bind.c
@@ -271,6 +271,8 @@ retry:;
 		if ( ldap_back_retry( &lc, op, rs, LDAP_BACK_BIND_SERR ) ) {
 			goto retry;
 		}
+		if ( !lc )
+			return( rc );
 	}
 
 	ldap_pvt_thread_mutex_lock( &li->li_counter_mutex );
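
The added check implies that a failed ldap_back_retry() can leave lc set to NULL, so the code after the retry block must not touch lc in that case. The following is a minimal, self-contained sketch of that pattern, not the back-ldap code itself: demo_conn, demo_retry, demo_bind and the DEMO_* constants are hypothetical stand-ins, and the retry helper is assumed to release the connection when the reconnect fails.

```c
#include <stdlib.h>

enum { DEMO_SUCCESS = 0, DEMO_UNAVAILABLE = 52 };

/* Hypothetical connection record, not the real back-ldap type. */
struct demo_conn {
    int binds; /* number of binds completed on this connection */
};

/*
 * Hypothetical retry helper. Returns 1 if the reconnect worked. On failure it
 * releases the connection and sets *connp to NULL.
 */
int demo_retry(struct demo_conn **connp, int reconnect_ok)
{
    if (reconnect_ok)
        return 1;
    free(*connp);
    *connp = NULL;
    return 0;
}

/* Bind once, retrying a single time if the first attempt fails. */
int demo_bind(struct demo_conn *lc, int first_attempt_ok, int reconnect_ok)
{
    int rc = first_attempt_ok ? DEMO_SUCCESS : DEMO_UNAVAILABLE;
    int retried = 0;

retry:
    if (rc == DEMO_UNAVAILABLE && !retried) {
        retried = 1;
        if (demo_retry(&lc, reconnect_ok)) {
            rc = DEMO_SUCCESS; /* pretend the second attempt succeeded */
            goto retry;
        }
        /* Same shape as the added check: the retry failed and lc is gone. */
        if (lc == NULL)
            return rc;
    }

    lc->binds++; /* without the guard above, this would dereference NULL */
    return rc;
}
```

In this sketch a caller that gets DEMO_UNAVAILABLE back must treat the connection as already released. Removing the NULL guard reproduces the kind of crash seen in the backtrace: the retry fails, lc becomes NULL, and the next statement reads through it.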




Comment 11 adrian.raemy@vtg.admin.ch 2015-08-18 07:51:50 UTC
Dear Howard,

The fix has now been running stably for weeks.
slapd hasn't crashed anymore. Thank you for the help.

I guess we can close the ticket.

Best Regards
Adrian 

Comment 12 Howard Chu 2015-08-20 13:36:12 UTC
Adrian.Raemy@vtg.admin.ch wrote:
> Dear Howard,
>
> The fix runs stable now over weeks.
> slapd didn't crashed anymore. Thank you for the help.
>
> I guess we can close the ticket.

Thanks for the followup.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 13 Quanah Gibson-Mount 2015-08-21 21:52:19 UTC
changed notes
changed state Test to Release
Comment 14 OpenLDAP project 2015-11-30 18:20:57 UTC
fixed in master
fixed in RE25
fixed in RE24 (2.4.43)
Comment 15 Quanah Gibson-Mount 2015-11-30 18:20:57 UTC
changed notes
changed state Release to Closed