Issue 9340 - Setting slapd.conf listener-threads > 1 causes Assertion failed: SLAP_SOCK_NOT_ACTIVE(id, s)
Summary: Setting slapd.conf listener-threads > 1 causes Assertion failed: SLAP_SOCK_NO...
Status: VERIFIED DUPLICATE of issue 7926
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: 2.4.52
Hardware: Other Solaris
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-08 14:10 UTC by stacey.marshall
Modified: 2021-01-21 22:44 UTC (History)
0 users

See Also:


Attachments
Output from -d 5 option (43.78 KB, text/plain)
2020-09-08 14:10 UTC, stacey.marshall
Details

Note You need to log in before you can comment on or make changes to this issue.
Description stacey.marshall 2020-09-08 14:10:43 UTC
Created attachment 761 [details]
Output from -d 5 option

This is not a new issue, the same is observed on version 2.4.45 through 2.4.53

Setting listener-threads via slapd.conf file on Sun4v (SPARC T4-1)
system to a value other than 1 results in "Assertion failed:
SLAP_SOCK_NOT_ACTIVE(id, s), file daemon.c" when starting slapd.

The same does not occur when using Directory (-F) configuration.

    # grep '^listener-threads' /etc/openldap/slapd.conf
    listener-threads 8
    #  /usr/lib/slapd -u openldap -g openldap \
    >    -f /etc/openldap/slapd.conf   \
    >    -h 'ldap:/// ldapi:/// ldaps:///' -d 5 > slapd.error 2>&1
    Abort (core dumped)
    # tail slapd.error
    5f578024 >>> dnNormalize: <cn=Current>
    5f578024 <<< dnNormalize: <cn=current>
    5f578024 >>> dnNormalize: <cn=Uptime>
    5f578024 <<< dnNormalize: <cn=uptime>
    5f578024 >>> dnNormalize: <cn=Read>
    5f578024 <<< dnNormalize: <cn=read>
    5f578024 >>> dnNormalize: <cn=Write>
    5f578024 <<< dnNormalize: <cn=write>
    5f578024 slapd starting
    Assertion failed: SLAP_SOCK_NOT_ACTIVE(id, s), file daemon.c, line 903
    #  /usr/lib/slapd -VVV
    @(#) $OpenLDAP: slapd 2.4.53 (Sep  8 2020 03:15:21) $
            openldap
    
    Included static overlays:
        accesslog
        auditlog
        collect
        constraint
        dds
        deref
        dyngroup
        dynlist
        memberof
        ppolicy
        pcache
        refint
        retcode
        rwm
        seqmod
        sssvlv
        syncprov
        translucent
        unique
        valsort
    Included static backends:
        config
        ldif
        monitor
        ldap
        mdb
        meta
        null
        passwd
        relay
        shell
    #

When running under debugger the issue is not observed.

The core file reveals the following threads:

    # /opt/solarisstudio12.4/bin/dbx /usr/lib/slapd core
    For information about new features see `help changes'
    To remove this message, put `dbxenv suppress_startup_message 8.0' in your .dbxrc
    Reading slapd
    core file header read successfully
    Reading ld.so.1
    Reading libc.so.1
    Reading libldap_r-2.4.so.2.11.1
    Reading liblber-2.4.so.2.11.1
    Reading libsasl2.so.3.0.0
    Reading liblogin.so.3.0.0
    Reading libgssapiv2.so.3.0.0
    Reading libgssapi_krb5.so.2.2
    Reading libkrb5.so.3.3
    Reading libcom_err.so.3.0
    Reading libkrb5support.so.0.1
    Reading libk5crypto.so.3.1
    Reading libucrypto.so.1
    Reading libbsm.so.1
    Reading libtsol.so.2
    Reading libinetutil.so.1
    Reading libcryptoutil.so.1
    Reading libelf.so.1
    Reading libz.so.1
    Reading libresolv.so.2
    Reading libkwarn.so.1
    Reading libplain.so.3.0.0
    Reading libotp.so.3.0.0
    Reading libcrypto.so.1.0.0
    Reading libsasldb.so.3.0.0
    Reading libdb-5.3.so
    Reading libscram.so.3.0.0
    Reading libltdl.so.7.3.1
    Reading libssl.so.1.0.0
    Reading libuuid.so.1
    t@2 (l@2) terminated by signal ABRT (Abort)
    0x0007fdd05cce17a8: __lwp_sigqueue+0x0008:      bcc,a,pt  %icc,__lwp_sigqueue+0x18      ! 0x7fdd05cce17b8
    (dbx) lwps
      l@1 LWP suspended in uucopy()
    o>l@2 signal SIGABRT in __lwp_sigqueue()
      l@3 LWP suspended in __pollsys()
      l@4 LWP suspended in __pollsys()
      l@5 LWP suspended in __pollsys()
    (dbx) where
    current thread: t@2
    =>[1] __lwp_sigqueue(0x0, 0xa00386a7d98, 0x6, 0x0, 0xffffffffffffffff, 0x0), at 0x7fdd05cce17a8
      [2] raise(0x6, 0x7fdd0583ff060, 0x5, 0x5, 0x0, 0x0), at 0x7fdd05cc2af0c
      [3] abort(0x1, 0x1210, 0x0, 0x1000, 0x7fdd05ce3c278, 0x1a278), at 0x7fdd05cbfa144
      [4] _assert(0x10005d7a8, 0x10005d7c8, 0x387, 0x9, 0x8200, 0x7fdd05ce22000), at 0x7fdd05cbfb158
      [5] slapd_daemon_task(0xc4, 0xffffffffffcdb7c8, 0x100382000, 0x324800, 0x61bf4f2270, 0x100406d3c), at 0x1000c118c
    (dbx) lwp l@1
    t@1 (l@1) stopped in uucopy at 0x7fdd05cce1d34
    0x0007fdd05cce1d34: uucopy+0x0008:      blu      __cerror       ! 0x7fdd05cbdce80
    (dbx) where
    =>[1] uucopy(0x0, 0x7fdd0553fff10, 0xb0, 0xfffffffffffffff8, 0x0, 0xfffffe59ec25a990), at 0x7fdd05cce1d34
      [2] setup_top_frame(0x7fdd054c00000, 0x7fffc0, 0x7fdd0553fffc0, 0x7fdd0553fffc0, 0x0, 0xfffffe59ec25adf0), at 0x7fdd05ccdc494
      [3] setup_context(0xfffffe59ec25ab10, 0x7fdd05ccdc538, 0x7fdd058492240, 0x7fdd054c00000, 0x7fffc0, 0x1), at 0x7fdd05ccdc4dc
      [4] _thrp_create(0x0, 0x800000, 0x1000becb0, 0x61bfa6bed0, 0x80, 0xfffffe59ec25aeb0), at 0x7fdd05ccd7f64
      [5] pthread_create(0x61bfa6bed0, 0xfffffe59ec25af78, 0x1000becb0, 0x61bfa6bed0, 0x0, 0x0), at 0x7fdd05ccc80e4
      [6] ldap_pvt_thread_create(0x61bfa6bed0, 0x0, 0x1000becb0, 0x61bfa6bed0, 0x0, 0x61bfa6bec0), at 0x7fdd05c81b324
      [7] slapd_daemon(0x100406db8, 0x4, 0x10042710c, 0x100389490, 0x100406d38, 0x100382000), at 0x1000c1404
      [8] main(0x8800, 0x10038aa88, 0x1, 0x100386038, 0x1, 0x10038aa84), at 0x10009ec34
    (dbx) lwp l@3
    t@3 (l@3) stopped in __pollsys at 0x7fdd05cce13fc
    0x0007fdd05cce13fc: __pollsys+0x0008:   blu      __cerror       ! 0x7fdd05cbdce80
    (dbx) where
    =>[1] __pollsys(0x4, 0x2, 0x0, 0x0, 0x0, 0x0), at 0x7fdd05cce13fc
      [2] _pollsys(0x7fdd0577ff2a0, 0x2, 0x0, 0x0, 0x0, 0x10), at 0x7fdd05cccd6a8
      [3] pselect(0x10, 0x7fdd0577ffc30, 0x7fdd0577ff2a0, 0x7fdd05ce24c78, 0x0, 0x0), at 0x7fdd05cc2cb2c
      [4] select(0x10, 0x7fdd0577ffc30, 0x0, 0x0, 0x0, 0x5), at 0x7fdd05cc2ced0
      [5] slapd_daemon_task(0x61bf4e9890, 0x0, 0x100382000, 0x1, 0x1004010c8, 0x5), at 0x1000c014c
    (dbx) lwp l@4
    t@4 (l@4) stopped in __pollsys at 0x7fdd05cce13fc
    0x0007fdd05cce13fc: __pollsys+0x0008:   blu      __cerror       ! 0x7fdd05cbdce80
    (dbx) where
    =>[1] __pollsys(0x4, 0x2, 0x0, 0x0, 0x0, 0x0), at 0x7fdd05cce13fc
      [2] _pollsys(0x7fdd056bff370, 0x2, 0x0, 0x0, 0x0, 0x10), at 0x7fdd05cccd6a8
      [3] pselect(0x12, 0x7fdd056bffd10, 0x7fdd056bff370, 0x7fdd05ce24c78, 0x0, 0x0), at 0x7fdd05cc2cb2c
      [4] select(0x12, 0x7fdd056bffd10, 0x0, 0x0, 0x0, 0x5), at 0x7fdd05cc2ced0
      [5] slapd_daemon_task(0x61bf4e9890, 0x0, 0x100382000, 0x2, 0x1004016f0, 0x5), at 0x1000c014c
    (dbx) lwp l@5
    t@5 (l@5) stopped in __pollsys at 0x7fdd05cce13fc
    0x0007fdd05cce13fc: __pollsys+0x0008:   blu      __cerror       ! 0x7fdd05cbdce80
    (dbx) where
    =>[1] __pollsys(0x1, 0x2, 0x0, 0x0, 0x0, 0x0), at 0x7fdd05cce13fc
      [2] _pollsys(0x7fdd055fff340, 0x2, 0x0, 0x0, 0x0, 0x10), at 0x7fdd05cccd6a8
      [3] pselect(0x14, 0x7fdd055fffcf0, 0x7fdd055fff340, 0x7fdd05ce24c78, 0x0, 0x0), at 0x7fdd05cc2cb2c
      [4] select(0x14, 0x7fdd055fffcf0, 0x0, 0x0, 0x0, 0x5), at 0x7fdd05cc2ced0
      [5] slapd_daemon_task(0x61bf4e9890, 0x0, 0x100382000, 0x3, 0x100401d18, 0x5), at 0x1000c014c
    (dbx)
Comment 1 Howard Chu 2020-09-08 14:35:41 UTC
Please recompile without any optimization flags, and with debug symbols.

Would need to see the variables in the slapd_daemon_task frame of the crashing thread.
Comment 2 stacey.marshall 2020-09-11 17:15:05 UTC
Hi,

Sorry for the delay.  Having compiled without optimization and with -g
the stack is as follows:

(dbx) where
current thread: t@2
  [1] __lwp_sigqueue(0x0, 0xa0037948a68, 0x6, 0x0, 0xffffffffffffffff, 0x0), at 0x7fecce46e17a8
  [2] raise(0x6, 0x7feccdfbff0c0, 0x5, 0x5, 0x0, 0x0), at 0x7fecce462af0c
  [3] abort(0x1, 0x1210, 0x0, 0x1000, 0x7fecce483c278, 0x1a278), at 0x7fecce45fa144
  [4] _assert(0x10006ada0, 0x10006adc0, 0x387, 0x0, 0xd3f6300140, 0x7fecce4822000), at 0x7fecce45fb158
=>[5] slapd_add(s = 9, isactive = 0, sl = 0xd3f6300310, id = 1), line 903 in "daemon.c"
  [6] slapd_daemon_task(ptr = 0xd3f6879f60), line 2392 in "daemon.c"

I compiled up a version with LDAP_THREAD_DEBUG which exited with:

 thr_debug.c:434: ldap_pvt_thread_mutex_lock error: usage->magic is 0

Using truss I'm seeing that some of the initialization comes from cyrus-sasl.  
Here we have v2.1.26 (+fixes), while I see 2.1.27 is released (with updates provided by Howard Chu).

Let me know if I can provide anything else from the dump.
Comment 3 Howard Chu 2020-09-11 18:31:46 UTC
(In reply to stacey.marshall from comment #2)
> Hi,
> 
> Sorry for the delay.  Having compiled without optimization and with -g
> the stack is as follows:
> 
> (dbx) where
> current thread: t@2
>   [1] __lwp_sigqueue(0x0, 0xa0037948a68, 0x6, 0x0, 0xffffffffffffffff, 0x0),
> at 0x7fecce46e17a8
>   [2] raise(0x6, 0x7feccdfbff0c0, 0x5, 0x5, 0x0, 0x0), at 0x7fecce462af0c
>   [3] abort(0x1, 0x1210, 0x0, 0x1000, 0x7fecce483c278, 0x1a278), at
> 0x7fecce45fa144
>   [4] _assert(0x10006ada0, 0x10006adc0, 0x387, 0x0, 0xd3f6300140,
> 0x7fecce4822000), at 0x7fecce45fb158
> =>[5] slapd_add(s = 9, isactive = 0, sl = 0xd3f6300310, id = 1), line 903 in
> "daemon.c"
>   [6] slapd_daemon_task(ptr = 0xd3f6879f60), line 2392 in "daemon.c"
> 
> I compiled up a version with LDAP_THREAD_DEBUG which exited with:
> 
>  thr_debug.c:434: ldap_pvt_thread_mutex_lock error: usage->magic is 0

You also have to run this under the debugger, to get a stack trace
for where that error occurred. (thread_debug.c:434 is simply where
that error message is.)

> Using truss I'm seeing that some of the initialization comes from
> cyrus-sasl.  
> Here we have v2.1.26 (+fixes), while I see 2.1.27 is released (with updates
> provided by Howard Chu).
> 
> Let me know if I can provide anything else from the dump.
Comment 4 Howard Chu 2021-01-21 22:09:06 UTC
Isn't this already fixed by ITS#7926?
Comment 5 Quanah Gibson-Mount 2021-01-21 22:44:23 UTC

*** This issue has been marked as a duplicate of issue 7926 ***