
SIGSEGV in slapd (ad.c) on Solaris 10




I'm still seeing SIGSEGVs regularly on the busier of our LDAP servers (800K connections/day, 20-20K search ops per connection).


What I can't get is a reproducible case. It happens far more often on the busier server and not at all
on the least loaded servers. It only happens on SPARC; the Debian install on x86 hardware
seems stable. Any pointers for getting a reproducible case? And what should I be trying to get out of
the core files beyond the basic trace below? (Parse the logfiles with Perl and replay?)


If anyone else is using a recent 2.3.x or 2.4 build on Solaris, can they tell me if my build environment
is sane before I try upgrading to 2.4.10?


Cheers,
         Duncan

DB_CONFIG
#
set_cachesize   0       209715200        1
set_lg_dir /trans_logs

set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_max 16777216

set_tmp_dir /tmp
set_flags DB_LOG_AUTOREMOVE
---

Build Environment

LDFLAGS=-R/usr/sfw/lib:/usr/local/ssl/lib:/usr/local/lib -L/usr/local/lib -L/usr/local/ssl/lib -L/usr/sfw/lib
CPPFLAGS=-I/usr/local/include -I/usr/local/ssl/include -I/usr/sfw/include
LD=/usr/ccs/bin/ld -R/usr/sfw/lib:/usr/local/ssl/lib:/usr/local/lib


setenv PATH "/usr/ccs/bin/:/usr/bin:/bin:/usr/sbin:/opt/local/bin:/usr/sfw/bin:/usr/local/bin"
setenv LDFLAGS "-R/usr/sfw/lib:/usr/local/ssl/lib:/usr/local/lib -L/usr/local/lib -L/usr/local/ssl/lib -L/usr/sfw/lib"
setenv CPPFLAGS "-I/usr/local/include -I/usr/local/ssl/include -I/usr/sfw/include"
setenv LD "/usr/ccs/bin/ld -R/usr/sfw/lib:/usr/local/ssl/lib:/usr/local/lib"
---


Configured with
./configure --disable-ipv6 --with-cyrus-sasl --with-tls --enable-dynamic --enable-slapd --enable-modules --enable-spasswd --enable-rewrite --enable-rlookups --enable-wrappers --enable-hdb --enable-monitor
--disable-shell --disable-sql --enable-overlays=mod --enable-crypt
---


Compiled with gcc 3.4.3. I've tried Sun Studio as well but experienced similar crashes.


dbx gives me the following output

For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.4' in your .dbxrc
Reading slapd
core file header read successfully
Reading ld.so.1
Reading libldap_r-2.4.so.2.0.4
Reading liblber-2.4.so.2.0.4
Reading libltdl.so.3.1.5
Reading libdb-4.6.so
Reading librt.so.1
Reading libpthread.so.1
Reading libicuuc.so.2
Reading libicudata.so.2
Reading libsasl2.so.2.0.22
Reading libdl.so.1
Reading libssl.so.0.9.8
Reading libcrypto.so.0.9.8
Reading libresolv.so.2
Reading libgen.so.1
Reading libnsl.so.1
Reading libsocket.so.1
Reading libc.so.1
Reading libgcc_s.so.1
Reading libgcc_s.so.1
Reading libaio.so.1
Reading libmd5.so.1
Reading libm.so.2
Reading libCrun.so.1
Reading libc_psr.so.1
Reading libgssapiv2.so.2.0.22
Reading libgssapi.so.4.0.0
Reading libkrb5.so.17.4.0
Reading libasn1.so.6.1.0
Reading libroken.so.16.1.0
Reading libcom_err.so.1.1.3
Reading libncurses.so.5.4
Reading libdoor.so.1
Reading libscf.so.1
Reading libuutil.so.1
Reading libmd5_psr.so.1
Reading libmp.so.2
Reading liblogin.so.2.0.22
Reading libplain.so.2.0.22
Reading syncprov-2.4.so.2.0.4
t@1 (l@1) terminated by signal KILL (Killed)
0xfe4bd61c: __lwp_wait+0x0008: bcc,a,pt %icc,__lwp_wait+0x18 ! 0xfe4bd62c
Current function is slapd_daemon
2644 ldap_pvt_thread_join( listener_tid, (void *)NULL );
(dbx) threads
> t@1 a l@1 ?() LWP suspended in __lwp_wait()
t@3 a l@3 slapd_daemon_task() LWP suspended in __pollsys()
t@4 a l@4 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@5 a l@5 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@6 a l@6 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@7 a l@7 ldap_int_thread_pool_wrapper() LWP suspended in ch_malloc()
t@8 a l@8 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@9 a l@9 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@10 a l@10 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@11 a l@11 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
t@12 a l@12 ldap_int_thread_pool_wrapper() sleep on 0x1dad38 in __lwp_park()
o t@13 a l@13 ldap_int_thread_pool_wrapper() signal SIGSEGV in is_ad_subtype()
(dbx) thread t@13
t@13 (l@13) stopped in is_ad_subtype at line 489 in file "ad.c"
489 for ( a = sub->ad_type; a; a=a->sat_sup ) {
(dbx) where
current thread: t@13
=>[1] is_ad_subtype(sub = (nil), super = 0x1f81b0), line 489 in "ad.c"
[2] attrs_find(a = 0x499014, desc = 0x1f81b0), line 647 in "attr.c"
[3] test_ava_filter(op = 0x6645a90, e = 0x25738c, ava = 0x9eddee4, type = 163), line 615 in "filterentry.c"
[4] test_filter(op = 0x6645a90, e = 0x25738c, f = 0x9eddf04), line 98 in "filterentry.c"
[5] hdb_search(op = 0x6645a90, rs = 0xe87ffcb8), line 845 in "search.c"
[6] overlay_op_walk(op = 0x6645a90, rs = 0xe87ffcb8, which = 32768, oi = 0x15c540, on = 0x8000), line 653 in "backover.c"
[7] over_op_func(op = 0x6645a90, rs = 0xe87ffcb8, which = op_search), line 705 in "backover.c"
[8] fe_op_search(op = 0x6645a90, rs = 0xe87ffcb8), line 368 in "search.c"
[9] do_search(op = 0x6645a90, rs = 0xe87ffcb8), line 217 in "search.c"
[10] connection_operation(ctx = 0xe87ffe08, arg_v = 0x6645a90), line 1084 in "connection.c"
[11] connection_read_thread(ctx = 0xe87ffe08, argv = 0xe), line 1211 in "connection.c"
[12] ldap_int_thread_pool_wrapper(xpool = 0x1dad18), line 625 in "tpool.c"
(dbx) lsit
lsit: not found
(dbx) list
489 for ( a = sub->ad_type; a; a=a->sat_sup ) {
490 if ( a == super->ad_type ) break;
491 }
492 if( !a ) {
493 return 0;
494 }
495
496 /* ensure sub does support all flags of super */
497 lr = sub->ad_tags.bv_len ? SLAP_DESC_TAG_RANGE : 0;
498 if(( super->ad_flags & ( sub->ad_flags | lr )) != super->ad_flags ) {
(dbx) print *a
dbx: reference through nil pointer
(dbx) quit
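
Reading the trace: frame [2] passes desc = 0x1f81b0, which shows up in frame [1] as super, so the nil sub presumably comes from an attribute in the entry whose descriptor pointer is NULL. Below is a minimal sketch of the loop at ad.c:489 with a NULL guard, plus a helper that counts such attributes in a list. The types attr_type, attr_desc and attr and all three function names are made-up stand-ins, not the slapd definitions, and the flag/tag checks that follow in the real function are left out. A guard like this would only hide the symptom; the real question is how the entry ended up with a NULL descriptor in the first place.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the slapd structures seen in the listing. */
typedef struct attr_type {
    struct attr_type *sup;      /* supertype link, like sat_sup */
} attr_type;

typedef struct attr_desc {
    attr_type *type;            /* like ad_type */
} attr_desc;

typedef struct attr {
    attr_desc *desc;            /* descriptor of this attribute */
    struct attr *next;          /* next attribute in the entry */
} attr;

/* Same supertype walk as ad.c lines 489-494, but a NULL argument
 * returns 0 instead of faulting on sub->type. */
static int is_subtype_guarded(const attr_desc *sub, const attr_desc *super)
{
    const attr_type *t;

    if (sub == NULL || super == NULL)
        return 0;
    for (t = sub->type; t != NULL; t = t->sup) {
        if (t == super->type)
            return 1;
    }
    return 0;
}

/* Return the first attribute whose descriptor is a subtype of desc,
 * or NULL if there is none. */
static const attr *attrs_find_guarded(const attr *a, const attr_desc *desc)
{
    for (; a != NULL; a = a->next) {
        if (is_subtype_guarded(a->desc, desc))
            return a;
    }
    return NULL;
}

/* Debugging aid: how many attributes in the list have no descriptor? */
static int count_null_descs(const attr *a)
{
    int n = 0;

    for (; a != NULL; a = a->next) {
        if (a->desc == NULL)
            n++;
    }
    return n;
}
```

Calling something like count_null_descs() on the entry before the filter test would show whether the entry is already damaged when it reaches the search code, which seems more useful for narrowing this down than the guard itself.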


--
The University of St Andrews is a charity registered in Scotland : No SC013532