Re: (ITS#3989) syncprov core dumps when combined with uniqueness overlay




--On Wednesday, August 31, 2005 2:33 AM -0700 Howard Chu <hyc@symas.com> 
wrote:

> The backtraces attached here are incomplete. Note that even though you
> specified "thread apply all" in the backtrace command, gdb stopped
> tracing after thread 3 because gdb thought that thread's stack trace was
> corrupted. You have to manually issue backtrace commands for threads 2
> and 1 in this case in order to get a complete picture of what's going on.
> (Really I believe gdb is just being stupid here and not able to detect
> that it reached the top of thread 3's stack; there obviously cannot be
> anything beyond lwp_start. Perhaps a newer gdb would recognize it.)
>
> quanah@stanford.edu wrote:
>> gdb says:
>>
>> # 0  0x000d8c08 in syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at
>> syncprov.c:947
>> 947     syncprov.c: No such file or directory.
>>         in syncprov.c
>> (gdb) thr apply all bt
>>
>> Thread 4 (process 68118    ):
>> # 0  0xfee1f340 in _lwp_wait () from /usr/lib/libc.so.1
>> # 1  0xfed5d7b8 in lwp_wait () from /usr/lib/lwp/libthread.so.1
>> # 2  0xfed590a0 in _thrp_join () from /usr/lib/lwp/libthread.so.1
>> # 3  0x000245dc in slapd_daemon () at daemon.c:2045
>> # 4  0x00016a18 in main ()
>>
>> Thread 3 (process 330262    ):
>> # 0  0x000a0460 in bdb_entry_get (op=0xb5d3a8, ndn=0x5d3fdaf0, oc=0x1b18e0,
>> at=0x15ac00, rw=0, ent=0x5d3fd56c) at id2entry.c:386
>> # 1  0x00032024 in be_entry_get_rw (op=0x0, ndn=0x5d3fdaf0, oc=0x1b18e0,
>> at=0x15ac00, rw=0, e=0x5d3fd56c) at backend.c:1194
>> # 2  0x000320c4 in fe_acl_group (op=0xb5d3a8, target=0x5d3ff470,
>> gr_ndn=0x5d3fdaf0, op_ndn=0xb5d440, group_oc=0x1b18e0,
>> group_at=0x1d6db8) at backend.c:1239
>> # 3  0x000325a0 in backend_group (op=0xb5d3a8, target=0x5d3ff470,
>> gr_ndn=0x5d3fdaf0, op_ndn=0xb5d440, group_oc=0x1b18e0,
>> group_at=0x1d6db8) at backend.c:1390
>> # 4  0x0004497c in slap_acl_mask (a=0x1d58e8, mask=0x5d3fdfb4, op=0xb5d3a8,
>> e=0x5d3ff470, desc=0x1d85d0, val=0x1886008, nmatch=100,
>> matches=0x5d3fdfb8, count=3, state=0x5d3fe9a8) at acl.c:1845
>> # 5  0x00042c50 in access_allowed_mask (op=0xb5d3a8, e=0x5d3ff470, desc=0x225220,
>> val=0x1886008, access=ACL_WADD, state=0x5d3fe9a8, maskp=0x0) at acl.c:732
>> # 6  0x00045ab8 in acl_check_modlist (op=0xb5d3a8, e=0x5d3ff470, mlist=0x1530b60)
>> at acl.c:2350
>> # 7  0x00076ad8 in bdb_modify_internal (op=0xb5d3a8, tid=0x16174c0,
>> modlist=0x1530b60, e=0x5d3ff470, text=0x5d3ffd6c, textbuf=0x5d3ff4b0 "",
>> textlen=256)
>>     at modify.c:49
>> # 8  0x0007790c in bdb_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:467
>> # 9  0x00070b30 in overlay_op_walk (op=0xb5d3a8, rs=0x5d3ffd58, which=32768,
>> oi=0x151d54, on=0x8000) at backover.c:488
>> # 10 0x00070c24 in over_op_func (op=0xb5d3a8, rs=0x5d3ffd58, which=op_modify) at
>> backover.c:540
>> # 11 0x0003a540 in fe_op_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:417
>> # 12 0x00039d48 in do_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:200
>> # 13 0x00026974 in connection_operation (ctx=0xf2800, arg_v=0xb5d3a8) at
>> connection.c:1061
>> # 14 0xff31cd38 in ldap_int_thread_pool_wrapper (xpool=0x189018) at tpool.c:478
>> # 15 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
>> # 16 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
>> Previous frame identical to this frame (corrupt stack?)
>> 0x000d8c08      947     in syncprov.c
>> (gdb) quit


Sadly, this already is my newer version of gdb (6.3), the one with large file 
support (these core files are around 2.7GB).
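
Since gdb gives up partway through "thread apply all bt", it can help to check 
a saved log for threads it never reached before going back for them by hand. 
A minimal Python sketch, assuming each thread starts with a "Thread N (" header 
as in the output quoted above and that threads are numbered contiguously down 
to 1. The helper name missing_threads is hypothetical, not part of gdb or 
OpenLDAP, and the sample log is abbreviated:

```python
import re

# Matches gdb's per-thread header, e.g. "Thread 3 (process 330262    ):"
THREAD_RE = re.compile(r"^Thread (\d+) \(")


def missing_threads(bt_text):
    """Return thread numbers absent from a 'thread apply all bt' log.

    Assumption: gdb numbers threads contiguously from the highest one
    printed down to 1, so any gap means gdb stopped early.
    """
    seen = set()
    for raw in bt_text.splitlines():
        line = raw.lstrip("> ")  # tolerate mail quoting such as ">> "
        match = THREAD_RE.match(line)
        if match:
            seen.add(int(match.group(1)))
    if not seen:
        return []
    return [n for n in range(max(seen), 0, -1) if n not in seen]


# Abbreviated sample shaped like the quoted trace (arguments elided).
sample = """\
>> Thread 4 (process 68118    ):
>> # 0  0xfee1f340 in _lwp_wait () from /usr/lib/libc.so.1
>> Thread 3 (process 330262    ):
>> # 0  0x000a0460 in bdb_entry_get (...) at id2entry.c:386
>> Previous frame identical to this frame (corrupt stack?)
"""

print(missing_threads(sample))  # threads still needing a manual "bt"
```

For the log above this reports threads 2 and 1, which are the ones dumped 
individually below.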

Core 1:

(gdb) thread 2
[Switching to thread 2 (process 264726    )]#0  0xfee1d4dc in __open () 
from /usr/lib/libc.so.1
(gdb) bt
#0  0xfee1d4dc in __open () from /usr/lib/libc.so.1
#1  0xfee164a4 in _open () from /usr/lib/libc.so.1
#2  0xfedd1d94 in syslogd_ok () from /usr/lib/libc.so.1
#3  0xfedd1bc4 in vsyslog () from /usr/lib/libc.so.1
#4  0xfedd168c in syslog () from /usr/lib/libc.so.1
#5  0x00040414 in do_unbind (op=0x2178dc8, rs=0x5dbffd58) at unbind.c:53
#6  0x00026974 in connection_operation (ctx=0x0, arg_v=0x2178dc8) at 
connection.c:1061
#7  0xff31cd38 in ldap_int_thread_pool_wrapper (xpool=0x189018) at 
tpool.c:478
#8  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#9  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)


(gdb) thread 1
[Switching to thread 1 (process 199190    )]#0  0x000d8c08 in 
syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at syncprov.c:947
947     in syncprov.c
(gdb) bt
#0  0x000d8c08 in syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at 
syncprov.c:947
#1  0x00070b48 in overlay_op_walk (op=0x5e3ff540, rs=0x5e3ff4a8, 
which=32768, oi=0x1a8e30, on=0x1a8f20) at backover.c:480
#2  0x00070c24 in over_op_func (op=0x5e3ff540, rs=0x5e3ff4a8, 
which=op_abandon) at backover.c:540
#3  0x000409e4 in fe_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at 
abandon.c:115
#4  0x00025e78 in connection_abandon (c=0xb487b8) at connection.c:740
#5  0x00025f84 in connection_closing (c=0xb487b8, why=0xef878 "connection 
lost") at connection.c:775
#6  0x00026c70 in connection_read (s=17) at connection.c:1301
#7  0x00023f44 in slapd_daemon_task (ptr=0xeac00) at daemon.c:1879
#8  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#9  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)



Core 2:

(gdb) thread 2
[Switching to thread 2 (process 270208    )]#0  0xfee1e7ac in _putmsg () 
from /usr/lib/libc.so.1
(gdb) bt
#0  0xfee1e7ac in _putmsg () from /usr/lib/libc.so.1
#1  0xfedd1bb0 in vsyslog () from /usr/lib/libc.so.1
#2  0xfedd168c in syslog () from /usr/lib/libc.so.1
#3  0x00033f98 in slap_send_ldap_result (op=0xb5c8e0, rs=0x5dbffd58) at 
result.c:596
#4  0x00096b14 in bdb_add (op=0xb5c8e0, rs=0x5dbffd58) at add.c:432
#5  0x00070b30 in overlay_op_walk (op=0xb5c8e0, rs=0x5dbffd58, which=32768, 
oi=0x151d54, on=0x8000) at backover.c:488
#6  0x00070c24 in over_op_func (op=0xb5c8e0, rs=0x5dbffd58, which=op_add) 
at backover.c:540
#7  0x0002c720 in fe_op_add (op=0xb5c8e0, rs=0x5dbffd58) at add.c:346
#8  0x0002bfd8 in do_add (op=0xb5c8e0, rs=0x5dbffd58) at add.c:178
#9  0x00026974 in connection_operation (ctx=0x152000, arg_v=0xb5c8e0) at 
connection.c:1061
#10 0xff31cd38 in ldap_int_thread_pool_wrapper (xpool=0x189018) at 
tpool.c:478
#11 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#12 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)

[Switching to thread 1 (process 204672    )]#0  0x000d8c08 in 
syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at syncprov.c:947
947     syncprov.c: No such file or directory.
        in syncprov.c
(gdb) bt
#0  0x000d8c08 in syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at 
syncprov.c:947
#1  0x00070b48 in overlay_op_walk (op=0x5e3ff540, rs=0x5e3ff4a8, 
which=32768, oi=0x1a8e30, on=0x1a8f20) at backover.c:480
#2  0x00070c24 in over_op_func (op=0x5e3ff540, rs=0x5e3ff4a8, 
which=op_abandon) at backover.c:540
#3  0x000409e4 in fe_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at 
abandon.c:115
#4  0x00025e78 in connection_abandon (c=0xb486d8) at connection.c:740
#5  0x00025f84 in connection_closing (c=0xb486d8, why=0xef878 "connection 
lost") at connection.c:775
#6  0x00026c70 in connection_read (s=16) at connection.c:1301
#7  0x00023f44 in slapd_daemon_task (ptr=0xeac00) at daemon.c:1879
#8  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#9  0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)



The interesting part to me in the log files for this problem is the bit 
where it complains about the search base being changed on it...

My configuration for the uniqueness overlay has:

# Uniqueness Overlay
overlay unique
unique_base cn=people,dc=stanford,dc=edu
unique_attributes suunivid suproxycardnumber sucardnumber suuniqueidentifier

and I wonder if the unique_base bit is tweaking a value that syncprov uses 
for its base?  Or I could be totally off. :)
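
To make the guess concrete, here is a toy Python model of what I mean. This is 
purely an illustration of the hypothesis, not OpenLDAP code: the Op class, both 
functions, and the "dc=stanford,dc=edu" starting base are made up for the 
example, and I have not confirmed that the unique overlay actually behaves 
this way.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical stand-in for an operation that carries a search base."""
    base: str


def unique_check_clobbering(op, unique_base):
    # Hypothetical bug: reuse the caller's op for an internal search
    # and never put the original base back.
    op.base = unique_base


def unique_check_restoring(op, unique_base):
    # Same idea, but the original base is saved and restored, so code
    # that runs later on the same op still sees what it expects.
    saved = op.base
    op.base = unique_base
    try:
        pass  # the internal uniqueness search would run here
    finally:
        op.base = saved


op = Op(base="dc=stanford,dc=edu")  # assumed base, not from my config
unique_check_restoring(op, "cn=people,dc=stanford,dc=edu")
after_restoring = op.base

unique_check_clobbering(op, "cn=people,dc=stanford,dc=edu")
after_clobbering = op.base

print(after_restoring)
print(after_clobbering)
```

If something like the clobbering version were happening, anything later in the 
stack that compares the operation's base against the one it recorded would see 
it as changed.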

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>