[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#3989) syncprov core dumps when combined with uniqueness overlay



Full_Name: Quanah Gibson-Mount
Version: 2.3.6
OS: Solaris 8
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (171.66.182.82)


Today, syncprov crashed (core dumped) on me after a client process made a
modification to the master.


gdb says:

#0  0x000d8c08 in syncprov_op_abandon (op=0x5e3ff540, rs=0x5e3ff4a8) at
syncprov.c:947
947     syncprov.c: No such file or directory.
        in syncprov.c
(gdb) thr apply all bt

Thread 4 (process 68118    ):
#0  0xfee1f340 in _lwp_wait () from /usr/lib/libc.so.1
#1  0xfed5d7b8 in lwp_wait () from /usr/lib/lwp/libthread.so.1
#2  0xfed590a0 in _thrp_join () from /usr/lib/lwp/libthread.so.1
#3  0x000245dc in slapd_daemon () at daemon.c:2045
#4  0x00016a18 in main ()

Thread 3 (process 330262    ):
#0  0x000a0460 in bdb_entry_get (op=0xb5d3a8, ndn=0x5d3fdaf0, oc=0x1b18e0,
at=0x15ac00, rw=0, ent=0x5d3fd56c) at id2entry.c:386
#1  0x00032024 in be_entry_get_rw (op=0x0, ndn=0x5d3fdaf0, oc=0x1b18e0,
at=0x15ac00, rw=0, e=0x5d3fd56c) at backend.c:1194
#2  0x000320c4 in fe_acl_group (op=0xb5d3a8, target=0x5d3ff470,
gr_ndn=0x5d3fdaf0, op_ndn=0xb5d440, group_oc=0x1b18e0, group_at=0x1d6db8) at
backend.c:1239
#3  0x000325a0 in backend_group (op=0xb5d3a8, target=0x5d3ff470,
gr_ndn=0x5d3fdaf0, op_ndn=0xb5d440, group_oc=0x1b18e0, group_at=0x1d6db8) at
backend.c:1390
#4  0x0004497c in slap_acl_mask (a=0x1d58e8, mask=0x5d3fdfb4, op=0xb5d3a8,
e=0x5d3ff470, desc=0x1d85d0, val=0x1886008, nmatch=100, matches=0x5d3fdfb8,
    count=3, state=0x5d3fe9a8) at acl.c:1845
#5  0x00042c50 in access_allowed_mask (op=0xb5d3a8, e=0x5d3ff470, desc=0x225220,
val=0x1886008, access=ACL_WADD, state=0x5d3fe9a8, maskp=0x0) at acl.c:732
#6  0x00045ab8 in acl_check_modlist (op=0xb5d3a8, e=0x5d3ff470, mlist=0x1530b60)
at acl.c:2350
#7  0x00076ad8 in bdb_modify_internal (op=0xb5d3a8, tid=0x16174c0,
modlist=0x1530b60, e=0x5d3ff470, text=0x5d3ffd6c, textbuf=0x5d3ff4b0 "",
textlen=256)
    at modify.c:49
#8  0x0007790c in bdb_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:467
#9  0x00070b30 in overlay_op_walk (op=0xb5d3a8, rs=0x5d3ffd58, which=32768,
oi=0x151d54, on=0x8000) at backover.c:488
#10 0x00070c24 in over_op_func (op=0xb5d3a8, rs=0x5d3ffd58, which=op_modify) at
backover.c:540
#11 0x0003a540 in fe_op_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:417
#12 0x00039d48 in do_modify (op=0xb5d3a8, rs=0x5d3ffd58) at modify.c:200
#13 0x00026974 in connection_operation (ctx=0xf2800, arg_v=0xb5d3a8) at
connection.c:1061
#14 0xff31cd38 in ldap_int_thread_pool_wrapper (xpool=0x189018) at tpool.c:478
#15 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#16 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)
0x000d8c08      947     in syncprov.c
(gdb) quit


That was @ 9:35 a.m.   The log shows:

Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 215403 local4.debug]
conn=2260 op=0 BIND dn="" method=163
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 588225 local4.debug]
conn=2260 op=0 RESULT tag=97 err=14 text=
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 215403 local4.debug]
conn=2260 op=1 BIND dn="" method=163
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 588225 local4.debug]
conn=2260 op=1 RESULT tag=97 err=14 text=
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 215403 local4.debug]
conn=2260 op=2 BIND dn="" method=163
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 538062 local4.debug]
conn=2260 op=2 BIND authcid="service/slog-people@stanford.edu" authzid="service
/slog-people@stanford.edu"
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 588225 local4.debug]
conn=2260 op=2 RESULT tag=97 err=0 text=
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 690767 local4.debug]
conn=2260 op=2 BIND dn="cn=slog-people,cn=service,cn=applications,dc=stanford,d
c=edu" mech=GSSAPI ssf=56
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 469902 local4.debug]
conn=2260 op=3 SRCH base="cn=People,dc=Stanford,dc=edu" scope=1 deref=3 filter=
"(suRegID=41d9f01ef1d811d2b6d92436000baa77)"
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 167594 local4.debug]
conn=2260 op=3 SEARCH RESULT tag=101 err=0 nentries=1 text=
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 469902 local4.debug]
conn=2260 op=4 SRCH base="cn=People,dc=Stanford,dc=edu" scope=0 deref=3 filter=
"(objectClass=*)"
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 167594 local4.debug]
conn=2260 op=4 SEARCH RESULT tag=101 err=0 nentries=1 text=
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 249368 local4.debug]
conn=2260 op=5 MOD dn="suRegID=41d9f01ef1d811d2b6d92436000baa77,cn=People,dc=St
anford,dc=edu"
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 396994 local4.debug]
conn=2260 op=5 MOD attr=o suprimaryorganizationid sugwaffiliation1 telephonenum
ber sudisplayaffiliation sumailaddress suvisibaffiliation1
suvisibtelephonenumber sugwaffilphone1 suaffiliation suvisibmailaddress
suvisibaffilphone1 sucreateapi homepostaladdress description supermanentaddress
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 167594 local4.debug]
conn=4 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has
changed
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 167594 local4.debug]
conn=3 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has
changed
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 167594 local4.debug]
conn=2 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has
changed
Aug 30 09:35:45 ldap-test0.Stanford.EDU slapd[2582]: [ID 218904 local4.debug]
conn=4 op=4 UNBIND


which puzzles me, because I don't see how the search base was at all changed.

connections 4,3, and 2 are all syncrepl clients.

Then, after having the DB recover, and restarting slapd, it core dumps again the
next time a MODification is made.


Replica's bind:

Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 215403 local4.debug]
conn=3 op=0 BIND dn="cn=ldap-test3,cn=ldap,cn=operational,dc=stanford,dc=edu" m
ethod=163
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 588225 local4.debug]
conn=3 op=0 RESULT tag=97 err=14 text=
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 215403 local4.debug]
conn=3 op=1 BIND dn="cn=ldap-test3,cn=ldap,cn=operational,dc=stanford,dc=edu" m
ethod=163
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 588225 local4.debug]
conn=3 op=1 RESULT tag=97 err=14 text=
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 215403 local4.debug]
conn=3 op=2 BIND dn="cn=ldap-test3,cn=ldap,cn=operational,dc=stanford,dc=edu" m
ethod=163
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 538062 local4.debug]
conn=3 op=2 BIND authcid="ldap/ldap-test3.stanford.edu@stanford.edu" authzid="l
dap/ldap-test3.stanford.edu@stanford.edu"
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 588225 local4.debug]
conn=3 op=2 RESULT tag=97 err=0 text=
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 690767 local4.debug]
conn=3 op=2 BIND dn="cn=ldap-test3,cn=ldap,cn=operational,dc=stanford,dc=edu" m
ech=GSSAPI ssf=56
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 469902 local4.debug]
conn=3 op=3 SRCH base="dc=stanford,dc=edu" scope=2 deref=0 filter="(objectClass
=*)"
Aug 30 14:06:01 ldap-test0.Stanford.EDU slapd[8064]: [ID 744844 local4.debug]
conn=3 op=3 SRCH attr=* +


MODification is made:
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 538062 local4.debug]
conn=56 op=2 BIND authcid="service/slog-people@stanford.edu" authzid="service/s
log-people@stanford.edu"
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 588225 local4.debug]
conn=56 op=2 RESULT tag=97 err=0 text=
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 690767 local4.debug]
conn=56 op=2 BIND dn="cn=slog-people,cn=service,cn=applications,dc=stanford,dc=
edu" mech=GSSAPI ssf=56
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 469902 local4.debug]
conn=56 op=3 SRCH base="cn=People,dc=Stanford,dc=edu" scope=1 deref=3 filter="(
suRegID=05f63c29bcb54f2a8234b39c25ba0d3b)"
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 167594 local4.debug]
conn=56 op=3 SEARCH RESULT tag=101 err=0 nentries=0 text=
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 469902 local4.debug]
conn=56 op=4 SRCH base="cn=People,dc=Stanford,dc=edu" scope=0 deref=3 filter="(
objectClass=*)"
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 167594 local4.debug]
conn=56 op=4 SEARCH RESULT tag=101 err=0 nentries=1 text=
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 585459 local4.debug]
conn=56 op=5 ADD dn="suRegID=05f63c29bcb54f2a8234b39c25ba0d3b,cn=People,dc=Stan
ford,dc=edu"
Aug 30 14:52:11 ldap-test0.Stanford.EDU slapd[8064]: [ID 925615 local4.debug] <=
bdb_equality_candidates: (suUniqueIdentifier) index_param failed (18)


Aug 30 14:55:10 ldap-test0.Stanford.EDU slapd[8064]: [ID 167594 local4.debug]
conn=3 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has cha
nged
Aug 30 14:55:10 ldap-test0.Stanford.EDU slapd[8064]: [ID 167594 local4.debug]
conn=1 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has cha
nged
Aug 30 14:55:10 ldap-test0.Stanford.EDU slapd[8064]: [ID 167594 local4.debug]
conn=0 op=3 SEARCH RESULT tag=101 err=16640 nentries=0 text=search base has cha
nged
Aug 30 14:55:10 ldap-test0.Stanford.EDU slapd[8064]: [ID 588225 local4.debug]
conn=56 op=5 RESULT tag=103 err=0 text=



An odd thing that immediately jumps out to me here is the index_param error for
the uniqueness overlay attribute "suUniqueIdentifier".  No where in the man page
does it say there needs to be an index on unique attributes, although I
understand why it probably *should* be.

The core dump from the second run is:

(gdb) thr apply all bt

Thread 4 (process 73600    ):
#0  0xfee1f340 in _lwp_wait () from /usr/lib/libc.so.1
#1  0xfed5d7b8 in lwp_wait () from /usr/lib/lwp/libthread.so.1
#2  0xfed590a0 in _thrp_join () from /usr/lib/lwp/libthread.so.1
#3  0x000245dc in slapd_daemon () at daemon.c:2045
#4  0x00016a18 in main ()

Thread 3 (process 335744    ):
#0  0xfedd3744 in tzcpy () from /usr/lib/libc.so.1
#1  0xfedd34fc in _ltzset_u () from /usr/lib/libc.so.1
#2  0xfedd23ec in localtime_u () from /usr/lib/libc.so.1
#3  0xfedd25f4 in localtime_r () from /usr/lib/libc.so.1
#4  0xfedb78ac in ctime_r () from /usr/lib/libc.so.1
#5  0xfedd186c in vsyslog () from /usr/lib/libc.so.1
#6  0xfedd168c in syslog () from /usr/lib/libc.so.1
#7  0x00040414 in do_unbind (op=0x45911c8, rs=0x5d3ffd58) at unbind.c:53
#8  0x00026974 in connection_operation (ctx=0x0, arg_v=0x45911c8) at
connection.c:1061
#9  0xff31cd38 in ldap_int_thread_pool_wrapper (xpool=0x189018) at tpool.c:478
#10 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
#11 0xfed658c8 in _lwp_start () from /usr/lib/lwp/libthread.so.1
Previous frame identical to this frame (corrupt stack?)
0x000d8c08      947     in syncprov.c


My guess is that something about the uniqueness overlay is horribly interfering
with the syncprov application.

--Quanah