[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: null_callbacks after initial sync

Howard Chu wrote:
> Nick Geron wrote:
>> Howard Chu wrote:
>>> OK, this sounds like the background thread to propagate updates isn't
>>> getting scheduled when it should. That could be a bug in the syncprov
>>> overlay.
>> Should I file a report, and if so, what information is required from
>> this end?
> I've filed this as ITS#5405 and fixed it in CVS HEAD. It would be good
> if you could test the patch and report your results as a followup to
> the ITS.

I'll try to look into this soon.
>> Thanks for the info.  That would certainly point to a problem with my
>> build environment, or a new bug.
>>>> I'm seeing a number of aborts when
>>>> testing under high load.  The latest came from running scripted
>>>> ldapsearches and ldapmodifies which resulted in a mutex error (or so I
>>>> am told by one of our developers).
>>>> Specifically:
>>>> 1) adding about 100 attributes to an entry
>>>> 2) diffing the output of ldapsearch between the two nodes in loop
>>>> 3) once synced, grabbing the attributes, shoving them in a temp file
>>>> with delete instructions and using that with ldapmodify.
>>>> I complied with debugging on which results in an abort with
>>>> "connection.c: 676: connection_state_closing: Assertion
>>>> 'c_struct_state
>>>> == 0x02' failed" logged.
>>> Interesting. It would be useful to get a gdb stack trace from that
>>> situation.
>> Yesterday I was able to successfully reproduce this beahvior at least
>> three times.  This morning, I was able to reproduce it with the above
>> steps yet again.  From a gdb session, no backtrace was available,
>> however.  I then recompiled with debugging enabled and was unable to
>> reproduce the bug until I added '-d 7' to the run arguments.  It should
>> be noted that before recompiling, I was able to reproduce the behavior
>> with and without the command line debug argument.
>> Here's the stack trace from a gdb session with arguments, -h 'ldap:///
>> ldaps:///' -d 7
>> =>  acl_string_expand: expanded:
>> uid=[^,]+,ou=employees,ou=people,dc=example,dc=com
>> =>  regex_matches: string:        uid=syncrepl,ou=ldap,dc=example,dc=com
>> =>  regex_matches: rc: 1 no matches
>> slapd: connection.c:676: connection_state_closing: Assertion
>> `c->c_struct_state == 0x02' failed.
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 1124096320 (LWP 7301)]
>> 0x0000003918230055 in raise () from /lib64/libc.so.6
>> (gdb) bt
>> #0  0x0000003918230055 in raise () from /lib64/libc.so.6
>> #1  0x0000003918231af0 in abort () from /lib64/libc.so.6
>> #2  0x0000003918229756 in __assert_fail () from /lib64/libc.so.6
>> #3  0x000000000042d345 in connection_state_closing ()
>> #4  0x000000000043d43b in slap_freeself_cb ()
>> #5  0x000000000043ef81 in slap_send_search_entry ()
>> #6  0x00000000004c88c4 in syncprov_initialize ()
>> #7  0x00002aaaaaabc1c7 in ldap_int_thread_pool_wrapper
>> (xpool=0x1a0b06d0) at tpool.c:625
>> #8  0x00000039196062f7 in start_thread () from /lib64/libpthread.so.0
>> #9  0x00000039182ce85d in clone () from /lib64/libc.so.6
>> (gdb)
>> Envrionment = CentOS 5 updated to whatever RH thinks is current as of
>> last week.  Oracle db 4.5.20 and openldap 2.4.7 compiled by hand.
>> Please let me know if any further information would be helpful.
> Your stack trace is a bit odd, because I can't find anywhere in the
> source tree that uses the function "slap_freeself_cb()". Are you using
> any custom overlays? It appears that your stack trace is still missing
> a lot of details. You should compile with -g and without any
> optimization, and make sure you're testing with the unstripped binary.
The only overlays I'm using are those found in the openldap tarball.  I
found that function referenced in result.c

[root@server01 openldap-2.4.7]# grep -R slap_freeself_cb .
Binary file ./servers/slapd/.libs/slapd matches
./servers/slapd/proto-slap.h:LDAP_SLAPD_F (int) slap_freeself_cb
LDAP_P(( Operation *op, SlapReply *rs ));
Binary file ./servers/slapd/result.o matches
./servers/slapd/result.c:int slap_freeself_cb( Operation *op, SlapReply
*rs )
[root@server01 openldap-2.4.7]# strings /usr/local/libexec/slapd | grep
[root@server01 openldap-2.4.7]#

It does look like I had stripped binaries though.  I've removed
optimization and verified that the -g option is being given.  It took a
bit to track down where stripping was being set, but 'file' now tells me
the binaries are not stripped.

Here's the latest backtrace:

> regex_matches: string:        uid=syncrepl,ou=ldap,dc=corenap,dc=com
=> regex_matches: rc: 1 no matches
ldap_read: want=8, got=7
  0000:  30 05 02 01 03 42 00                               0....B.
ber_get_next: tag 0x30 len 5 contents:
ldap_read: want=8, got=0

ber_get_next on fd 21 failed errno=0 (Success)
connection_closing: readying conn=1 sd=21 for close
connection_close: deferring conn=1 sd=21
conn=1 op=2 do_unbind
connection_resched: attempting closing conn=1 sd=21
connection_close: conn=1 sd=21
slapd: connection.c:676: connection_state_closing: Assertion
`c->c_struct_state == 0x02' failed.

Program received signal SIGABRT, Aborted.
[Switching to Thread 1107310912 (LWP 17114)]
0x00000030ae430055 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00000030ae430055 in raise () from /lib64/libc.so.6
#1  0x00000030ae431af0 in abort () from /lib64/libc.so.6
#2  0x00000030ae429756 in __assert_fail () from /lib64/libc.so.6
#3  0x000000000043854a in connection_state_closing (c=0xa35ef60) at
#4  0x000000000044dc38 in send_ldap_ber (conn=0xa35ef60, ber=0x42002410)
at result.c:153
#5  0x000000000045173d in slap_send_search_entry (op=0x420029a0,
rs=0x420026c0) at result.c:1187
#6  0x000000000051072f in syncprov_sendresp (op=0x420029a0,
opc=0x42002780, so=0xa882b80, e=0x420027d8, mode=2) at syncprov.c:809
#7  0x0000000000510a8d in syncprov_qplay (op=0x420029a0, on=0xa094690,
so=0xa882b80) at syncprov.c:878
#8  0x0000000000510c54 in syncprov_qtask (ctx=0x42002dc0,
arg=0x2aaac4111500) at syncprov.c:923
#9  0x00002aaaaaabc605 in ldap_int_thread_pool_wrapper (xpool=0xa0466d0)
at tpool.c:625
#10 0x00000030aec062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x00000030ae4ce85d in clone () from /lib64/libc.so.6