[Date Prev][Date Next]
Re: null_callbacks after initial sync
Howard Chu wrote:
> Nick Geron wrote:
>> Howard Chu wrote:
>>> OK, this sounds like the background thread to propagate updates isn't
>>> getting scheduled when it should. That could be a bug in the syncprov
>> Should I file a report, and if so, what information is required from
>> this end?
> I've filed this as ITS#5405 and fixed it in CVS HEAD. It would be good
> if you could test the patch and report your results as a followup to
> the ITS.
I'll try to look into this soon.
>> Thanks for the info. That would certainly point to a problem with my
>> build environment, or a new bug.
>>>> I'm seeing a number of aborts when
>>>> testing under high load. The latest came from running scripted
>>>> ldapsearches and ldapmodifies which resulted in a mutex error (or so I
>>>> am told by one of our developers).
>>>> 1) adding about 100 attributes to an entry
>>>> 2) diffing the output of ldapsearch between the two nodes in loop
>>>> 3) once synced, grabbing the attributes, shoving them in a temp file
>>>> with delete instructions and using that with ldapmodify.
>>>> I complied with debugging on which results in an abort with
>>>> "connection.c: 676: connection_state_closing: Assertion
>>>> == 0x02' failed" logged.
>>> Interesting. It would be useful to get a gdb stack trace from that
>> Yesterday I was able to successfully reproduce this beahvior at least
>> three times. This morning, I was able to reproduce it with the above
>> steps yet again. From a gdb session, no backtrace was available,
>> however. I then recompiled with debugging enabled and was unable to
>> reproduce the bug until I added '-d 7' to the run arguments. It should
>> be noted that before recompiling, I was able to reproduce the behavior
>> with and without the command line debug argument.
>> Here's the stack trace from a gdb session with arguments, -h 'ldap:///
>> ldaps:///' -d 7
>> => acl_string_expand: expanded:
>> => regex_matches: string: uid=syncrepl,ou=ldap,dc=example,dc=com
>> => regex_matches: rc: 1 no matches
>> slapd: connection.c:676: connection_state_closing: Assertion
>> `c->c_struct_state == 0x02' failed.
>> Program received signal SIGABRT, Aborted.
>> [Switching to Thread 1124096320 (LWP 7301)]
>> 0x0000003918230055 in raise () from /lib64/libc.so.6
>> (gdb) bt
>> #0 0x0000003918230055 in raise () from /lib64/libc.so.6
>> #1 0x0000003918231af0 in abort () from /lib64/libc.so.6
>> #2 0x0000003918229756 in __assert_fail () from /lib64/libc.so.6
>> #3 0x000000000042d345 in connection_state_closing ()
>> #4 0x000000000043d43b in slap_freeself_cb ()
>> #5 0x000000000043ef81 in slap_send_search_entry ()
>> #6 0x00000000004c88c4 in syncprov_initialize ()
>> #7 0x00002aaaaaabc1c7 in ldap_int_thread_pool_wrapper
>> (xpool=0x1a0b06d0) at tpool.c:625
>> #8 0x00000039196062f7 in start_thread () from /lib64/libpthread.so.0
>> #9 0x00000039182ce85d in clone () from /lib64/libc.so.6
>> Envrionment = CentOS 5 updated to whatever RH thinks is current as of
>> last week. Oracle db 4.5.20 and openldap 2.4.7 compiled by hand.
>> Please let me know if any further information would be helpful.
> Your stack trace is a bit odd, because I can't find anywhere in the
> source tree that uses the function "slap_freeself_cb()". Are you using
> any custom overlays? It appears that your stack trace is still missing
> a lot of details. You should compile with -g and without any
> optimization, and make sure you're testing with the unstripped binary.
The only overlays I'm using are those found in the openldap tarball. I
found that function referenced in result.c
[root@server01 openldap-2.4.7]# grep -R slap_freeself_cb .
Binary file ./servers/slapd/.libs/slapd matches
./servers/slapd/proto-slap.h:LDAP_SLAPD_F (int) slap_freeself_cb
LDAP_P(( Operation *op, SlapReply *rs ));
Binary file ./servers/slapd/result.o matches
./servers/slapd/result.c:int slap_freeself_cb( Operation *op, SlapReply
[root@server01 openldap-2.4.7]# strings /usr/local/libexec/slapd | grep
It does look like I had stripped binaries though. I've removed
optimization and verified that the -g option is being given. It took a
bit to track down where stripping was being set, but 'file' now tells me
the binaries are not stripped.
Here's the latest backtrace:
> regex_matches: string: uid=syncrepl,ou=ldap,dc=corenap,dc=com
=> regex_matches: rc: 1 no matches
ldap_read: want=8, got=7
0000: 30 05 02 01 03 42 00 0....B.
ber_get_next: tag 0x30 len 5 contents:
ldap_read: want=8, got=0
ber_get_next on fd 21 failed errno=0 (Success)
connection_closing: readying conn=1 sd=21 for close
connection_close: deferring conn=1 sd=21
conn=1 op=2 do_unbind
connection_resched: attempting closing conn=1 sd=21
connection_close: conn=1 sd=21
slapd: connection.c:676: connection_state_closing: Assertion
`c->c_struct_state == 0x02' failed.
Program received signal SIGABRT, Aborted.
[Switching to Thread 1107310912 (LWP 17114)]
0x00000030ae430055 in raise () from /lib64/libc.so.6
#0 0x00000030ae430055 in raise () from /lib64/libc.so.6
#1 0x00000030ae431af0 in abort () from /lib64/libc.so.6
#2 0x00000030ae429756 in __assert_fail () from /lib64/libc.so.6
#3 0x000000000043854a in connection_state_closing (c=0xa35ef60) at
#4 0x000000000044dc38 in send_ldap_ber (conn=0xa35ef60, ber=0x42002410)
#5 0x000000000045173d in slap_send_search_entry (op=0x420029a0,
rs=0x420026c0) at result.c:1187
#6 0x000000000051072f in syncprov_sendresp (op=0x420029a0,
opc=0x42002780, so=0xa882b80, e=0x420027d8, mode=2) at syncprov.c:809
#7 0x0000000000510a8d in syncprov_qplay (op=0x420029a0, on=0xa094690,
so=0xa882b80) at syncprov.c:878
#8 0x0000000000510c54 in syncprov_qtask (ctx=0x42002dc0,
arg=0x2aaac4111500) at syncprov.c:923
#9 0x00002aaaaaabc605 in ldap_int_thread_pool_wrapper (xpool=0xa0466d0)
#10 0x00000030aec062f7 in start_thread () from /lib64/libpthread.so.0
#11 0x00000030ae4ce85d in clone () from /lib64/libc.so.6