RE: hang due to WAKE_LISTENER in slapd/daemon.c



> Version:         OPENLDAP_REL_ENG_2_0_ALPHA3
> OS:              Solaris 7
> Thread Model:    no threads
>
> Hi All,
>
> I discovered that under certain, admittedly bizarre, circumstances
> slapd can hang sending to the wake_sds socket pair (in the
> WAKE_LISTENER macro).  I think that the buffering for the pair is
> filling up and slapd gets permanently stuck in write.
> Here's the stack trace:
>
> #0  0xff218008 in _write () from /usr/lib/libc.so.1
> #1  0x20a84 in slapd_set_write (s=28, wake=1) at daemon.c:142
> #2  0x2bd8c in send_ldap_ber (conn=0xd57e0, ber=0x2d17e68) at result.c:202
> #3  0x2d050 in send_search_entry (be=0xc0068, conn=0xd57e0, op=0x2d07240,
>     e=0x2df2e00, attrs=0x2cd5870, attrsonly=0, ctrls=0x0) at result.c:707
> #4  0x3cb1c in ldbm_back_search (be=0xc0068, conn=0xd57e0, op=0x2d07240,
>     base=0x2df2e00 "", scope=1, deref=3, slimit=0, tlimit=0, filter=0x2f633b0,
>     filterstr=0x2f57ff8 "(objectclass=*)", attrs=0x2cd5870, attrsonly=0)
>     at search.c:284
> #5  0x251dc in do_search (conn=0xd57e0, op=0x2d07240) at search.c:209
> #6  0x23e04 in connection_operation (arg_v=0x2d14a40) at connection.c:745
> #7  0x52b10 in ldap_pvt_thread_create (thread=0x2d07248, detach=1,
>     start_routine=0x23d0c <connection_operation>, arg=0x2d14a40)
>     at thr_stub.c:48
> #8  0x248e8 in connection_op_activate (conn=0xd57e0, op=0x2d07240)
>     at connection.c:1075
> #9  0x246a4 in connection_input (conn=0xd57e0) at connection.c:979
> #10 0x24268 in connection_read (s=230) at connection.c:880
> #11 0x22424 in slapd_daemon_task (ptr=0x0) at daemon.c:849
> #12 0x52b10 in ldap_pvt_thread_create (thread=0xbb408, detach=0,
>     start_routine=0x216e0 <slapd_daemon_task>, arg=0x0) at thr_stub.c:48
> #13 0x22638 in slapd_daemon () at daemon.c:901
> #14 0x20428 in main (argc=769320, argv=0x99400) at main.c:433
>
> I haven't duplicated this exact problem but I did do some tests that
> indicate that even under moderate load multiple bytes can accumulate
> in the wake_sds pair.  This could be fixed easily by limiting the number
> of bytes that can be written to the pair using a counter and completely
> emptying the sock-pair on each select hit.  If there are no objections
> I'll submit a fix with an ITS entry.
>
> -Jeff Romine

I'm not sure I like the idea of limiting the number of bytes written,
but definitely agree that the sock-pair should be emptied on each
select hit. It amazes me that you managed to fill up the 4096-byte buffer
on the socketpair...
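
For what it's worth, here is a rough sketch of the "empty it on every
select hit" idea, not a patch against daemon.c. All names here
(wake_fds, wake_init, wake_listener, wake_drain) are made up for
illustration and are not slapd's identifiers. It also assumes both ends
of the pair are set non-blocking, which is my own substitute for the
proposed byte counter: when the buffer is full a wake-up is already
pending, so the writer can skip the write instead of blocking. In the
no-threads build the trace suggests the writer and the select loop run
in the same thread, so a blocking write there can never be drained.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the wake_sds pair: [0] is read, [1] is written. */
static int wake_fds[2] = { -1, -1 };

/* Switch a descriptor to non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Create the pair and make both ends non-blocking. */
int wake_init(void)
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, wake_fds) == -1)
        return -1;
    if (set_nonblocking(wake_fds[0]) == -1 || set_nonblocking(wake_fds[1]) == -1)
        return -1;
    return 0;
}

/*
 * Queue one wake-up byte.
 * Returns 1 if the byte was written, 0 if the buffer is full (a wake-up
 * is already pending, so nothing is lost), -1 on any other error.
 */
int wake_listener(void)
{
    assert(wake_fds[1] >= 0);
    for (;;) {
        ssize_t n = write(wake_fds[1], "0", 1);
        if (n == 1)
            return 1;
        if (n == -1 && errno == EINTR)
            continue;               /* interrupted, try again */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;               /* full: never block here */
        return -1;
    }
}

/*
 * Call when select() reports the read end readable: discard every
 * pending byte, not just one. Returns the number of bytes discarded,
 * or -1 on error.
 */
long wake_drain(void)
{
    char buf[256];
    long total = 0;

    assert(wake_fds[0] >= 0);
    for (;;) {
        ssize_t n = read(wake_fds[0], buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;           /* other end closed */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;           /* pair is now empty */
        return -1;
    }
}
```

The daemon loop would call wake_init() once at startup, add wake_fds[0]
to the read set, and call wake_drain() whenever select() flags it;
anything that needs to poke the listener calls wake_listener(). No
counter is needed, since a full buffer simply means "already woken".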