
Re: Bugfix: wait4msg() hangs when using SSL/TLS (ITS#446)

Forwarded to devel for discussion....

At 02:35 AM 3/20/00 +0000, Andrew Hacking wrote:
>I guess this means the socket is in 'blocking' mode, which would always cause a
>hang when reading a byte that is not yet available (as I think you have shown).

Yes, but why did it attempt to read a PDU from the slave?  It appears to me
that your change caused the library to read from the wrong connection.

>This is quite serious, since even if a select() is performed to see if _some_ data
>is available BEFORE calling try_read1msg(), the stream_read() will hang if not
>enough bytes have been queued on the socket when attempting to parse the message.


>I always thought/assumed (I know, assumption is never a good thing) that sockbuf's
>used non-blocking sockets for this very reason... to avoid hanging indefinitely
>when performing a read/recv.

No.  By default the library uses blocking sockets.

>This means that performing a read/recv on a
>non-blocking socket when data is unavailable returns failure with errno set to
>EWOULDBLOCK/EAGAIN, then select() would be called to wait for the arrival of more
>data, the timeout checked, and the read/recv tried again.

Only if the caller marks the sockets non-blocking.
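
For illustration, a minimal sketch of what "marking the socket non-blocking"
means at the POSIX level. The helper names set_nonblocking() and
try_read_byte() are hypothetical and are not part of libldap or liblber; the
only real calls are fcntl() and read().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode.
 * Returns 0 on success, -1 on failure (errno set by fcntl). */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to read a single byte.
 * Returns 1 if a byte was read, 0 if the read would have blocked
 * (EAGAIN/EWOULDBLOCK), and -1 on end-of-file or any other error. */
int try_read_byte(int fd, unsigned char *out)
{
    ssize_t n;

    assert(out != NULL);
    n = read(fd, out, 1);
    if (n == 1)
        return 1;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

On a descriptor left in the default blocking mode, the same read() call simply
does not return until a byte arrives, which is the hang shown in the backtrace.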

>So the question is, why are blocking sockets being used, or has it always been this
>way and no-one noticed ?

Because the UNIX socket model is, by default, blocking.

>I am a little concerned because libldap and liblber appear to be written around the
>notion of non-blocking sockets.

libldap and liblber support both blocking and non-blocking sockets.

>If blocking sockets are used, the timeout that the caller provides to libldap will
>never be honoured when say, the server takes some time to respond, or worse it
>crashes, it will even hang when network congestion/disruption occurs.

The timeout applies only to the select, not to the read.  Once a PDU read is
initiated, it must be completed (or the stream closed).  The API doesn't
support restarting of PDU reads.
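
A rough sketch of that split, using plain POSIX calls rather than the library's
own code. The names wait_readable() and read_exact() are hypothetical. The
timeout bounds only the select(); the read that follows on a blocking
descriptor has no time limit of its own.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    return rc > 0 ? 1 : rc;
}

/* Hypothetical helper: read exactly len bytes from a blocking fd.
 * No timeout applies here: once started, the call returns only when
 * all bytes have arrived, the peer closes the stream, or an error occurs.
 * Returns the number of bytes read (short only at end-of-file), or -1. */
ssize_t read_exact(int fd, void *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)
            break;              /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

If select() reports the descriptor readable after only part of a PDU has
arrived, read_exact() will still block until the remainder shows up.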

>It is under
>all these conditions that the timeout ensures libldap only blocks for a definite
>amount of time.

>Kurt@OpenLDAP.org wrote:
>> I've experimented with this patch a bit (without TLS) and found
>> that it causes test009-referral to hang.  ldapsearch has data
>> pending from the master but is blocked reading from the slave.
>> (gdb) where
>> #0  0x18183f44 in read () from /usr/lib/libc.so.3
>> #1  0x805a3d5 in stream_read (sb=0x806a000, buf=0x806718c, len=1)
>>     at ../../../libraries/liblber/sockbuf.c:911
>> #2  0x8059df3 in ber_pvt_sb_read (sb=0x806a000, buf_arg=0x806718c, len=1)
>>     at ../../../libraries/liblber/sockbuf.c:504
>> #3  0x8058b6b in ber_get_next (sb=0x806a000, len=0xbfbfcb70, ber=0x8067180)
>>     at ../../../libraries/liblber/io.c:525
>> #4  0x804ca47 in try_read1msg (ld=0x806a000, msgid=-1, all=1, sb=0x806a000,
>>     lc=0x8061100, result=0xbfbfcc34)
>>     at ../../../libraries/libldap/result.c:296
>> #5  0x804c890 in wait4msg (ld=0x806a000, msgid=-1, all=1, timeout=0x0,
>>     result=0xbfbfcc34) at ../../../libraries/libldap/result.c:208
>> #6  0x804c714 in ldap_result (ld=0x806a000, msgid=-1, all=1, timeout=0x0,
>>     result=0xbfbfcc34) at ../../../libraries/libldap/result.c:137
>> #7  0x804aef7 in dosearch (ld=0x806a000,
>>     base=0x8061040 "o=University of Michigan, c=US", scope=2, attrs=0x0,
>>     attrsonly=0, filtpatt=0x0, value=0x8062050 "sn=jensen")
>>     at ../../../clients/tools/ldapsearch.c:540
>> #8  0x804acfb in main (argc=11, argv=0xbfbfd4e8)
>>     at ../../../clients/tools/ldapsearch.c:471
>> #9  0x804a095 in _start ()
>> At 05:42 AM 2/8/00 GMT, andrew.hacking@cai.com wrote:
>> >Full_Name: Andrew Hacking
>> >Version: -devel
>> >OS: Linux
>> >URL: ftp://ftp.openldap.org/incoming/andrew-hacking-000208-2.patch
>> >Submission from: (NULL) (
>> >
>> >
>> >Sorry for the re-post, my email address was changed today...
>> >
>> >libldap wait4msg() hangs under various conditions when using SSL/TLS.
>> >
>> >The specifics are that wait4msg blocks in do_ldap_select()
>> >while waiting for data that has _already_ been buffered by the
>> >OpenSSL library. do_ldap_select() should never be called until
>> >the SSL/TLS buffers have been exhausted.
>> >
>> >The patch to wait4msg() changes the way messages are read to ensure the
>> >SSL/TLS buffers really are exhausted before calling do_ldap_select()
>> >
>> >try_read1msg() is also changed to return 'exhausted' (-3) on an
>> >EWOULDBLOCK/EAGAIN condition.  This differentiates EWOULDBLOCK/EAGAIN
>> >(i.e. the underlying sockbuf is now exhausted) from the 'keep looking' (-2)
>> >condition.
>> >
>> >
>> >
>> >
>> >
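
For readers following the report above, the control flow it describes can be
sketched roughly as below. This is not the patch itself. The -2 and -3 values
are the ones quoted in the report; every name here (wait_for_message, the
callback types, the scripted_* stand-ins) is hypothetical, and the sketch
assumes a reader that returns 'exhausted' instead of blocking when nothing is
buffered.

```c
#include <assert.h>
#include <stddef.h>

enum {
    READ_KEEP_LOOKING = -2,   /* value quoted in the report */
    READ_EXHAUSTED    = -3    /* value quoted in the report */
};

/* Hypothetical callbacks standing in for the message reader and the
 * select wrapper. read_one returns a positive message type, one of the
 * codes above, or -1 on error. wait returns >0 when readable, 0 on
 * timeout, <0 on error. */
typedef int (*read_one_fn)(void *ctx);
typedef int (*wait_fn)(void *ctx);

/* Drain everything already buffered before blocking in the wait call.
 * Returns a positive message type, 0 on timeout, or -1 on error. */
int wait_for_message(void *ctx, read_one_fn read_one, wait_fn wait_readable)
{
    assert(read_one != NULL && wait_readable != NULL);
    for (;;) {
        int rc = read_one(ctx);
        if (rc == READ_KEEP_LOOKING)
            continue;           /* more may be buffered: do not select yet */
        if (rc != READ_EXHAUSTED)
            return rc;          /* wanted message, or a hard error */

        /* Buffers are empty: only now is it safe to block. */
        rc = wait_readable(ctx);
        if (rc <= 0)
            return rc == 0 ? 0 : -1;
    }
}

/* Scripted stand-ins used only to exercise the loop. */
struct script { const int *codes; size_t pos; int waits; };

int scripted_read(void *ctx)
{
    struct script *s = ctx;
    return s->codes[s->pos++];
}

int scripted_wait(void *ctx)
{
    struct script *s = ctx;
    s->waits++;
    return 1;
}
```

With OpenSSL, the "is anything still buffered" question is the one
SSL_pending() answers; the sketch leaves that detail behind the read_one
callback.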