Re: Bugfix: wait4msg() hangs when using SSL/TLS (ITS#446)


Thanks for the feedback.

I guess this means the socket is in 'blocking' mode, which would always cause a
hang when reading a byte that is not yet available (as I think you have shown).
This is quite serious, since even if a select() is performed to see if _some_ data
is available BEFORE calling try_read1msg(), the stream_read() will hang if not
enough bytes have been queued on the socket when attempting to parse the message.
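
One way to confirm that guess would be to ask the kernel directly. This is only a
sketch: fd_is_blocking() is a hypothetical helper of my own, not something that
exists in liblber, and it assumes a POSIX system where the descriptor behind the
sockbuf can be handed to fcntl().

```c
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical diagnostic helper (not part of liblber).
 * Returns 1 if fd is in blocking mode, 0 if O_NONBLOCK is set,
 * and -1 if the flags cannot be read.
 */
int fd_is_blocking(int fd)
{
    int flags;

    assert(fd >= 0);                 /* caller must pass an open descriptor */

    flags = fcntl(fd, F_GETFL, 0);   /* fetch the file status flags */
    if (flags == -1)
        return -1;

    return (flags & O_NONBLOCK) ? 0 : 1;
}
```

Calling something like this from gdb on the descriptor in frame #1 should show
whether the read() is hanging because the socket really is blocking.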

I always thought/assumed (I know, assumption is never a good thing) that sockbufs
used non-blocking sockets for this very reason: to avoid hanging indefinitely
when performing a read/recv.  With a non-blocking socket, a read/recv issued when
no data is available fails with errno set to EWOULDBLOCK/EAGAIN; select() is then
called to wait for more data to arrive, the timeout is checked, and the read/recv
is tried again.
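
To make the read/select/retry cycle concrete, here is roughly the loop I have in
mind. It is a sketch under assumptions, not the code in sockbuf.c: the names
set_nonblocking(), now_usec() and read_full_timeout() are hypothetical, the
descriptor is assumed to be a plain POSIX fd, and the choice of ETIMEDOUT and
ECONNRESET as error codes is mine.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Current wall-clock time in microseconds. */
long long now_usec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000 + tv.tv_usec;
}

/*
 * Hypothetical helper: read exactly len bytes from a non-blocking fd,
 * waiting no longer than *timeout in total.
 * Returns 0 on success, -1 on failure with errno set
 * (ETIMEDOUT if the deadline passed, ECONNRESET if the peer closed).
 */
int read_full_timeout(int fd, void *buf, size_t len,
                      const struct timeval *timeout)
{
    char *p = buf;
    size_t got = 0;
    long long deadline;

    assert(buf != NULL && timeout != NULL);
    deadline = now_usec()
             + (long long)timeout->tv_sec * 1000000 + timeout->tv_usec;

    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n > 0) {                      /* made progress, keep going */
            got += (size_t)n;
            continue;
        }
        if (n == 0) {                     /* end of file: peer went away */
            errno = ECONNRESET;
            return -1;
        }
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                    /* a real read error */

        /* No data yet: wait for readability, but only until the deadline. */
        long long left = deadline - now_usec();
        if (left <= 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        struct timeval tv = { .tv_sec = left / 1000000,
                              .tv_usec = left % 1000000 };
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc == 0) {                    /* select() itself timed out */
            errno = ETIMEDOUT;
            return -1;
        }
        if (rc == -1 && errno != EINTR)
            return -1;
        /* Readable (or interrupted): loop and try the read again. */
    }
    return 0;
}
```

The point of the sketch is that the read() itself never sleeps; all waiting
happens in select(), so the caller's timeout bounds the whole operation even
when only part of a message has arrived.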

So the question is, why are blocking sockets being used, or has it always been this
way and no one noticed?
I am a little concerned because libldap and liblber appear to be written around the
notion of non-blocking sockets.

If blocking sockets are used, the timeout that the caller provides to libldap will
never be honoured when, say, the server takes some time to respond or, worse,
crashes.  The call will also hang when network congestion/disruption occurs.  It is
under exactly these conditions that the timeout is supposed to ensure libldap only
blocks for a bounded amount of time.


Kurt@OpenLDAP.org wrote:

> I've experimented with this patch a bit (without TLS) and found
> that it causes test009-referral to hang.  ldapsearch has data
> pending from the master but is blocked reading from the slave.
> (gdb) where
> #0  0x18183f44 in read () from /usr/lib/libc.so.3
> #1  0x805a3d5 in stream_read (sb=0x806a000, buf=0x806718c, len=1)
>     at ../../../libraries/liblber/sockbuf.c:911
> #2  0x8059df3 in ber_pvt_sb_read (sb=0x806a000, buf_arg=0x806718c, len=1)
>     at ../../../libraries/liblber/sockbuf.c:504
> #3  0x8058b6b in ber_get_next (sb=0x806a000, len=0xbfbfcb70, ber=0x8067180)
>     at ../../../libraries/liblber/io.c:525
> #4  0x804ca47 in try_read1msg (ld=0x806a000, msgid=-1, all=1, sb=0x806a000,
>     lc=0x8061100, result=0xbfbfcc34)
>     at ../../../libraries/libldap/result.c:296
> #5  0x804c890 in wait4msg (ld=0x806a000, msgid=-1, all=1, timeout=0x0,
>     result=0xbfbfcc34) at ../../../libraries/libldap/result.c:208
> #6  0x804c714 in ldap_result (ld=0x806a000, msgid=-1, all=1, timeout=0x0,
>     result=0xbfbfcc34) at ../../../libraries/libldap/result.c:137
> #7  0x804aef7 in dosearch (ld=0x806a000,
>     base=0x8061040 "o=University of Michigan, c=US", scope=2, attrs=0x0,
>     attrsonly=0, filtpatt=0x0, value=0x8062050 "sn=jensen")
>     at ../../../clients/tools/ldapsearch.c:540
> #8  0x804acfb in main (argc=11, argv=0xbfbfd4e8)
>     at ../../../clients/tools/ldapsearch.c:471
> #9  0x804a095 in _start ()