Issue 6673 - ldap_unbind() hangs on unreachable LDAP server when using TLS
Summary: ldap_unbind() hangs on unreachable LDAP server when using TLS
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.23
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2010-10-12 21:18 UTC by arthur@arthurdejong.org
Modified: 2014-08-01 21:04 UTC

Description arthur@arthurdejong.org 2010-10-12 21:18:27 UTC
Full_Name: Arthur de Jong
Version: 2.4.23
OS: Debian/unstable
URL: 
Submission from: (NULL) (2001:888:1613:0:218:8bff:fe55:2c9f)


When the network connection to the LDAP server fails (severed with iptables in
my set-up), the error is detected properly by ldap_result() (the
LDAP_OPT_TIMELIMIT, LDAP_OPT_TIMEOUT and LDAP_OPT_NETWORK_TIMEOUT options are
set). However, closing the connection with ldap_unbind() then hangs indefinitely
in a read() call (determined with strace) when using an ldaps:// connection.

If the connection is opened without TLS, ldap_unbind() only writes some data to
the connection and then closes it, but with TLS it expects a response back.
Since a blocking read() is used, this hangs.

This is a backtrace of the thread that was hanging:

#0  0xb7fe2424 in __kernel_vsyscall ()
#1  0xb7f5809b in read () at ../sysdeps/unix/syscall-template.S:82
#2  0xb7e021bb in sb_stream_read (sbiod=0x8064630, buf=0x8083038, len=5)
    at /build/buildd-openldap_2.4.23-6-i386-fyqRGs/openldap-2.4.23/libraries/liblber/sockbuf.c:493
#3  0xb7e0132a in sb_debug_read (sbiod=0x807e250, buf=0x8083038, len=5)
    at /build/buildd-openldap_2.4.23-6-i386-fyqRGs/openldap-2.4.23/libraries/liblber/sockbuf.c:827
#4  0xb7fa0d27 in tlsg_recv (ptr=0x806fb50, buf=0x8083038, len=4294966784)
    at tls_g.c:904
#5  0xb7c65aa2 in ?? () from /usr/lib/libgnutls.so.26
#6  0xb7c65f77 in ?? () from /usr/lib/libgnutls.so.26
#7  0xb7c611e7 in _gnutls_recv_int () from /usr/lib/libgnutls.so.26
#8  0xb7c626b8 in gnutls_bye () from /usr/lib/libgnutls.so.26
#9  0xb7fa0dc4 in tlsg_sb_close (sbiod=0x807e268) at tls_g.c:970
#10 0xb7e01831 in ber_int_sb_close (sb=0x80645a8)
    at /build/buildd-openldap_2.4.23-6-i386-fyqRGs/openldap-2.4.23/libraries/liblber/sockbuf.c:383
#11 0xb7f8d1a0 in ldap_free_connection (ld=0x8066418, lc=0x8065b60, force=1, unbind=1)
    at request.c:768
#12 0xb7f83060 in ldap_ld_free (ld=0x8066418, close=1, sctrls=0x0, cctrls=0x0)
    at unbind.c:93
#13 0xb7f83326 in ldap_unbind_ext (ld=0x8066418, sctrls=0x0, cctrls=0x0)
    at unbind.c:52
#14 0xb7f8343f in ldap_unbind (ld=0x8066418) at unbind.c:69
#15 0x0804cfc0 in ?? ()
#16 0x0804e750 in ?? ()
#17 0x08057a9f in ?? ()
#18 0x0804bb8d in ?? ()
#19 0xb7f50955 in start_thread (arg=0xb7b92b70) at pthread_create.c:300
#20 0xb7ed0e7e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130


More background information is available at [0].

As a workaround [1] I've used LDAP_OPT_DESC to get the file descriptor for the
connection and called setsockopt() with SO_RCVTIMEO and SO_SNDTIMEO on it.
This fixes the problem for me at least, but I'm not 100% sure it has no unwanted
side-effects.

[0] http://bugs.debian.org/596983
[1] http://arthurdejong.org/viewvc/nss-pam-ldapd?view=rev&revision=1264
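
The workaround described above can be sketched roughly as follows. This is a minimal illustration and not the code from [1]: the helper name set_socket_timeouts() and its millisecond parameter are assumptions made for the example. It only covers the setsockopt() part; in a libldap client the descriptor would first be fetched with ldap_get_option(ld, LDAP_OPT_DESC, &fd).

```c
#include <sys/socket.h>
#include <sys/time.h>

/*
 * Apply the same timeout (in milliseconds) to reads and writes on fd.
 * The helper name and the millisecond parameter are illustrative only.
 * In an LDAP client, fd would come from
 *     ldap_get_option(ld, LDAP_OPT_DESC, &fd);
 * Returns 0 on success, -1 if either setsockopt() call fails.
 */
static int set_socket_timeouts(int fd, long timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Bound blocking receives (read/recv) on this socket. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0)
        return -1;
    /* Bound blocking sends (write/send) on this socket. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) != 0)
        return -1;
    return 0;
}
```

With these options set, a blocking read on the socket fails with EAGAIN/EWOULDBLOCK once the timeout expires instead of waiting forever, so the read() inside ldap_unbind() should return.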
Comment 1 Howard Chu 2010-10-13 08:05:30 UTC
arthur@arthurdejong.org wrote:
> When the network connection to the LDAP server fails (severed with iptables in
> my set-up), the error is detected properly by ldap_result() (the
> LDAP_OPT_TIMELIMIT, LDAP_OPT_TIMEOUT and LDAP_OPT_NETWORK_TIMEOUT options are
> set). However, closing the connection with ldap_unbind() then hangs indefinitely
> in a read() call (determined with strace) when using an ldaps:// connection.
>
> If the connection is opened without TLS, ldap_unbind() only writes some data to
> the connection and then closes it, but with TLS it expects a response back.
> Since a blocking read() is used, this hangs.

Looks like this is a GnuTLS issue. Have you duplicated this with OpenSSL?


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 arthur@arthurdejong.org 2010-10-13 20:43:29 UTC
On Wed, 2010-10-13 at 01:05 -0700, Howard Chu wrote:
> arthur@arthurdejong.org wrote:
> > If the connection is opened without TLS, ldap_unbind() only writes some data to
> > the connection and then closes it, but with TLS it expects a response back.
> > Since a blocking read() is used, this hangs.
> 
> Looks like this is a GnuTLS issue. Have you duplicated this with OpenSSL?

I can confirm that this only happens if libldap is linked with GnuTLS
and not when it is linked against OpenSSL.

-- 
-- arthur - arthur@arthurdejong.org - http://arthurdejong.org --
Comment 3 Howard Chu 2010-10-13 21:17:22 UTC
Arthur de Jong wrote:
> On Wed, 2010-10-13 at 01:05 -0700, Howard Chu wrote:
>> arthur@arthurdejong.org wrote:
>>> If the connection is opened without TLS, ldap_unbind() only writes some data to
>>> the connection and then closes it, but with TLS it expects a response back.
>>> Since a blocking read() is used, this hangs.
>>
>> Looks like this is a GnuTLS issue. Have you duplicated this with OpenSSL?
>
> I can confirm that this only happens if libldap is linked with GnuTLS
> and not when it is linked against OpenSSL.

It seems you can work around this by changing tls_g.c's invocation of
gnutls_bye() to use GNUTLS_SHUT_WR instead of GNUTLS_SHUT_RDWR. However, that
strikes me as fundamentally wrong, since libldap is clearly closing both
directions when it gets here. I think the bug is in gnutls_bye(): it shouldn't
be waiting indefinitely when it tries to read the peer's Close alert. I'm not
sure it should even be trying to read that at all; some peers may never send it.
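
The difference between the two shutdown modes can be modeled on a plain socket. The sketch below is not GnuTLS or libldap code and involves no real TLS: model_bye(), CLOSE_WR and CLOSE_RDWR are hypothetical stand-ins for gnutls_bye(), GNUTLS_SHUT_WR and GNUTLS_SHUT_RDWR, and the bounded wait via poll() is an assumption about what a non-hanging close could look like.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-ins for GNUTLS_SHUT_WR / GNUTLS_SHUT_RDWR. */
enum close_mode { CLOSE_WR, CLOSE_RDWR };

/*
 * Model of a TLS-style shutdown on a plain socket (no real TLS involved).
 * Sends a one-byte "close alert". In CLOSE_RDWR mode it then waits for the
 * peer's alert, but only for timeout_ms milliseconds.
 * Returns 0 on a clean close, 1 if the peer stayed silent, -1 on error.
 */
static int model_bye(int fd, enum close_mode how, int timeout_ms)
{
    const char alert = 0;
    char reply;
    struct pollfd pfd;

    if (write(fd, &alert, 1) != 1)
        return -1;
    if (how == CLOSE_WR)
        return 0;               /* one-way close: do not wait for the peer */

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    switch (poll(&pfd, 1, timeout_ms)) {
    case -1:
        return -1;              /* poll() itself failed */
    case 0:
        return 1;               /* timed out: peer never answered */
    default:
        return read(fd, &reply, 1) == 1 ? 0 : -1;
    }
}
```

In this model the one-way mode never reads, which matches the effect of switching to GNUTLS_SHUT_WR, while the two-way mode gives up after a bounded wait rather than blocking in read() on an unreachable peer.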

Note that because you're breaking the connection without warning, TCP doesn't 
know that the connection is gone, so there will be no error detected when 
gnutls attempts to send its own Close alert. In this case, it will probably 
block for 2*MSL before getting any further.

Comment 4 arthur@arthurdejong.org 2010-10-14 18:35:23 UTC
On Wed, 2010-10-13 at 14:17 -0700, Howard Chu wrote:
> It seems you can work around this by changing tls_g.c's invocation of
> gnutls_bye() to use GNUTLS_SHUT_WR instead of GNUTLS_SHUT_RDWR. However, that
> strikes me as fundamentally wrong, since libldap is clearly closing both
> directions when it gets here. I think the bug is in gnutls_bye(): it shouldn't
> be waiting indefinitely when it tries to read the peer's Close alert. I'm not
> sure it should even be trying to read that at all; some peers may never send it.

I can't comment on the GnuTLS API because I haven't used it before. Can
you file a bug report with GnuTLS? Do you need any more input from my
end?

> Note that because you're breaking the connection without warning, TCP doesn't 
> know that the connection is gone, so there will be no error detected when 
> gnutls attempts to send its own Close alert. In this case, it will probably 
> block for 2*MSL before getting any further.

In my tests I haven't waited that long (I think). Do you know if there
are any problems with using setsockopt(SO_RCVTIMEO) and
setsockopt(SO_SNDTIMEO) on the socket?

-- 
-- arthur - arthur@arthurdejong.org - http://arthurdejong.org --
Comment 5 Howard Chu 2010-10-14 21:42:09 UTC
Arthur de Jong wrote:
> On Wed, 2010-10-13 at 14:17 -0700, Howard Chu wrote:
>> It seems you can work around this by changing tls_g.c's invocation of
>> gnutls_bye() to use GNUTLS_SHUT_WR instead of GNUTLS_SHUT_RDWR. However, that
>> strikes me as fundamentally wrong, since libldap is clearly closing both
>> directions when it gets here. I think the bug is in gnutls_bye(): it shouldn't
>> be waiting indefinitely when it tries to read the peer's Close alert. I'm not
>> sure it should even be trying to read that at all; some peers may never send it.
>
> I can't comment on the GnuTLS API because I haven't used it before. Can
> you file a bug report with GnuTLS? Do you need any more input from my
> end?

What is the GnuTLS version number? Seems like you should be submitting this to 
their tracker anyway, as the original reporter.

https://savannah.gnu.org/support/?group=gnutls

>> Note that because you're breaking the connection without warning, TCP doesn't
>> know that the connection is gone, so there will be no error detected when
>> gnutls attempts to send its own Close alert. In this case, it will probably
>> block for 2*MSL before getting any further.
>
> In my tests I haven't waited that long (I think). Do you know if there
> are any problems with using setsockopt(SO_RCVTIMEO) and
> setsockopt(SO_SNDTIMEO) on the socket?
>
I have no idea whether GnuTLS handles asynchronous I/O properly or not. If you're
only setting these options immediately before closing, I don't see why it 
should cause any problems. But GnuTLS has enough flaws in general that I 
wouldn't claim it ever does anything right or as expected.

Comment 6 arthur@arthurdejong.org 2010-10-15 07:51:13 UTC
On Thu, 2010-10-14 at 14:42 -0700, Howard Chu wrote:
> What is the GnuTLS version number? Seems like you should be submitting
> this to their tracker anyway, as the original reporter.

I'm using GnuTLS 2.8.6. I've just also tried with GnuTLS 2.10.2 which
shows the same behaviour.

I've reported it at
  https://savannah.gnu.org/support/index.php?107495
Feel free to add any more pointers that may be useful.

Anyway, thanks for your help.

-- 
-- arthur - arthur@arthurdejong.org - http://arthurdejong.org --
Comment 7 Howard Chu 2010-10-16 03:17:48 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 8 Howard Chu 2010-10-16 10:11:59 UTC
Arthur de Jong wrote:
> On Thu, 2010-10-14 at 14:42 -0700, Howard Chu wrote:
>> What is the GnuTLS version number? Seems like you should be submitting
>> this to their tracker anyway, as the original reporter.
>
> I'm using GnuTLS 2.8.6. I've just also tried with GnuTLS 2.10.2 which
> shows the same behaviour.
>
> I've reported it at
>    https://savannah.gnu.org/support/index.php?107495
> Feel free to add any more pointers that may be useful.
>
> Anyway, thanks for your help.
>
Based on the discussion there, I've committed the workaround I mentioned
earlier: tls_g.c now calls gnutls_bye() with GNUTLS_SHUT_WR instead of
GNUTLS_SHUT_RDWR.

Comment 9 Quanah Gibson-Mount 2010-12-11 18:12:22 UTC
changed notes
changed state Test to Release
Comment 10 Quanah Gibson-Mount 2011-02-14 12:31:09 UTC
changed notes
changed state Release to Closed
Comment 11 OpenLDAP project 2014-08-01 21:04:31 UTC
workaround in HEAD
fixed in RE24