8963 – Start TLS failed errors on back-ldap binds / short timeout, config option request

Status:	VERIFIED FIXED

Alias:	None

Product:	OpenLDAP
Classification:	Unclassified
Component:	slapd
Version:	2.4.44
Hardware:	All All

Importance:	--- normal
Target Milestone:	---
Assignee:	OpenLDAP project

Reported:	2019-01-29 10:23 UTC by janne.peltonen@helsinki.fi
Modified:	2019-07-24 19:02 UTC
CC List:	0 users

Description janne.peltonen@helsinki.fi 2019-01-29 10:23:04 UTC

Full_Name: Janne Peltonen
Version: 2.4.44
OS: Red Hat Enterprise Linux 7
URL: https://www.openldap.org/lists/openldap-technical/201901/msg00038.html
Submission from: (NULL) (2001:708:110:1820:3664:a9ff:fe04:7363)


Proxied BINDs on back-ldap sometimes produce "Start TLS failed" errors on the
connection from the proxy to the backend, and the client of the proxy receives
error 52 (unavailable). We had a look at the code, and it appears that in
back-ldap/bind.c the timeout argument of the function ldap_back_start_tls always
ends up as zero, so tv.tv_sec and tv.tv_usec are set to 0 and 100000,
respectively, by the macro LDAP_BACK_TV_SET defined in back-ldap/back-ldap.h.
This means that the actual timeout for the StartTLS operation to complete is 0.1
seconds, which is sometimes too short when there is latency. We did a
quick-and-dirty hack that just sets tv.tv_sec to 3 and tv.tv_usec to 0, and were
no longer able to reproduce any Start TLS failed errors on the proxy bind.
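
To make the behaviour described above concrete, here is a minimal sketch of the
fallback logic as we understand it. It is not the actual back-ldap code: the
function name starttls_wait and the constant DEFAULT_UTIMEOUT are made up for
illustration, and only the 0 / 100000 values come from the report.

```c
#include <assert.h>
#include <sys/time.h>

/* Assumed default wait in microseconds (0.1 s), as observed in the report. */
#define DEFAULT_UTIMEOUT 100000

/*
 * Hypothetical helper: choose how long to wait for the StartTLS result.
 * A configured timeout (in seconds) is used as-is; a timeout of zero
 * falls back to the short built-in default.
 */
struct timeval starttls_wait(long timeout_sec)
{
    struct timeval tv;

    assert(timeout_sec >= 0);
    if (timeout_sec > 0) {
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
    } else {
        tv.tv_sec = 0;
        tv.tv_usec = DEFAULT_UTIMEOUT;
    }
    return tv;
}
```

With timeout_sec always zero, every StartTLS exchange gets only the 0.1 second
default, which matches the failures we see under load.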

We did find that back-ldap has an "extended" timeout defined in
back-ldap/config.c, except that it is commented out as "not implemented". It
seems to us that if that were implemented, you could set the timeout in your ldap
database section with something like "timeout extended=" and get a longer
timeout.

Steps to reproduce:
1. Configure an OpenLDAP proxy-backend pair, with the proxy using the "ldap"
database and ldap URLs to connect to the backend (using ldaps URLs might actually
be a workaround). Configure binds to be proxied to the backend, too.
2. Create a script that opens multiple (say, 200) simultaneous connections
to the proxy, using an ldaps URL to connect to the proxy and the same DN
to bind as. (We haven't tried connecting to the frontend using ldap+starttls,
but I would suppose that gives a similar result, since the client
establishes TLS anyway before trying to bind to the proxy.)
3. Observe that some of the binds fail (we typically get 5-6 failures per 200
simultaneous connections), with the client of the proxy getting a "Server is
unavailable: 52" type of error, and the proxy log containing lines such as
"conn=7716 op=0 RESULT tag=97 err=52 text=Start TLS failed" after the BIND from
the client.
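
For step 3, counting the failures in the proxy log can be sketched as below.
This is only an illustration: the function names are hypothetical, and the
matching assumes the log line format shown in the example above (a RESULT line
with err=52 and the text "Start TLS failed").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the log line is a RESULT with err=52 and "Start TLS failed". */
int is_starttls_failure(const char *line)
{
    assert(line != NULL);
    return strstr(line, " RESULT ") != NULL
        && strstr(line, " err=52 ") != NULL
        && strstr(line, "text=Start TLS failed") != NULL;
}

/* Counts matching lines read from a stream, e.g. an opened slapd log file. */
unsigned long count_starttls_failures(FILE *in)
{
    char buf[4096];
    unsigned long n = 0;

    assert(in != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        if (is_starttls_failure(buf))
            n++;
    }
    return n;
}
```

Passing stdin to count_starttls_failures lets the log be piped in; lines longer
than the buffer are read in pieces, which is good enough for a rough count.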

Possible workaround: Configure the proxy to use ldaps urls to the backend.

Suggested fix: Implement the "extended" timeout type in back-ldap/config.c
timeout_table, so that people can set the StartTLS timeout to a larger value
than the default 0.1 seconds.

References: discussion thread on the OpenLDAP-Technical mailing list.

Comment 1 Howard Chu 2019-01-31 23:41:07 UTC

janne.peltonen@helsinki.fi wrote:
> Full_Name: Janne Peltonen
> Version: 2.4.44
> OS: Red Hat Enterprise Linux 7
> URL: https://www.openldap.org/lists/openldap-technical/201901/msg00038.html
> Submission from: (NULL) (2001:708:110:1820:3664:a9ff:fe04:7363)
> 
> 
> Proxied BINDs on back-ldap sometimes create Start TLS failed errors on the
> connection from proxy to backend, and the client of the proxy receives the error
> 52. We had a look at the code, and it appears that in back-ldap/bind.c, the
> timeout argument of the function ldap_back_start_tls always ends up as zero, so
> the timeout tv.sec and tv.usec end up being set as 0 and 100000, respectively,
> by the macro LDAP_BACK_TV_SET defined in back-ldap/back-ldap.h. This means that
> the actual timeout waiting for starttls operation to complete will be 0.1
> seconds, which is sometimes too little if there are latencies. We did a
> quick&dirty hack to just set tv.sec to 3 and tv.usec to 0, and were no longer
> able to get any Start TLS failed errors on the proxy bind.
> 
> We did find that the back-ldap has an "extended" timeout defined in
> back-ldap/config.c - except that is commented out as "not implemented". Seems to
> us that if that were implemented, you could set the timeout in your ldap
> database section by something like "timeout extended=" and then you could have a
> longer timeout.

Thanks for the report. Note that if you simply set a "timeout" with no specific
qualifiers, it would have worked for you as well.
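
As an illustration of the unqualified form, a proxy database section might look
roughly like the following. The suffix and host name are placeholders, the value
3 is only an example (assumed to be in seconds), and the exact directives should
be checked against slapd-ldap(5) for the version in use.

```
# slapd.conf on the proxy (sketch; suffix and host name are placeholders)
database        ldap
suffix          "dc=example,dc=com"
uri             "ldap://backend.example.com/"
tls             start
# No operation qualifier: the value applies to all supported operations,
# which per the note above also covers the StartTLS wait.
timeout         3
```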

At this point, the "extended" timeout was meant for the other exops that back-ldap
supports, e.g. whoami and passwordModify. Since 2.4 is feature frozen we are not
going to implement the extended timeout feature here.

Instead, I've changed the start_tls code to use the "bind" timeout, since it will
always be used as part of a bind operation anyway.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Howard Chu 2019-01-31 23:42:59 UTC

changed notes
changed state Open to Test
moved from Incoming to Software Bugs

Comment 3 Quanah Gibson-Mount 2019-01-31 23:48:23 UTC

changed notes
changed state Test to Release

Comment 4 janne.peltonen@helsinki.fi 2019-02-01 08:58:53 UTC

On Thu, Jan 31, 2019 at 11:41:07PM +0000, Howard Chu wrote:
> Thanks for the report. Note that if you simply set a "timeout" with no specific
> qualifiers, it would have worked for you as well.

Oh dear. Of course. And it says so clearly on the man page, too. Thank you!

> At this point, the "extended" timeout was meant for the other exops that back-ldap
> supports, e.g. whoami and passwordModify. Since 2.4 is feature frozen we are not
> going to implement the extended timeout feature here.
> 
> Instead, I've changed the start_tls code to use the "bind" timeout, since it will
> always be used as part of a bind operation anyway.

OK, thanks a lot!


--Janne / Helsinki Uni

Comment 5 OpenLDAP project 2019-07-24 19:02:08 UTC

fixed in master
fixed in RE24 (2.4.48)

Comment 6 Quanah Gibson-Mount 2019-07-24 19:02:08 UTC

changed notes
changed state Release to Closed