Issue 9400 - Proxy bind retry fails after remote server disconnects
Summary: Proxy bind retry fails after remote server disconnects
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: backends
Version: 2.4.33
Hardware: All All
Importance: --- normal
Target Milestone: 2.4.57
Assignee: OpenLDAP project
URL:
Keywords:
Duplicates: 8044
Depends on:
Blocks:
 
Reported: 2020-11-20 11:18 UTC by tero.saarni
Modified: 2021-02-22 18:02 UTC

Description tero.saarni 2020-11-20 11:18:34 UTC
Problem description
-------------------

I'm using slapd-ldap to proxy for a remote LDAP server.  The LDAP backend is configured to:

- allow user binds that are passed directly to the remote LDAP server
- allow local user binds that are mapped to remote bind using idassert-bind

The problem happens when the remote LDAP server abruptly disconnects the
(idle) LDAP connection.  For example, the next search operation fails with the error:

  Server is unavailable (52)
  Additional information: misconfigured URI?

The operation succeeds when it is repeated a second time.
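
Since the failure is transient, a client could mask it by repeating the operation once when it sees result code 52 (unavailable). The following is only a minimal sketch of that idea, not part of OpenLDAP: `proxy_op_fn`, `run_with_retry` and `flaky_search` are hypothetical names, and `flaky_search` merely stands in for a real search through the proxy.

```c
#include <assert.h>
#include <stddef.h>

#define RESULT_SUCCESS      0   /* LDAP result code: success */
#define RESULT_UNAVAILABLE 52   /* LDAP result code: unavailable */

/* Hypothetical operation callback: performs one attempt and returns
 * an LDAP result code. */
typedef int (*proxy_op_fn)(void *ctx);

/* Run op once; while it fails with "unavailable", repeat it,
 * up to max_retries additional attempts. */
int run_with_retry(proxy_op_fn op, void *ctx, int max_retries)
{
    assert(op != NULL);
    int rc = op(ctx);
    for (int attempt = 0;
         rc == RESULT_UNAVAILABLE && attempt < max_retries;
         attempt++) {
        rc = op(ctx);
    }
    return rc;
}

/* Stand-in for a search through the proxy (an assumption for this
 * sketch): the first call fails as if the remote server had dropped
 * the idle connection, later calls succeed. ctx counts the calls. */
int flaky_search(void *ctx)
{
    int *calls = (int *)ctx;
    return (*calls)++ == 0 ? RESULT_UNAVAILABLE : RESULT_SUCCESS;
}
```

Calling `run_with_retry(flaky_search, &calls, 1)` with `calls` initialized to 0 performs two attempts and returns success. This only hides the symptom on the client side; the proxy still logs the failed first attempt.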


Reproducing the problem
-----------------------

I created a test case that reproduces the problem:
- https://git.openldap.org/tsaarni/openldap/-/compare/master...ldap-back-retry-fail


Preliminary troubleshooting
---------------------------

While troubleshooting this I observed the following:

(A) The problem is related to the retry after the remote server abruptly drops the LDAP connection.

The call chain ldap_back_retry() -> ldap_back_dobind_int() -> ldap_back_is_proxy_authz()
ends up in this branch:

  if ( !( li->li_idassert_flags & LDAP_BACK_AUTH_OVERRIDE )) {
    if ( op->o_tag == LDAP_REQ_BIND ) {
      if ( !BER_BVISEMPTY( &ndn )) {
        dobind = 0;
        goto done;
      }
    }
    /* ... */
  }

where "dobind = 0" leaves the "binddn" and "bindcred" output variables unfilled.
Then in ldap_back_dobind_int() we fall into this branch:

  if ( LDAP_BACK_CONN_ISIDASSERT( lc ) ) {
    if ( BER_BVISEMPTY( &binddn ) && BER_BVISEMPTY( &bindcred ) ) {
      /* if we got here, it shouldn't return result */
      rc = ldap_back_is_proxy_authz( op, rs,
        LDAP_BACK_DONTSEND, &binddn, &bindcred );
      if ( rc != 1 ) {
        Debug( LDAP_DEBUG_ANY, "Error: ldap_back_is_proxy_authz "
          "returned %d, misconfigured URI?\n", rc );
        rs->sr_err = LDAP_OTHER;
        rs->sr_text = "misconfigured URI?";
        LDAP_BACK_CONN_ISBOUND_CLEAR( lc );
        if ( sendok & LDAP_BACK_SENDERR ) {
          send_ldap_result( op, rs );
        }
        goto done;
      }
    }
    /* ... */
  }
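
The interaction of the two branches can be sketched as a small standalone model. This is not the back-ldap code: all `sim_*` names and types are hypothetical stand-ins, the happy path that fills in credentials is an assumption, and the model assumes ldap_back_is_proxy_authz() returns the value of "dobind", as the excerpts above suggest.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for back-ldap state. */
enum sim_op_tag { SIM_OP_BIND, SIM_OP_SEARCH };

struct sim_op {
    enum sim_op_tag tag;  /* models op->o_tag */
    const char *ndn;      /* models ndn; "" models an empty berval */
};

struct sim_creds {
    const char *binddn;   /* "" models BER_BVISEMPTY( &binddn ) */
    const char *bindcred; /* "" models BER_BVISEMPTY( &bindcred ) */
};

/* Models the first quoted branch. Returns dobind (assumed). */
int sim_is_proxy_authz(const struct sim_op *op, bool auth_override,
                       struct sim_creds *out)
{
    assert(op != NULL && out != NULL);
    out->binddn = "";
    out->bindcred = "";
    if (!auth_override && op->tag == SIM_OP_BIND && op->ndn[0] != '\0') {
        return 0;  /* dobind = 0; goto done: outputs stay empty */
    }
    /* Assumed happy path: identity assertion credentials are filled in.
     * The real function has more conditions than this sketch. */
    out->binddn = "cn=proxy,dc=example,dc=com";
    out->bindcred = "secret";
    return 1;
}

/* Models the second quoted branch as reached on a retry.
 * Returns NULL on success, or the diagnostic text on failure. */
const char *sim_dobind_retry(const struct sim_op *op, bool conn_is_idassert,
                             bool auth_override)
{
    struct sim_creds creds = { "", "" };  /* nothing filled in yet */
    if (conn_is_idassert &&
        creds.binddn[0] == '\0' && creds.bindcred[0] == '\0') {
        int rc = sim_is_proxy_authz(op, auth_override, &creds);
        if (rc != 1) {
            return "misconfigured URI?";  /* models rs->sr_text */
        }
    }
    return NULL;
}
```

In this model a user bind (non-empty DN) retried on a connection flagged as idassert, without the override flag, yields "misconfigured URI?", while the same bind on a connection that is not flagged as idassert passes. That is at least consistent with observation (B) below.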

(B) The problem does NOT occur when separate instances of back-ldap are configured:

- one backend for users: BIND is done with the user's own credentials - no idassert
- a second backend for the local admin: the local admin BIND is overwritten with idassert-bind


Possibly the same problem has also been discussed earlier, for example
- https://www.openldap.org/lists/openldap-technical/201307/msg00070.html
- https://www.openldap.com/lists/openldap-bugs/201511/msg00041.html
- https://www.openldap.org/lists/openldap-bugs/201905/msg00001.html
Comment 1 Howard Chu 2020-11-23 05:19:43 UTC
(In reply to tero.saarni from comment #0)
> Problem description
> -------------------

Thanks for the detailed report, fixed now in git master.

Will probably move your test into the regression tests, not the main set.
> 
> I'm using slapd-ldap to proxy for a remote LDAP server.  LDAP backend is
> configured to:
> 
> - allow user binds that are passed directly to the remote LDAP server
> - allow local user binds that are mapped to remote bind using idassert-bind
> 
> The problem happens when remote LDAP server abruptly disconnects the 
> (idle) LDAP connection.  For example, next search operation will fail with
> error:
> 
>   Server is unavailable (52)
>   Additional information: misconfigured URI?
> 
> The operation will succeed when repeating it for second time.
> 
> 
> Reproducing the problem
> -----------------------
> 
> I created a test case that reproduces the problem
> - https://git.openldap.org/tsaarni/openldap/-/compare/master...ldap-back-retry-fail

> Possibly the same problem have been discussed also earlier, for example
> - https://www.openldap.org/lists/openldap-technical/201307/msg00070.html
> - https://www.openldap.com/lists/openldap-bugs/201511/msg00041.html
> - https://www.openldap.org/lists/openldap-bugs/201905/msg00001.html
Comment 2 tero.saarni 2020-11-23 08:27:50 UTC
Thank you for the quick fix!   

It fixed the problem I was seeing, but unfortunately test029-ldapglue
started failing.  Looking at a Wireshark capture, it seems there might now
be an extra <ROOT> bind towards the *second* slapd-back:


bindRequest(1) "uid=bjorn,ou=People,dc=example,dc=com" simple 
bindResponse(1) success 
bindRequest(1) "<ROOT>" simple 
bindResponse(1) success 
searchRequest(2) "ou=People,dc=example,dc=com" wholeSubtree 
searchResDone(2) success  [0 results]


For comparison, before the fix the packet capture showed the following:

bindRequest(1) "uid=bjorn,ou=People,dc=example,dc=com" simple 
bindResponse(1) success 
searchRequest(2) "ou=People,dc=example,dc=com" wholeSubtree 
searchResEntry(2) "ou=People,dc=example,dc=com" 
searchResEntry(2) "uid=bjorn,ou=People,dc=example,dc=com" 
searchResEntry(2) "uid=bjensen,ou=People,dc=example,dc=com" 
searchResEntry(2) "uid=proxy,ou=People,dc=example,dc=com" 



In case you would like to use the test case I created, I will write the copyright
notice here according to the contributor guidelines:

Ericsson Software Technology AB hereby place the following modifications to OpenLDAP Software (and only these modifications) into the public domain. Hence, these modifications may be freely used and/or redistributed for any purpose with or without attribution and/or other notice.


I created another version of the test, moved to the regression test suite directory:
https://git.openldap.org/tsaarni/openldap/-/commit/7b908d1eb93e8fc09248ae3239c34414b8258cac
Comment 3 Howard Chu 2020-11-24 18:37:52 UTC
(In reply to tero.saarni from comment #2)
> Thank you for the quick fix!   
> 
> It fixed the problem I was seeing, but I see unfortunately test029-ldapglue
> started failing.  Looking at Wireshark capture, it seems like there might now
> be an extra bind <ROOT> towards the *second* slapd-back:

Yes, thanks. This is also now fixed.

> In case you would like to use the test case I created, I will write the
> copyright notice here according to the contributor guidelines:
> 
> Ericsson Software Technology AB hereby place the following modifications to
> OpenLDAP Software (and only these modifications) into the public domain.
> Hence, these modifications may be freely used and/or redistributed for any
> purpose with or without attribution and/or other notice.
> 
> 
> I created another version, which is moved to the regression test suite
> directory
> https://git.openldap.org/tsaarni/openldap/-/commit/7b908d1eb93e8fc09248ae3239c34414b8258cac

Great, thank you.
Comment 4 tero.saarni 2020-11-26 07:24:53 UTC
Works now for me as well, thank you!
Comment 5 Quanah Gibson-Mount 2020-12-02 21:52:09 UTC
trunk:

Commits:
  • 1ea12260
by Howard Chu at 2020-11-23T05:14:30+00:00
ITS#9400 back-ldap: fix retry binds

  • 12523b0f
by Howard Chu at 2020-11-24T16:08:29+00:00
ITS#9400 back-ldap: fix prev commit

  • a3abd127
by Tero Saarni at 2020-11-24T18:47:07+00:00
ITS#9400 Added test case for back-ldap retry failure

RE24:

Commits:
  • 0dd812ec
by Howard Chu at 2020-12-02T21:28:07+00:00
ITS#9400 back-ldap: fix retry binds

  • d8e50d13
by Howard Chu at 2020-12-02T21:28:17+00:00
ITS#9400 back-ldap: fix prev commit
Comment 6 Quanah Gibson-Mount 2020-12-02 23:20:49 UTC
Trunk:

Commits: 
  • fd3b8dde 
by Quanah Gibson-Mount at 2020-12-02T23:16:36+00:00 
ITS#9400 - Fix prev commit for modular builds
Comment 7 Quanah Gibson-Mount 2021-02-22 18:01:58 UTC
*** Issue 7832 has been marked as a duplicate of this issue. ***
Comment 8 Quanah Gibson-Mount 2021-02-22 18:02:37 UTC
*** Issue 8044 has been marked as a duplicate of this issue. ***