Issue 5363 - ldapsearch hangs when using ldap for host in nsswitch
Summary: ldapsearch hangs when using ldap for host in nsswitch
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: 2.4.7
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2008-02-10 05:16 UTC by rra@debian.org
Modified: 2014-08-01 21:03 UTC (History)

Description rra@debian.org 2008-02-10 05:16:48 UTC
Full_Name: Russ Allbery
Version: 2.4.7
OS: Debian GNU/Linux
URL: 
Submission from: (NULL) (171.64.19.147)


ldapsearch calls getaddrinfo inside a mutex because on many systems it's not
thread-safe.  However, this means that if libnss-ldap is in use and pointing to
LDAP for host information, we get deadlock.  ldapsearch calls getaddrinfo(),
which calls into libnss-ldap, which tries to open a connection, which then hits
the same lock inside the LDAP libraries.
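The call chain above can be reproduced in miniature. The following is a minimal sketch, not code from libldap_r: resolver_lock, nss_module_lookup() and client_connect() are hypothetical stand-ins for the library lock, the libnss-ldap entry point and the ldapsearch connect path. It assumes an error-checking mutex so that the second lock attempt reports EDEADLK instead of hanging the way a default mutex would.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the library-wide lock around the resolver call. */
static pthread_mutex_t resolver_lock;

/* Create the lock as error-checking: relocking from the owning thread
   returns EDEADLK rather than blocking forever. */
static int resolver_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&resolver_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Stand-in for the NSS module: to answer a host lookup it opens its own
   LDAP connection, which needs the same lock. */
static int nss_module_lookup(void)
{
    int rc = pthread_mutex_lock(&resolver_lock);
    if (rc != 0)
        return rc;              /* EDEADLK when this thread already holds it */
    pthread_mutex_unlock(&resolver_lock);
    return 0;
}

/* Stand-in for the client: takes the lock, then resolves a host name,
   which re-enters the LDAP code through the NSS module. */
static int client_connect(void)
{
    int rc = pthread_mutex_lock(&resolver_lock);
    if (rc != 0)
        return rc;
    rc = nss_module_lookup();   /* re-entrant lock attempt happens here */
    pthread_mutex_unlock(&resolver_lock);
    return rc;
}
```

Calling resolver_lock_init() and then client_connect() returns EDEADLK, while nss_module_lookup() on its own succeeds. With a default (non-error-checking) mutex the same sequence would simply block, which matches the observed hang.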

I'm not sure what the best fix for this is.  Removing the locks fixes the deadlock and is
safe for glibc systems since glibc's getaddrinfo is thread-safe; however, it
would not be safe on those other systems that prompted the lock in the first
place.
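One way to picture the trade-off is a build-time switch that keeps the lock only where the resolver is not known to be thread-safe. This is a rough sketch under that assumption, not the Debian patch and not OpenLDAP code: RESOLVER_NEEDS_LOCK, resolver_lock and guarded_getaddrinfo() are hypothetical names, and keying the decision on __GLIBC__ is only an illustration of the idea.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical switch: glibc's getaddrinfo() is thread-safe, so skip the
   lock there; keep it everywhere else. */
#if defined(__GLIBC__)
#define RESOLVER_NEEDS_LOCK 0
#else
#define RESOLVER_NEEDS_LOCK 1
#endif

#if RESOLVER_NEEDS_LOCK
static pthread_mutex_t resolver_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

/* Wrapper around getaddrinfo() that serialises callers only when the
   platform's resolver requires it. */
static int guarded_getaddrinfo(const char *node, const char *service,
                               const struct addrinfo *hints,
                               struct addrinfo **res)
{
    int rc;
#if RESOLVER_NEEDS_LOCK
    pthread_mutex_lock(&resolver_lock);
#endif
    rc = getaddrinfo(node, service, hints, res);
#if RESOLVER_NEEDS_LOCK
    pthread_mutex_unlock(&resolver_lock);
#endif
    return rc;
}
```

On systems where the lock stays in place, the re-entrancy problem described above remains, so this only narrows the set of affected platforms.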

See http://bugs.debian.org/340601 for reference including the patch that appears
to clear the deadlock on Debian that Steve came up with while tracking the
problem down.  I'm guessing that you don't want it in its current form, but it
may be useful for reference.

Comment 1 rra@debian.org 2008-02-10 05:49:32 UTC
rra@stanford.edu writes:

> ldapsearch calls getaddrinfo inside a mutex because on many systems it's
> not thread-safe.  However, this means that if libnss-ldap is in use and
> pointing to LDAP for host information, we get deadlock.  ldapsearch
> calls getaddrinfo(), which calls into libnss-ldap, which tries to open a
> connection, which then hits the same lock inside the LDAP libraries.

Oh, and incidentally, if you're wondering why ldapsearch is using
libldap_r and hence getting this mutex at all, it's because of the issue
that I raised on openldap-technical (which I can also file an ITS about if
it would be appropriate).  We can't ship both libldap and libldap_r and
have some applications linked against one and some against the other
because they expose the same symbols and it's caused problems in the past.

Right now, pending advice, we're taking the easiest possible solution and
just linking everything against libldap_r.

But this particular problem probably wouldn't happen without that.  So one
solution may be to fix *that* problem, and any advice there is most
definitely welcome.  (However, I would expect any slapd backend that
itself did a search to possibly have this same issue, so fixing ldapsearch
may not fix the full problem.)

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>

Comment 2 Howard Chu 2008-02-10 06:40:02 UTC
rra@stanford.edu wrote:
> rra@stanford.edu writes:
>> ldapsearch calls getaddrinfo inside a mutex because on many systems it's
>> not thread-safe.  However, this means that if libnss-ldap is in use and
>> pointing to LDAP for host information, we get deadlock.  ldapsearch
>> calls getaddrinfo(), which calls into libnss-ldap, which tries to open a
>> connection, which then hits the same lock inside the LDAP libraries.

> Oh, and incidentally, if you're wondering why ldapsearch is using
> libldap_r and hence getting this mutex at all, it's because of the issue
> that I raised on openldap-technical (which I can also file an ITS about if
> it would be appropriate).  We can't ship both libldap and libldap_r and
> have some applications linked against one and some against the other
> because they expose the same symbols and it's caused problems in the past.

> Right now, pending advice, we're taking the easiest possible solution and
> just linking everything against libldap_r.

libldap_r was written solely for use by slapd and slurpd. Anybody else who 
uses it does so at their own risk. Applications are supposed to link with 
libldap. This has been stated several times before.

Until somebody decides to invest some energy into writing a new C API spec, 
(simply dusting off the expired C API draft won't cut it) there is not and 
will never be any official support for threading/re-entrancy in the LDAP API.

> But this particular problem probably wouldn't happen without that.  So one
> solution may be to fix *that* problem, and any advice there is most
> definitely welcome.  (However, I would expect any slapd backend that
> itself did a search to possibly have this same issue, so fixing ldapsearch
> may not fix the full problem.)

Unfortunately your problems with libnss-ldap are outside our purview, and any 
ITS relating to it will just be closed/rejected. The fact that nss-ldap is 
essentially broken is not for us to fix. The problems you encounter from 
breaking our Makefiles in response to the nss-ldap problems are also obviously 
not for us to fix; you did that to yourself.

Unofficially, I would suggest that you look into nss-ldapd. I haven't used it 
personally, and have no idea what's missing in the current implementation. But 
it at least has a fighting chance of working sanely.

The other obvious step in this situation is never to use LDAP for hostname 
resolution. ("Doctor, it hurts when I do this." - "Don't do that.") LDAP is a 
much heavier weight protocol than DNS; as such it's simply the wrong tool for 
the job. (Note - I like using DNS servers backed by LDAP, but that's a 
completely different situation.)
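
For reference, the setting in question is the hosts line in /etc/nsswitch.conf. The fragment below is illustrative only; the reporter's actual source list is not given in this thread.

```text
# /etc/nsswitch.conf (illustrative)
# Problematic: host lookups can reach libnss-ldap.
#hosts:  files dns ldap
# As suggested above: keep LDAP out of host resolution.
hosts:   files dns
```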
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/

Comment 3 Howard Chu 2008-02-10 07:06:07 UTC
changed notes
changed state Open to Feedback
Comment 4 rra@debian.org 2008-02-10 17:56:53 UTC
Howard Chu <hyc@symas.com> writes:

> libldap_r was written solely for use by slapd and slurpd. Anybody else
> who uses it does so at their own risk. Applications are supposed to link
> with libldap. This has been stated several times before.

Unfortunately, I still need to figure out how best to fix the problem and
that by itself doesn't give me a way, but I do understand where you're
coming from.

> Unfortunately your problems with libnss-ldap are outside our purview,
> and any ITS relating to it will just be closed/rejected. The fact that
> nss-ldap is essentially broken is not for us to fix. The problems you
> encounter from breaking our Makefiles in response to the nss-ldap
> problems are also obviously not for us to fix; you did that to yourself.

In, I should point out, a testing version as a short-term way of getting
2.4.7 into the archive so that we could throw out 2.1, which I'm sure
you'd agree is a great good.  :)  I was always planning on trying to
replace this solution with something that you felt you could support; we
were just trying to do one thing at a time.

We'll talk this over and see if there's a way that we can either move from
nss-ldap or try to get it out of the places where it causes problems,
although that's going to be really rough going as you can probably
imagine.  libnss-ldap has over 2,800 installations that report popcon
statistics, which means a *lot* of people are using it.

Thinking out loud here a bit (because I really want to solve this problem
in the long run in a way that you can support), libnss-ldap and slapd on
the same system are instant death unless everything uses libldap_r.  If we
have slapd conflict with libnss-ldap and then have only slapd use
libldap_r (and make people use libnss-ldapd on systems that are also
running slapd), we may solve a fair chunk of the problem.  The other place
where libldap is pulled in is via PAM, but slapd itself should never need
to call into PAM (even indirectly).

If any slapd backends load dynamic libraries that in turn load libldap,
the world may also explode, but this should only affect back-perl, which
isn't working right now anyway due to the libtool RTLD_GLOBAL issues.

> The other obvious step in this situation is never to use LDAP for
> hostname resolution. ("Doctor, it hurts when I do this." - "Don't do
> that.") LDAP is a much heavier weight protocol than DNS; as such it's
> simply the wrong tool for the job. (Note - I like using DNS servers
> backed by LDAP, but that's a completely different situation.)

Okay.  I can at least pass that along to the user.

Thanks for the response.

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>

Comment 5 Howard Chu 2008-02-10 18:22:32 UTC
rra@stanford.edu wrote:
> We'll talk this over and see if there's a way that we can either move from
> nss-ldap or try to get it out of the places where it causes problems,
> although that's going to be really rough going as you can probably
> imagine.  libnss-ldap has over 2,800 installations that report popcon
> statistics, which means a *lot* of people are using it.
>
> Thinking out loud here a bit (because I really want to solve this problem
> in the long run in a way that you can support), libnss-ldap and slapd on
> the same system are instant death unless everything uses libldap_r.  If we
> have slapd conflict with libnss-ldap and then have only slapd use
> libldap_r (and make people use libnss-ldapd on systems that are also
> running slapd), we may solve a fair chunk of the problem.  The other place
> where libldap is pulled in is via PAM, but slapd itself should never need
> to call into PAM (even indirectly).

You might consider suggesting to PADL that they contribute pam/nss_ldap to the 
OpenLDAP Project, if you want us to take an active role in solving this. Until 
then, the discussion just doesn't belong here on the OpenLDAP ITS. I'll note 
that I've described the solution Symas uses in our commercial distribution of 
pam/nss-ldap on the pam/nss-ldap mailing lists in the past. Our package has 
none of these issues.

Hint: since dynamic linker dependencies are causing the conflict, the solution 
is not to rely on the dynamic linker. In this particular case, using the 
dynamic linker introduces a security vulnerability anyway.

> If any slapd backends load dynamic libraries that in turn load libldap,
> the world may also explode, but this should only affect back-perl, which
> isn't working right now anyway due to the libtool RTLD_GLOBAL issues.

>> The other obvious step in this situation is never to use LDAP for
>> hostname resolution. ("Doctor, it hurts when I do this." - "Don't do
>> that.") LDAP is a much heavier weight protocol than DNS; as such it's
>> simply the wrong tool for the job. (Note - I like using DNS servers
>> backed by LDAP, but that's a completely different situation.)
>
> Okay.  I can at least pass that along to the user.

Or patch nss_ldap to simply return a failure whenever it's invoked for 
hostname resolution.
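
As a rough illustration of that idea, a glibc NSS module's host lookup entry point could report itself unavailable without ever touching LDAP. This is a sketch, not the nss_ldap source: the module name "example" is hypothetical, and only the glibc NSS function signature and status codes are taken as given.

```c
#include <errno.h>
#include <netdb.h>
#include <nss.h>
#include <stddef.h>

/* Host lookup entry point for a hypothetical NSS module named "example".
   It never opens an LDAP connection; it just reports the service as
   unavailable so the caller can move on to the next configured source. */
enum nss_status
_nss_example_gethostbyname_r(const char *name, struct hostent *result,
                             char *buffer, size_t buflen,
                             int *errnop, int *h_errnop)
{
    (void)name;
    (void)result;
    (void)buffer;
    (void)buflen;

    *errnop = ENOENT;          /* no usable data source */
    *h_errnop = NO_RECOVERY;   /* non-recoverable failure for this lookup */
    return NSS_STATUS_UNAVAIL;
}
```

With the default nsswitch actions, an UNAVAIL status makes glibc continue with the next source on the hosts line, so the lookup never re-enters the LDAP library.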
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/

Comment 6 rra@debian.org 2008-02-10 18:58:52 UTC
hyc@symas.com writes:

> You might consider suggesting to PADL that they contribute pam/nss_ldap
> to the OpenLDAP Project, if you want us to take an active role in
> solving this. Until then, the discussion just doesn't belong here on the
> OpenLDAP ITS. I'll note that I've described the solution Symas uses in
> our commercial distribution of pam/nss-ldap on the pam/nss-ldap mailing
> lists in the past. Our package has none of these issues.
>
> Hint: since dynamic linker dependencies are causing the conflict, the
> solution is not to rely on the dynamic linker. In this particular case,
> using the dynamic linker introduces a security vulnerability anyway.

Understood.  Thank you very much for the help that you've provided.  We'll
try to find a solution that won't make you hate us.  :)

-- 
Russ Allbery (rra@stanford.edu)             <http://www.eyrie.org/~eagle/>

Comment 7 Howard Chu 2008-06-03 19:53:05 UTC
changed notes
changed state Feedback to Closed
Comment 8 OpenLDAP project 2014-08-01 21:03:31 UTC
not ours
see contrib/nssov