Re: (ITS#6920) OpenLDAP 2.4.25 / Berkeley DB 4.8.30 Solaris 10 x86 slapd hangs



At Thu, 28 Apr 2011 18:05:43 GMT,
mslby@deshaw.com wrote:
> My company uses OpenLDAP 2.4.25 with Berkeley DB 4.8.30 compiled on Solaris
> 10 x86 using Sun Studio. OpenLDAP is used as the backend for generic naming
> services (passwd, group, netgroup etc.) as well as holding mail routing and
> some custom data. We have master and slave servers and are using syncrepl
> refresh and persist.

Are you using Solaris nss_ldap and ldap_cachemgr(1M) on your
Solaris 10 host together with OpenLDAP slapd? If so, Solaris libldap.so
breaks your slapd process in the following way:

  (1) slapd calls some name service functions, e.g. getpwnam(3C).
  (2) Solaris nss_ldap (/usr/lib/nss_ldap.so.1) is loaded.
  (3) Solaris libldap.so (/usr/lib/libldap.so.5) is loaded.
  (4) Solaris libldap.so overrides the ldap_*() and ber_*() functions
      in slapd (or in OpenLDAP libldap_r.so and liblber.so).
  (5) OpenLDAP calls ldap_*() and ber_*() functions that
      belong to Solaris libldap.so, not to OpenLDAP's own libraries.
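
One way to see which shared object a symbol name actually resolves to
is to ask the runtime linker directly. The following is only a minimal
sketch, not something from the original report: it uses Python's ctypes
with dlsym()-style lookup and dladdr(3), and it inspects the process it
runs in (here the Python interpreter), not a running slapd. The helper
name symbol_owner is hypothetical, and the Dl_info field layout is
assumed to match the common Solaris/Linux definition.

```python
import ctypes
import os


class DlInfo(ctypes.Structure):
    """Mirror of the C Dl_info struct filled in by dladdr(3)."""
    _fields_ = [
        ("dli_fname", ctypes.c_char_p),  # path of the object containing the address
        ("dli_fbase", ctypes.c_void_p),  # load address of that object
        ("dli_sname", ctypes.c_char_p),  # nearest symbol name
        ("dli_saddr", ctypes.c_void_p),  # address of that symbol
    ]


def symbol_owner(name):
    """Return the path of the object that provides `name` in this process.

    Returns None if the symbol cannot be resolved. (Hypothetical helper,
    for illustration only.)
    """
    # CDLL(None) is a handle on the global symbol scope of the process.
    proc = ctypes.CDLL(None)
    try:
        func = getattr(proc, name)
    except AttributeError:
        return None  # no loaded object exports this symbol

    addr = ctypes.cast(func, ctypes.c_void_p).value

    dladdr = proc.dladdr
    dladdr.argtypes = [ctypes.c_void_p, ctypes.POINTER(DlInfo)]
    dladdr.restype = ctypes.c_int

    info = DlInfo()
    # dladdr() returns 0 when the address is not inside any loaded object.
    if dladdr(addr, ctypes.byref(info)) == 0 or not info.dli_fname:
        return None
    return os.fsdecode(info.dli_fname)


if __name__ == "__main__":
    # ldap_init and ber_free resolve only if an LDAP library is loaded.
    for sym in ("getpwnam", "ldap_init", "ber_free"):
        print(sym, "->", symbol_owner(sym))
```

In a process where both LDAP libraries are loaded, a lookup such as
symbol_owner("ldap_init") would show which one wins for that name. In a
process with no LDAP library loaded, it simply returns None.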

See also:

  http://www.openldap.org/lists/openldap-technical/200902/msg00000.html
  https://bugzilla.mozilla.org/show_bug.cgi?id=292127
  http://bugzilla.padl.com/show_bug.cgi?id=203

-- 
-- Name: SATOH Fumiyasu (fumiyas @ osstech co jp)
-- Business Home: http://www.OSSTech.co.jp/
-- Personal Home: http://www.SFO.jp/blog/

> Lately we have been experiencing hangs with slapd and I cannot figure out
> what the cause is. Things will be humming along and then slapd will simply
> stop accepting connections and answering any requests for reads/writes. We
> have set the loglevel to 256 and there is nothing in the logs that
> indicates what the issue is. At the time that the process goes catatonic, all
> syslogs from slapd stop.
> 
> I have not found a way to reproduce this on demand.
> 
> Today we have had three hangs and the truss output is exactly the same on
> all processes. Once the slapd process gets in this state a simple kill does
> not work. The syslog always says that slapd is waiting for tasks to
> complete but this never happens. I need to kill -9 the pid.
> 
> I am going to include all of the debug info that I have collected and
> hopefully someone will have some idea what is going on. I also have a gcore
> of the process if anyone wants any info from that.
> 
> Any and all help is greatly appreciated.