
Re: slapd hanging on startup, apparently in BDB mutex lock?

Greg Earle wrote:
> Hi folks,

Hi Greg

> In 25 years of using UNIX I've seen my share of baffling problems but this is right up there with the best of them.
>
> We run slapd (2.1.22 - they are under Configuration Management
> freezes for Operations) on 2 Solaris 9 systems, master & replicant,
> and things had been running just fine.  After experiencing some
> NFS issues with our NetApp, we made the decision to copy the file
> systems that the LDAP servers (also our NIS servers) depend on
> to local disk on those two systems.

The BerkeleyDB docs specifically state that BDB only works with local filesystems...

> A few days ago my officemate noticed the replicant "slapd"
> wasn't accepting connections.  I looked and it showed the ldap
> port 389 as being in BOUND state, not LISTEN.
>
> The master was still LISTENing but queries to it would return
> results and then hang.  "lsof" of that server showed that these
> connections ended up being stuck in CLOSE_WAIT state.
>
> At some point both machines were rebooted, and now each one has
> the same problem - "slapd" isn't coming up all the way.  It
> gets as far (running in debug -1 mode) as this:

You didn't allow slapd to shut down cleanly and clean up its BerkeleyDB environment. BerkeleyDB locks are persisted in the filesystem, so stale locks survive a reboot. You need to run db_recover to clear them before starting slapd again.
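
A minimal sketch of that recovery step, assuming slapd is already stopped and the db_recover utility is on the PATH under that name (some installations ship it under a versioned name). The helper names recover_command and recover_environment are hypothetical, and the database directory you pass must be the one named by the "directory" setting in your slapd.conf. The only db_recover options used are -h (environment home) and -v (verbose).

```python
import subprocess
from pathlib import Path


def recover_command(db_home, verbose=True):
    """Build the db_recover argument list for the BDB environment in db_home."""
    cmd = ["db_recover"]  # assumption: the binary is installed under this name
    if verbose:
        cmd.append("-v")  # report progress while recovering
    cmd += ["-h", str(db_home)]  # -h names the environment home directory
    return cmd


def recover_environment(db_home, runner=subprocess.run):
    """Run db_recover against db_home. slapd must not be running."""
    home = Path(db_home)
    if not home.is_dir():
        raise FileNotFoundError(f"no such database directory: {home}")
    # check=True raises CalledProcessError if db_recover exits non-zero,
    # so a failed recovery is not silently followed by a slapd restart.
    return runner(recover_command(home), check=True)
```

Call recover_environment with your own database directory as the user that owns the database files, then start slapd as usual. The runner parameter exists only so the command can be inspected without actually executing it.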

Your OpenLDAP release is ancient, and so is your BerkeleyDB release. There are many known bugs in both. Among other things, current OpenLDAP releases detect unclean shutdowns and recover automatically.

  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP     http://www.openldap.org/project/