
Re: Call For Assistance #4 - slapd won't die gracefully, multiple versions.

conn=0 fd=12 closed
^Cslapd shutdown: waiting for 0 threads to terminate

I just spent a little time on a similar problem with a NetBSD 2.0 beta, and though I didn't get to a fix, here's what I found in case anything is interesting.

NetBSD 2.0 has a thread model (kernel activations?) that is probably
different than FreeBSD 5.1's, but they do have something in common -
no pthread_mutexattr_setpshared function. (Nor pthread_condattr_setpshared.)
That means that Berkeley DB 4.2.52 won't use the pthread mutex support, and
instead configures gcc/assembler roll-your-own mutexes. (That also happens
on Linux, probably all distributions.)

In my case, slapd stopped in __db_tas_mutex_lock, while trying to sync
some db file as part of the close.  I found this out by inserting debug
printfs.  gdb reported that it was in sched_yield(), a system call, but
didn't have a useful call traceback.
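
For readers who haven't looked at this kind of lock: below is a minimal sketch of a test-and-set mutex that yields while it waits. It is not Berkeley DB's code; the tas_mutex type and the tas_mutex_* functions are hypothetical names made up for illustration, and it uses C11 atomics rather than the hand-written assembler that db configures. It is only meant to show why a debugger would catch a stuck waiter inside sched_yield(): if the flag is never cleared, the loop in tas_mutex_lock never exits.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration type; not Berkeley DB's real mutex layout. */
typedef struct {
    atomic_flag held;   /* set while some thread or process owns the lock */
} tas_mutex;

/* Put the mutex into the unlocked state. */
static void tas_mutex_init(tas_mutex *m)
{
    atomic_flag_clear(&m->held);
}

/* One test-and-set attempt: returns true if we took the lock,
 * false if the flag was already set by someone else. */
static bool tas_mutex_trylock(tas_mutex *m)
{
    return !atomic_flag_test_and_set_explicit(&m->held, memory_order_acquire);
}

/* Retry until the flag is free, giving up the CPU between attempts.
 * If the owner never clears the flag, this loops forever and a
 * debugger will usually find the caller sitting in sched_yield(). */
static void tas_mutex_lock(tas_mutex *m)
{
    while (!tas_mutex_trylock(m)) {
        sched_yield();
    }
}

/* Release the lock so that a waiter's next attempt can succeed. */
static void tas_mutex_unlock(tas_mutex *m)
{
    atomic_flag_clear_explicit(&m->held, memory_order_release);
}
```

Whether a flag that was left set is what actually happened in my case is a guess; I did not confirm it.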

At this point, I got sidetracked when I wondered if db could be forced
to build with posix mutexes. (It will build, but restricts functionality.)
In the process, I happened to run db_recover on the database.

Now shutdown runs to completion, even though I've issued a few queries.
So I'm not going to close in on what exactly hangs up __db_tas_mutex_lock,
because the problem is no longer reproducible. I pass the baton to you.

	Donn Cave, donn@u.washington.edu

On Friday, April 16, 2004, at 09:38 PM, Jason Lixfeld wrote:
My machine is AMD64. I'm running 5.2.1-RELEASE-p1. I've tried 2.1.29, 2.1.30, 2.2.7, 2.2.8 and 2.2.10, all from FreeBSD ports. No special make options, just plain make. No modifications to the config files, all plain vanilla out-of-the-box configs. I've tried with BDB and LDBM. Same issue with both types of databases. All OpenLDAP server versions I have tried exhibit the same problem (this output is from 2.1.29; output is identical on all versions, with the exception of the Berkeley DB version in the -d 256 output):

If I start slapd and kill it without issuing a transaction to the server, slapd will die gracefully, no problem:

su-2.05b# /etc/rc.d/slapd start
ps: kvm_getprocs: No such process
Starting slapd.
su-2.05b# ps xa|grep slap
92971 ?? Ss 0:00.01 /usr/local/libexec/slapd -h ldapi://%2fvar%2frun%2fopenldap%2fldapi/ ldap:// -u ldap -g ldap
su-2.05b# /etc/rc.d/slapd stop
Stopping slapd.
Waiting for PIDS: 92971.
su-2.05b# ps xa|grep slapd

If I start slapd and issue a transaction to the server, slapd will NOT die gracefully. I need to kill -9 it, which will do bad things to the database. kill -INT will not work either:

su-2.05b# /usr/local/libexec/slapd -d 256
bdb_initialize: Sleepycat Software: Berkeley DB 4.1.25: (December 19, 2002)
bdb_db_init: Initializing BDB database
slapd starting

conn=0 fd=12 ACCEPT from IP= (IP=
conn=0 op=0 BIND dn="" method=128
conn=0 op=0 RESULT tag=97 err=0 text=
conn=0 op=1 SRCH base="" scope=0 filter="(objectClass=*)"
conn=0 op=1 SRCH attr=namingContexts
conn=0 op=1 RESULT tag=101 err=0 text=
conn=0 op=2 UNBIND
conn=0 fd=12 closed
^Cslapd shutdown: waiting for 0 threads to terminate
^C^C^C^C^C

The transaction I performed was the one from the Quickstart guide here: http://www.openldap.org/doc/admin22/quickstart.html

su-2.05b# ldapsearch -x -b '' -s base '(objectclass=*)' namingContexts
# extended LDIF
# LDAPv3
# base <> with scope base
# filter: (objectclass=*)
# requesting: namingContexts

namingContexts: dc=my-domain,dc=com

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

I've been struggling with this since my first post regarding this issue on March 7th and I still haven't figured it out. I'm asking anyone who may have some experience with OpenLDAP to PLEASE help me sort this out. This has really got me by the balls...