
Slapd dies unexpectedly



We have our system set up with a db3 backend, using LDAP authentication
for Courier IMAP/POP, Postfix delivery, and user account
authentication. There are some 4,000 users and things were going
swell until one day slapd stopped responding. This of course
caused mail to bounce and logins to fail. The only way to
resolve this was to kill -USR1 slapd and then start it again. We
kept doing this until we decided that maybe our .db files were
corrupted, so late at night we stopped everything, dumped the
directory to LDIF, and then re-added everything.

Everything went fine for a day or so, then the problems started showing
up again: slapd simply dies on its own. It seems to die if there are
more than a certain number of connections all at once. I looked for
solutions, installed nscd, and tweaked my slapd.conf to have
"idletimeout 20" and "threads 64". That seemed OK until the next
morning, when even root could not log in at the console (root is in the
passwd file); the login would time out. A ctrl-alt-delete later I
decided to remove nscd and put the threads back to the default 32.
slapd died over ten times yesterday (restarting it seemed to work). I
put the threads back to 64 and have only had to restart slapd two times
in the last twelve hours (that's a big improvement).
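
For reference, the two directives I changed sit in the global section
of slapd.conf and look roughly like this. This is a sketch rather than
a copy of my whole file, and the comments are my reading of what each
directive does:

    # Close client connections that have been idle for 20 seconds
    idletimeout 20

    # Maximum size of the primary thread pool (default is 32)
    threads 64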

I've been crawling the net, reading archives and Google search results
for a solution, and I can't find what I should do. Is my data
corrupt in some way? If so, how can I verify its integrity, rebuild
the store, or remove entries that are bad? Should I be tuning slapd in
some way that I am not aware of? How have others tuned slapd to deal
with growth and this many connections? Ack!

I'm using Debian woody (stable), with the Debian-packaged slapd version
2.0.23-6.3.

Thanks for any advice you can give, 
micah