Problem unexpected failing slapd

To: openldap-technical@openldap.org
Subject: Problem unexpected failing slapd
From: Ruud Baart <r.j.baart@prompt.nl>
Date: Sun, 27 Feb 2011 12:57:25 +0100
Organization: IT's Prompt or never ...
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; nl; rv:1.9.2.13) Gecko/20101207 Lightning/1.0b2 Thunderbird/3.1.7

Problem:

We have used LDAP for a customer for many years. Last year the slapd service suddenly stopped without leaving any trace in the logfiles. After a restart of slapd everything worked fine again. But the problem remained: it was not a one-off incident. Every now and then slapd just stops, always without any trace in the logfiles. Sometimes it happens three times a day, sometimes a week passes without a failure. I can't find a pattern or any relation to any other service on the Linux server.


Environment:

- Several (Debian squeeze) servers, several Windows servers. We use the bdb database backend.

- There is one master LDAP server, which provides syncprov, and two replica LDAP servers (syncrepl). The master server is the most intensively used (mainly by Samba as primary domain controller: a few hundred user accounts, many group accounts, workstations, ACLs, etc.). One of the replicas is not very busy but handles the mail for all users (lookups: amavis, postfix, courier-imap, mail account settings, etc.). The other replica is not busy at all; it is at a remote location.

- The whole directory is 3700 DNs; slapcat produces a file of 7.3 MB.

- It is only the master LDAP server that stops suddenly. I have never seen a failure of a replica LDAP server.

Because I have no clear idea what causes the problem, I don't know which technical details are relevant:

DB_CONFIG
===========
set_cachesize 0 10485760 1
set_lk_max_objects 10000
set_lk_max_locks 10000
set_lk_max_lockers 10000
set_lg_dir /home/ldap-dbd
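
For reference, the set_cachesize line takes three values: gigabytes, additional bytes, and the number of cache segments. The following is only a small sketch (the function name cache_bytes is made up for illustration) that works out the total from a DB_CONFIG text; for the settings above it gives 10485760 bytes (10 MiB) in one segment.

```python
from typing import Optional, Tuple


def cache_bytes(db_config_text: str) -> Optional[Tuple[int, int]]:
    """Return (total cache bytes, number of segments) from DB_CONFIG text.

    Hypothetical helper for illustration; it only reads the first
    'set_cachesize <gbytes> <bytes> <ncache>' line it finds.
    """
    for line in db_config_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0] == "set_cachesize":
            gbytes, nbytes, ncache = (int(p) for p in parts[1:])
            # Total size is the gigabyte part plus the extra bytes.
            return gbytes * 1024 ** 3 + nbytes, ncache
    return None


example = """set_cachesize 0 10485760 1
set_lk_max_objects 10000
set_lk_max_locks 10000
set_lk_max_lockers 10000
"""
total, segments = cache_bytes(example)
print(total, segments)
```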

The database is stored on an ext3 filesystem, kernel 2.6.32. The server has no problems, plenty of memory and a fast disk array (SAS->SATA). We have never had technical problems with this server, and it worked without problems for a long period. Nothing has changed in the environment or the LDAP setup (except of course the upgrade to Debian squeeze, but the problem was already there before that).


What we have tried:

- Upgrade from OpenLDAP 2.4.17 (Debian lenny + backports) to OpenLDAP 2.4.23 (Debian squeeze). I saw in the release notes that problems related to syncrepl had been solved, so we waited for version 2.4.23 to become available in Debian. This upgrade made no difference.

- Reindexed and rebuilt the directory. When I rebuild the LDAP directory from a clean LDIF file with ldapadd, on the master LDAP server or on another machine, there is not a single error or warning.


The workaround for the moment:

I have written a process monitor (Perl daemon) which monitors the slapd daemon and restarts slapd if it suddenly stops. It is of course not a solution, but the 300 users can keep working. If slapd stops and is not restarted within 1 minute, a few hundred people can't work because Samba stops working.
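
For illustration only, here is a minimal Python sketch of such a watchdog. It is not the Perl daemon itself. It assumes that pgrep is available, that slapd is started with the Debian init script /etc/init.d/slapd, and that a 10 second polling interval is acceptable; the function names are made up for this example.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def is_running(name: str = "slapd") -> bool:
    """Return True if a process with exactly this name exists (via pgrep -x)."""
    result = subprocess.run(["pgrep", "-x", name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def restart(cmd: Sequence[str] = ("/etc/init.d/slapd", "start")) -> None:
    """Start slapd again; the init script path is an assumption."""
    subprocess.run(list(cmd), check=False)


def watch(
    check: Callable[[], bool] = is_running,
    start: Callable[[], None] = restart,
    interval: float = 10.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> int:
    """Poll the daemon and restart it when it is gone.

    Runs forever when max_checks is None. Returns the number of restarts,
    which is only useful when max_checks is set (for example in a test).
    """
    restarts = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not check():
            # Log the time of each failure so a pattern can be looked for later.
            log(time.strftime("%Y-%m-%d %H:%M:%S") + " slapd not running, restarting")
            start()
            restarts += 1
        checks += 1
        sleep(interval)
    return restarts
```

Calling watch() with no arguments polls until the process is killed. The timestamps it logs at least record when each failure happened, even though slapd itself logs nothing.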

I would like to receive suggestions on what we can do to find the problem. Because there is no pattern and nothing in the logfiles, I don't know where to start.


--

Regards,

Ruud Baart

Follow-Ups:
- Re: Problem unexpected failing slapd
  - From: jekvb <jekvb@gmx.co.uk>
- Re: Problem unexpected failing slapd
  - From: Howard Chu <hyc@symas.com>
