Subject: Re: openldap health monitoring
--On Monday, March 22, 2004 11:58 AM +0100 firstname.lastname@example.org
> I need a script/program for monitoring the health of my slapd. From
> what I have seen so far, the following problems are likely to occur, and
> must be monitored:
> - slapd dies
> - slapd blocks (and eats 95% of CPU time too)
> - slapd grows too large
> I suppose there are other problems, but this is what I have seen so
> far. I was about to write a simple health-monitoring script that
> checks for these problems, and then I realised that I don't know how
> to distinguish a program that's legitimately using lots of resources
> and will stop doing so in a moment from one that is simply hanging.
> I presume I'm not the only one who needs some kind of health
> monitoring - could someone point me to other problems that are likely
> to occur, and even better to some progs/scripts that monitor the slapd
> health and do
I'll note that all 3 of your problems were caused by you using MIT
Kerberos instead of Heimdal. Stanford has been using OpenLDAP in
production for over a year now, and we've not encountered any of your 3
problems when OpenLDAP was built correctly.
That aside, we do the following:
Have a remote process that queries the server (from Nagios), that pages
us if slapd stops responding.
Have a local process that checks the resident and virtual sizes of
slapd, and pages us if they exceed our boundaries.
Have a local process that checks every 5 minutes to see if slapd is
running. It will page the sys admins and restart slapd if it is not.
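For anyone who wants a starting point, the three checks above could be
sketched in Python roughly as follows. This is only an illustration, not
the scripts described here. It assumes a Linux host (sizes are read from
/proc/<pid>/status), that the ldapsearch and pgrep commands are
installed, and that page and restart are callables you supply yourself.
All function names and the limit values are hypothetical.

```python
import subprocess

def ldap_responds(uri, timeout=10, run=subprocess.run):
    """Remote check: True if an anonymous root DSE search finishes in time."""
    cmd = ["ldapsearch", "-x", "-H", uri, "-s", "base", "-b", "",
           "-l", str(timeout), "(objectClass=*)", "namingContexts"]
    try:
        # The subprocess timeout also catches a slapd that accepts the
        # connection but never answers.
        return run(cmd, capture_output=True, timeout=timeout + 5).returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # OSError covers a missing ldapsearch binary; treated as a failure.
        return False

def parse_sizes_kb(status_text):
    """Extract VmRSS and VmSize (in kB) from /proc/<pid>/status text."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmRSS", "VmSize"):
            sizes[key] = int(rest.split()[0])
    return sizes

def over_limit(sizes, max_rss_kb, max_vsz_kb):
    """True if either the resident or the virtual size exceeds its limit."""
    return (sizes.get("VmRSS", 0) > max_rss_kb
            or sizes.get("VmSize", 0) > max_vsz_kb)

def slapd_pids(run=subprocess.run):
    """Return PIDs of processes named exactly 'slapd' (pgrep -x)."""
    try:
        out = run(["pgrep", "-x", "slapd"], capture_output=True, text=True)
    except OSError:
        return []
    return [int(p) for p in out.stdout.split()]

def read_status(pid):
    """Read /proc/<pid>/status; empty string if the process just exited."""
    try:
        with open(f"/proc/{pid}/status") as f:
            return f.read()
    except OSError:
        return ""

def check_once(page, restart, limits, pids=slapd_pids, status=read_status):
    """Local checks, run once: liveness first, then memory sizes.

    limits is a (max_rss_kb, max_vsz_kb) tuple.
    """
    running = pids()
    if not running:
        page("slapd is not running; restarting")
        restart()
        return
    for pid in running:
        if over_limit(parse_sizes_kb(status(pid)), *limits):
            page(f"slapd pid {pid} exceeds memory limits")
```

check_once would be run from cron every 5 minutes on the LDAP host, while
ldap_responds belongs on a separate monitoring machine, so that a dead
host is noticed as well. Neither check can tell a busy slapd from a hung
one by CPU use alone; the timed query is the closest approximation.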
Thankfully, none of these ever gets used, but it makes management happy.
Principal Software Developer
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html