
Re: Antwort: Re: openldap health monitoring [Virus checked]

To: denis.havlik@t-mobile.at, openldap-software@OpenLDAP.org
Subject: Re: Antwort: Re: openldap health monitoring [Virus checked]
From: Quanah Gibson-Mount <quanah@stanford.edu>
Date: Mon, 22 Mar 2004 08:25:20 -0800
Content-disposition: inline
In-reply-to: <OF7630956F.E4068DE6-ONC1256E5F.0057F4FE-C1256E5F.0059979A@t-mobile.at>
References: <OF7630956F.E4068DE6-ONC1256E5F.0057F4FE-C1256E5F.0059979A@t-mobile.at>

--On Monday, March 22, 2004 5:18 PM +0100 denis.havlik@t-mobile.at wrote:

>> I'll note that all of your 3 problems were caused by you using MIT
>> Kerberos instead of Heimdal. Stanford has been using OpenLDAP in
>> production for

> I presume you are right. However:
>
> 1) Buchan has built the sasl & co against Heimdal for me, but I'm still
> having problems. It's better now, but I can still kill the server if I
> try hard enough. Probably good enough for "real life", but...
> (Possibly these packages aren't well done, or something is still missing,
> and this will solve itself in a few days.)

Impressive. ;) I've not been able to kill slapd even with 30+ systems querying non-stop for over 24 hours. I am using Solaris, though, so it may be an OS issue.

>> Have a remote process that queries the server (from Nagios), that pages
>> us if slapd stops responding.

> Nagios?
>
> This was the one that actually bothered me, because I wasn't sure how to
> ensure that the query returns (either a result or a failure) in reasonable
> time. In the meantime, I looked at Perl's Net::LDAP, and saw that one can
> define the timeout there. Good enough for me.

You can set timeouts in Nagios too. Most, if not all, of the plugins have a -t parameter that specifies the timeout after which the check is marked as a failure. The default is 10 seconds. One solution is to enable the monitor backend, and have Nagios query data out of that.
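
To illustrate the timeout idea, here is a minimal sketch of a Nagios-style check in Python. It is an illustration only, not the setup described in this thread. It runs an arbitrary probe command, gives up after a timeout (10 seconds by default, to mirror the plugin default mentioned above), and maps the outcome to the conventional Nagios plugin exit codes. The function name check_command, the EXAMPLE_PROBE command line, and the host ldap.example.com are assumptions made for the example.

```python
import subprocess

# Conventional Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3

# Hypothetical probe: read the root DSE of a made-up server with ldapsearch.
# It is only defined here, never executed by this module.
EXAMPLE_PROBE = ["ldapsearch", "-x", "-H", "ldap://ldap.example.com",
                 "-s", "base", "-b", "", "(objectClass=*)"]


def check_command(cmd, timeout=10.0):
    """Run cmd and return (exit_code, message) in Nagios plugin style.

    The probe counts as failed if it exits non-zero or does not finish
    within `timeout` seconds; subprocess.run kills it in the latter case.
    """
    try:
        proc = subprocess.run(cmd,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return CRITICAL, "CRITICAL: no answer within %.1f s" % timeout
    except OSError as exc:
        # The probe binary is missing or cannot be executed.
        return UNKNOWN, "UNKNOWN: cannot run probe: %s" % exc
    if proc.returncode != 0:
        return CRITICAL, "CRITICAL: probe exited with status %d" % proc.returncode
    return OK, "OK: probe answered in time"
```

A wrapper script would call check_command(EXAMPLE_PROBE), print the message, and exit with the returned code, so that Nagios sees a hung slapd as a CRITICAL result rather than a check that never returns.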

> Any other stuff worth watching, i.e. likely to go wrong?

On the master, we monitor the size of slurpd, whether slurpd is creating reject files, and whether slurpd is running. We also monitor the size of slapd's replog file (if it grows too large, slurpd has locked up) and the size of slurpd's replog file (if it grows too large, some replica is not responding to changes).

The downside of the slurpd monitoring is that slurpd creates reject files way too often. The rest of the monitoring bits are good, though.
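
The file-based checks above could be sketched roughly as follows in Python. This is a sketch under stated assumptions, not the actual monitoring code: the function name replication_problems, the size limit, and the idea that reject files can be recognised by a ".rej" suffix in a single directory are all assumptions for illustration. Checking whether slurpd itself is running is left out.

```python
import os


def replication_problems(slapd_replog, slurpd_replog, reject_dir, max_bytes):
    """Return a list of problem descriptions; an empty list means all clear."""
    problems = []
    checks = (
        (slapd_replog, "slurpd may have locked up"),
        (slurpd_replog, "a replica may not be responding to changes"),
    )
    for path, meaning in checks:
        try:
            size = os.path.getsize(path)
        except OSError:
            # Assumption: a missing or unreadable replog is treated as empty.
            continue
        if size > max_bytes:
            problems.append("%s is %d bytes (limit %d): %s"
                            % (path, size, max_bytes, meaning))
    try:
        names = os.listdir(reject_dir)
    except OSError:
        names = []
    # Assumption: reject files are the entries ending in ".rej".
    rejects = sorted(name for name in names if name.endswith(".rej"))
    if rejects:
        problems.append("reject files present: " + ", ".join(rejects))
    return problems
```

Since reject files turn up so often, a real check would probably report them as a warning and reserve paging for the replog size limits.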

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/TSS/Computing Systems
ITSS/TSS/Infrastructure Operations
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html
