Re: Antwort: Re: openldap health monitoring [Virus checked]
--On Monday, March 22, 2004 5:18 PM +0100 email@example.com wrote:
I'll note that all of your 3 problems were caused by you using MIT
Kerberos instead of Heimdal. Stanford has been using OpenLDAP in
I presume you are right. However:
1) Buchan has built the sasl & co against Heimdal for me, but I'm still
having problems. It's better now, but I can still kill the server if I
try hard enough. Probably good enough for the "real life", but...
(Possibly these packs aren't well done, or something is still missing,
and this will solve itself in a few days.)
Impressive. ;) I've not been able to kill slapd even with 30+ systems
querying non-stop for over 24 hours. Although I am using Solaris, so it
may be an OS issue.
Have a remote process that queries the server (from Nagios), that pages
us if slapd stops responding.
This was the one that actually bothered me, because I wasn't sure how to
ensure that the query returns (either a result or a failure) in a
reasonable time. In the meantime, I looked at Perl's Net::LDAP, and saw
that one can define the timeout there. Good enough for me.
You can set timeouts in Nagios too. Most, if not all, of the plugins have a
-t parameter that specifies the timeout after which to mark a failure. The
default is 10 seconds. One solution is to enable the monitor backend, and
have the check query data out of that.
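
As a rough illustration of the timeout idea above, here is a minimal Python sketch of a check that always returns within a bounded time. It is not a real Nagios plugin and not what either poster runs. The names run_probe and main, the host ldap.example.com, and the choice of ldapsearch as the probe command are assumptions made for the example. Only the exit codes (0 for OK, 2 for CRITICAL) follow the usual Nagios plugin convention, and the 10 second default mirrors the plugin default mentioned above.

```python
import subprocess
import sys

# Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def run_probe(command, timeout_seconds=10.0):
    """Run a probe command and return (status, message) in bounded time.

    subprocess.run() kills the child process if it is still running when
    the timeout expires, so a hung server cannot hang the check.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return CRITICAL, f"probe timed out after {timeout_seconds}s"
    except OSError as exc:
        # For example, the probe binary is not installed.
        return CRITICAL, f"probe could not be started: {exc}"
    if result.returncode != 0:
        return CRITICAL, f"probe failed with exit code {result.returncode}"
    return OK, "probe succeeded"


def main():
    # Hypothetical probe: anonymous base-scope search of the root DSE.
    # The host name is a placeholder, not taken from this thread.
    status, message = run_probe(
        ["ldapsearch", "-x", "-H", "ldap://ldap.example.com",
         "-b", "", "-s", "base"]
    )
    print(message)
    sys.exit(status)
```

Calling main() from a small script would give the monitoring system an exit code to act on. Pointing the search base at the monitor backend instead of the root DSE would be one way to follow the suggestion above.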
Any other stuff worth watching, i.e. likely to go wrong?
We monitor the size of slurpd, if slurpd is creating reject files, and if
slurpd is running, on the master. We also monitor the size of slapd's
replog file (if it grows too large, it means slurpd has locked up), and we
monitor the size of slurpd's replog file (if it grows too large, it means
some replica is not responding to changes).
The downside of the slurpd monitoring is that slurpd creates reject files
way too often. The rest of the monitoring bits are good though.
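
The file-based checks described above could be sketched roughly as follows in Python. This is an illustration only, not the scripts actually in use. The function names, the "*.rej" pattern for reject files, and the byte threshold are assumptions that would need adjusting to the local slurpd setup. The status codes follow the Nagios plugin convention.

```python
import glob
import os

# Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def check_file_size(path, max_bytes):
    """Return (status, message) for a replog file that must stay small."""
    try:
        size = os.path.getsize(path)
    except OSError as exc:
        # A missing or unreadable file is reported as UNKNOWN here;
        # whether that is the right call depends on the setup.
        return UNKNOWN, f"cannot stat {path}: {exc}"
    if size > max_bytes:
        return CRITICAL, f"{path} is {size} bytes (limit {max_bytes})"
    return OK, f"{path} is {size} bytes"


def find_reject_files(directory, pattern="*.rej"):
    """Return a sorted list of reject files found in the directory.

    The "*.rej" pattern is an assumption about how reject files are named.
    """
    return sorted(glob.glob(os.path.join(directory, pattern)))
```

A growing slapd replog would then trip the first check on the master, and a non-empty list from the second would flag rejected changes, with the caveat already noted that reject files turn up often.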
Principal Software Developer
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html