
Re: Testing for replication failures





--On Thursday, May 20, 2004 10:43 AM -0400 Frank Swasey <Frank.Swasey@uvm.edu> wrote:

Look in slurpd.status

Specifically, compare the timestamps you will find there to the newest timestamp listed in the slurpd.replica file in the same directory. That's what I'm doing (every 10 minutes).

That only works if slurpd itself hasn't locked up. If slurpd locks up (and I've had this happen), then the status file stays the same and the slurpd replog file stays the same, so a timestamp comparison has nothing new to flag.


My solution was to monitor the size of the slapd replog and the slurpd replog. If either gets particularly large (for some value of large you want to define), then I send out an alert. I've found that to be quite effective.
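
For anyone who wants to try the size check, here is a minimal sketch in Python. It is not the script I actually run: the function name, the file paths, and the 1 MB threshold are all placeholders I made up for illustration, so substitute the replog locations from your own slapd.conf and whatever "large" means at your site.

```python
import os


def oversized_replogs(paths, max_bytes):
    """Return a list of (path, size) for every file at or above max_bytes."""
    oversized = []
    for path in paths:
        try:
            size = os.path.getsize(path)
        except OSError:
            # Assumption: a missing or unreadable log is treated as empty.
            size = 0
        if size >= max_bytes:
            oversized.append((path, size))
    return oversized


if __name__ == "__main__":
    # Hypothetical paths and threshold; adjust for your installation.
    replogs = [
        "/var/lib/ldap/replog",
        "/var/lib/ldap/replica/slurpd.replog",
    ]
    for path, size in oversized_replogs(replogs, 1024 * 1024):
        print(f"ALERT: {path} has grown to {size} bytes")
```

Run it from cron at whatever interval suits you and feed the output to your alerting of choice. Because it looks only at file sizes, it does not depend on slurpd still being responsive.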

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/TSS/Computing Systems
ITSS/TSS/Infrastructure Operations
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html