Re: Testing for replication failures
| While the list is discussing replication, I'd like to bring up the issue of
| determining when replication has failed.
| Currently, I can only see there being one case I can monitor for: A master
| and slave get out of sync and so the master begins producing an error log of
| what entries it can't replicate (for example, if the master sends an 'add'
| but the slave already has that entry). I monitor this by examining if the
| error log file mtime has changed, and if so, emailing an error.
| Recently, however, I found that a mistake had been made and port 389/tcp
| on the slave had been firewalled. So the replog was growing and no
| replication was taking place. Is it possible for me to detect this type of
| error? I'd like to see a timeout error from slurpd in syslog like
| "[...] replicate for X hours." Alternatively, is there a file I can monitor that
| would indicate something is wrong?
| (Yes, monitoring that we can connect to port 389/tcp would solve *this*
| problem, but I'm more concerned with the general case.)
| Basically, I want to be able to easily answer at all times the question: "Is
| replication up and working properly?"
Look in slurpd.status
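
A minimal sketch of one way to watch for this from a cron job. It relies only on file modification times, on the assumption that slurpd rewrites slurpd.status after each successful replication while the replog keeps being written as changes arrive. The function names, the paths and the threshold are hypothetical and not part of OpenLDAP; adjust them to your own layout.

```python
import os


def replication_lag_seconds(replog_path, status_path):
    """Return how many seconds newer the replog's last write is than the
    status file's last write (negative if the status file is newer)."""
    return os.path.getmtime(replog_path) - os.path.getmtime(status_path)


def replication_stalled(replog_path, status_path, max_lag_seconds):
    """Heuristic: changes are still being logged, but the status file has
    not been rewritten for more than max_lag_seconds since the last one."""
    return replication_lag_seconds(replog_path, status_path) > max_lag_seconds
```

Called as, say, replication_stalled("/path/to/replog", "/path/to/slurpd.status", 2 * 3600), this would flag the firewalled-slave case once the replog has gone on changing for two hours without the status file moving. It is only a heuristic: a single pending change followed by no further writes would not trip it.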
Buchan Milne Senior Support Technician
Obsidian Systems http://www.obsidian.co.za
B.Eng RHCE (803004789010797)