Re: (ITS#5171) hdb txn_checkpoint failures



Aaron Richton wrote:
> With that said, "some admin specifically disturbed the log files around 
> that time." Logs show that I was the only person in a position to do so 
> (unless somebody broke in and covered their tracks; we'll ignore that 
> theoretical possibility). On September 24, I reconfigured the slaves to 
> use a different IP address to the master instead of the existing 
> connection. The times are too coincidental to be unrelated:
> 
> (slave4) reconfigured Sep 24 09:41 (first syslog complaint 09:43)
> (slave6) reconfigured Sep 24 09:39 (first syslog complaint 09:44)
> 
> 
> So...is there something that's cued off the (reverse?) name service 
> entries for the master? Does the master IP hash in to a CSN somehow? And 
> if this is indeed the case/root cause...well, quite honestly, I think that 
> assuming a name service database will remain constant throughout a slapd 
> instance is a fallacy. Furthermore, if this is indeed the case, it should 
> be absolutely trivial for me to reproduce this (I can perform a DR on 
> slave4/6, and reconfigure their network again).

No. The BDB transaction log files don't know (or care) anything about IP 
addresses. Nothing at the slapd layer could have any direct effect on the BDB 
transaction logs. How exactly did you reconfigure the servers? Did you stop 
and restart them, or did you use cn=config?
> 
> With that in mind, I'll likely test this reproduction early next week. I 
> can still get db_stat from all slaves (working and not) at this point if 
> that's interesting. Comments?
> 
Might as well get the db_stat -l output for a few of them to compare.
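
As a rough sketch of one way to do that comparison: capture each slave's
output to a file (for example with db_stat -h <database directory> -l, where
the directory is whatever that slave's hdb database uses) and diff a working
slave against a failing one. The helper below is illustrative only. Its name
and labels are made up, and it treats the captured output as opaque text
rather than assuming any particular db_stat layout.

```python
import difflib


def diff_stat_output(good, bad, good_name="working", bad_name="failing"):
    """Return unified-diff lines between two captured db_stat outputs.

    Both arguments are plain strings (e.g. the contents of two files saved
    from db_stat -l on different slaves). Only lines that differ are shown.
    """
    return list(
        difflib.unified_diff(
            good.splitlines(),
            bad.splitlines(),
            fromfile=good_name,
            tofile=bad_name,
            lineterm="",
            n=0,  # no context lines: show only the fields that differ
        )
    )
```

Usage would be along the lines of reading the two saved files and printing
"\n".join(diff_stat_output(text_a, text_b)); an empty result means the two
captures are identical.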

-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/