Re: (ITS#5171) hdb txn_checkpoint failures

> It's still rather suspicious that slave4 and slave6 both had identical log 
> status for base1 (1/188113) but different requested locations (1/8730339 vs
> 1/8730401). If they're identically configured slaves then they ought to be in 
> lock-step. Then again, obviously they're not identical since slave6 doesn't 
> show base4 in your log.

Identical is relative. They've got the same OpenLDAP and supporting 
binaries on the same patches of Solaris 9, brought up by identical 
turn-up scripts with identical configuration files. But this is 
production, so the data changes over time. For instance, the slaves 
bootstrap with a slapadd -q, and the underlying slapcat output could 
easily differ between slave4 and slave6 (the most recent one is 
automatically used). I'd imagine this would look different at the db 
layer, even once syncrepl eventually converged the logical data?

> Do you have the db_stat output from an uncorrupted slave? What about the 
> master?

Sure... https://www.nbcs.rutgers.edu/~richton/its5171_dbstatl2