
test007-replication failure (ITS#2623)



Full_Name: Dan Riley
Version: HEAD/REL_ENG_2_2
OS: Tru64 4.0f
URL: 
Submission from: (NULL) (128.84.46.236)


Replog ordering problems seem to be dramatically worse in HEAD. Replog mostly
works in 2.1.22, but in HEAD and REL_ENG_2_2 on Tru64 4.0f I'm seeing failures
of test007-replication, apparently due to out-of-order timestamps in the replog:

dsr_lnscu8% fgrep time: slurpd.replog
time: 1056995045
time: 1056995045
time: 1056995045
time: 1056995045
time: 1056995046
time: 1056995045
time: 1056995046
time: 1056995046
time: 1056995047
time: 1056995047
time: 1056995047
time: 1056995047
time: 1056995047
time: 1056995047
time: 1056995048
time: 1056995047
time: 1056995048
time: 1056995047
time: 1056995048
time: 1056995063
time: 1056995063
time: 1056995063
time: 1056995063
time: 1056995063
time: 1056995063
time: 1056995064
time: 1056995064
time: 1056995064

(perhaps from increased concurrency in the backend?), which result in slurpd
incorrectly considering some entries to be old:

Replica localhost:9010, skip repl record for ou=Information Technology Division,
ou=People,o=University of Michigan,c=US (old)
Replica localhost:9010, skip repl record for cn=Alumni Assoc
Staff,ou=Groups,o=University of Michigan,c=US (old)
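
A minimal sketch of the failure mode described above, not slurpd's actual code.
It assumes the replica-side check simply remembers the timestamp of the last
record it accepted and treats any record with a smaller timestamp as old. The
function name skipped_records is hypothetical, and records with equal
timestamps are treated as new here for simplicity.

```python
def skipped_records(timestamps):
    """Return the indexes of records a 'last timestamp seen' filter would skip.

    Assumption: a record is considered old when its timestamp is strictly
    smaller than the timestamp of the last record that was accepted.
    """
    last_applied = None
    skipped = []
    for index, stamp in enumerate(timestamps):
        if last_applied is not None and stamp < last_applied:
            # The record arrived after a newer one, so it looks old.
            skipped.append(index)
            continue
        last_applied = stamp
    return skipped
```

Fed the first eight timestamps from the replog listing above, this filter would
drop the sixth record (1056995045 arriving after 1056995046), even though that
record was never replicated.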

Is this a bug in slapd, or a feature?  If a feature, I'd hazard it might be
possible to work around this somewhat by having Rq_add in slurpd keep the queue
sorted; if this seems plausible, I could try patching this.  If the official
position is that the fix is syncrepl, I'd argue that slurpd is now sufficiently
broken that test007-replication ought to be removed and slurpd not built by
default in 2.2.
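
The sorted-queue idea can be sketched as follows. This is an illustration in
Python rather than a patch to slurpd's C code: the class SortedReplQueue and
its methods are hypothetical, and only the idea of inserting each record at
its timestamp position comes from the suggestion above. An arrival counter
keeps records with the same timestamp in the order they were added.

```python
import bisect
import itertools


class SortedReplQueue:
    """Queue that keeps replication records ordered by timestamp."""

    def __init__(self):
        self._entries = []
        # Monotonic counter: breaks ties between equal timestamps so that
        # records with the same time keep their arrival order.
        self._arrival = itertools.count()

    def add(self, timestamp, dn):
        """Insert a record at its sorted position instead of appending."""
        bisect.insort(self._entries, (timestamp, next(self._arrival), dn))

    def drain(self):
        """Remove and return all queued records as (timestamp, dn) pairs."""
        entries, self._entries = self._entries, []
        return [(timestamp, dn) for timestamp, _, dn in entries]
```

Sorting at insert time only reorders records that are in the queue together; a
late record that shows up after newer ones have already been drained would
still look old, which is why this could only work around the problem somewhat.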