
(ITS#5480) Two object deletions missed in sync replication after daemons restarted



Full_Name: Marian Eichholz
Version: HEAD-20080411
OS: linux 2.6.23.13
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (194.97.7.65)


As in ITS#5371, we again missed two object deletions, approximately 4 minutes
after provider and consumer were restarted (after being stopped for a slapcat).
This is unfortunate, because the CVS source from 20080411 is at least far more
robust in terms of connection management than the openldap-2.4.8 release (see
ITS#5463).

We now had full debug logging on (any), and as far as I can see there are no
obvious rejections or any other reason why these deletions were not replicated
to the two consumers. There were other deletions before and after them in the
same second, and I cannot see structural anomalies in the residual objects.

The consumer databases were cloned from the provider after the incident that
led to ITS#5371. When we checked the consistency on Friday morning, the
databases were identical, measured by the set of "dn:" records in the dump. Of
course we first stopped the provider, then the consumers. When we (re)start
the LDAP services, we first start the provider, then the consumers.
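
For reference, the consistency check described above can be sketched roughly as follows. This is only an illustration, not the exact script we ran: the function names and the file names in the usage note are made up, and it assumes the dumps are plain LDIF as written by slapcat. DNs are compared as exact strings, so differences in case or spacing would show up as mismatches.

```python
import base64
import sys


def unfold(lines):
    """Join LDIF continuation lines (lines starting with a single space)."""
    logical = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(" ") and logical:
            logical[-1] += line[1:]
        else:
            logical.append(line)
    return logical


def dn_set(lines):
    """Return the set of DNs found in an iterable of LDIF lines."""
    dns = set()
    for line in unfold(lines):
        if line.startswith("dn::"):
            # "dn::" marks a base64-encoded DN
            dns.add(base64.b64decode(line[4:].strip()).decode("utf-8"))
        elif line.startswith("dn:"):
            dns.add(line[3:].strip())
    return dns


def compare_dumps(provider_lines, consumer_lines):
    """Return (DNs missing on the consumer, DNs only on the consumer)."""
    provider = dn_set(provider_lines)
    consumer = dn_set(consumer_lines)
    return provider - consumer, consumer - provider


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as p, \
         open(sys.argv[2], encoding="utf-8") as c:
        missing, extra = compare_dumps(p, c)
    for dn in sorted(missing):
        print("missing on consumer:", dn)
    for dn in sorted(extra):
        print("only on consumer:", dn)
```

Saved under a hypothetical name such as compare_dns.py, it would be run as "python compare_dns.py provider.ldif consumer.ldif". A deletion that was not replicated shows up as an "only on consumer" entry.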

Any idea how we can debug the problem or get a reliable release?

If needed and useful, I can provide an excerpt (heavily edited, for privacy
reasons) from the debug log of the given second.

Thank you in advance! - Marian Eichholz