
Re: CSN of delete operations



Rein Tollevik wrote:
The bdb/hdb and ldif backends assign CSNs to delete operations that
lack one, which causes problems in forwarding replication
configurations.  During the refresh phase there may be legitimate delete
operations that should not have any CSN.  When the forwarder adds its
own CSN it might leave the forwarder and its consumers with a CSN set that
includes a SID not present on the provider, and they will never be able
to resync.
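
To make the failure mode concrete, here is a minimal sketch in C. It is a simplified model, not slapd code: the struct csn_entry type and both function names are hypothetical, and a CSN set is reduced to one CSN string per server ID (SID). The check reports the first SID that the forwarder's set contains but the provider's set does not, which is the condition described above.

```c
#include <stddef.h>

/* Hypothetical simplified model: one CSN string per server ID (SID). */
struct csn_entry {
    int sid;
    const char *csn;
};

/* Return the CSN recorded for sid, or NULL if the set has no such SID. */
static const char *csn_for_sid(const struct csn_entry *set, int n, int sid)
{
    for (int i = 0; i < n; i++) {
        if (set[i].sid == sid)
            return set[i].csn;
    }
    return NULL;
}

/*
 * Return the first SID that appears in the consumer's set but not in the
 * provider's set, or -1 if every consumer SID is known to the provider.
 */
static int first_unknown_sid(const struct csn_entry *consumer, int nc,
                             const struct csn_entry *provider, int np)
{
    for (int i = 0; i < nc; i++) {
        if (csn_for_sid(provider, np, consumer[i].sid) == NULL)
            return consumer[i].sid;
    }
    return -1;
}
```

If the provider's set holds only SID 1 and the forwarder has stamped a delete with its own SID 2, first_unknown_sid() returns 2: the provider can never supply a CSN for that SID.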

OK, sounds like that should be fixed.

syncrepl_del_nonpresent() queues the minimum CSN received from the
provider, which partly obscures this problem but in return introduces
others :-(  The CSN set received may include updates to more than one
CSN, and only one of these can be added to the queue.  Much worse, the
first delete will commit the queued CSN.  If more than one
entry should be deleted, this leaves an open window where the
forwarder (and its consumers) have an apparently up-to-date CSN set
without actually being in sync with the provider.  Running the new
test061 with sync debugging shows traces of these problems in the logs.
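
The window can be illustrated with a small simulation in C. It is only a sketch of the sequence described above, not the syncrepl code: the function name and its flag are hypothetical. It counts the intermediate states in which the CSN has already been committed while deletes are still outstanding.

```c
/*
 * Count the states in which the CSN is already committed although
 * deletes are still pending.
 *
 * ndeletes:               number of non-present entries to delete
 * commit_on_first_delete: nonzero models the behaviour described above,
 *                         zero models deferring the CSN update until
 *                         every delete has completed
 */
static int count_inconsistent_states(int ndeletes, int commit_on_first_delete)
{
    int pending = ndeletes;
    int committed = 0;
    int inconsistent = 0;

    while (pending > 0) {
        pending--;                  /* one delete completes */
        if (commit_on_first_delete)
            committed = 1;          /* queued CSN committed by this delete */
        if (committed && pending > 0)
            inconsistent++;         /* CSN looks current, entries remain */
    }
    /* In the deferred model the CSN is committed only here, after the loop. */
    return inconsistent;
}
```

With three entries to delete, committing on the first delete yields two such states; deferring the commit yields none.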

In back-bdb/delete.c, the CSN of the delete operation appears to be added
as a value in the entryCSN index, which really puzzles me.  If that
index is to be modified I would expect it to delete the
entryCSN value of the entry being deleted, not add anything.  Why
this is only done in non-shadowed databases I cannot tell either.

bdb_index_entry_del is already invoked to remove all appropriate index values. IIRC, this particular patch was done to accommodate the entryCSN>=foo search that syncprov performs. Probably this only matters if a Delete was the last operation on a DB just before shutdown. Hard to say if it's still relevant.

I would fix these problems by assigning the CSN of delete operations in
the frontend, i.e. on the server where the ordinary delete operations were
done.  syncrepl_del_nonpresent() should not queue the CSN; updating it
should be left to the syncrepl_updateCookie() call which takes place
when the refresh phase completes.  But what to do about the index
manipulation I cannot tell. Anyone?
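
The first half of that proposal could look roughly like the following sketch in C. It assumes a much-reduced operation structure: struct del_op, its fields, and frontend_prepare_delete() are hypothetical and do not correspond to slapd's Operation type. The point is only that a CSN is stamped where the delete originates, and that a delete arriving through replication without a CSN is left without one.

```c
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for a delete operation. */
struct del_op {
    int is_replicated;   /* nonzero if the delete arrived via replication */
    const char *csn;     /* NULL if the operation carries no CSN */
};

/*
 * Frontend step: stamp a CSN only on deletes that originate locally.
 * Replicated deletes are passed through unchanged, so a refresh-phase
 * delete without a CSN stays without one.
 */
static void frontend_prepare_delete(struct del_op *op, const char *fresh_csn)
{
    if (!op->is_replicated && op->csn == NULL)
        op->csn = fresh_csn;
}
```

In this model the backends would no longer generate a CSN of their own for deletes that lack one.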

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/