Re: (ITS#5488) syncrepl received contextCSN not passed on to syncprov consumers



On Wed, 30 Apr 2008, Howard Chu wrote:

> rein@OpenLDAP.org wrote:

>> When syncrepl and syncprov are both used on a glue database, the
>> contextCSN values received from the syncrepl producers are not passed on
>> to the syncprov consumers when changes in subordinate databases are
>> received.
>> The reason is that syncrepl queues the CSNs in the glue backend, while
>> syncprov fetches them from the backend where the changes are made.  As a
>> consequence, the consumers will be passed a cookie without any csn
>> value.
>>
>> My first attempt at fixing this was to change syncprov to fetch the
>> queued csn values from the glue backend where it was used.  But that
>> failed as other modules queue the csn values in their own backend when
>> they change things.
>
> What other modules? Generally there cannot be any other sources of changes.

Sorry, I should have written other configurations.  The CSNs get queued
in the subordinate database when syncrepl is used there, or not at all
(i.e. in regular updates that come in through the frontend).

>> Instead I changed ctxcsn.c so that it always
>> queues them in the glue backend where syncprov is used.  But I don't
>> feel that my understanding of this stuff is good enough to be sure that
>> this is the optimal solution.
>
> I definitely don't like references to the syncprov overlay appearing in main
> slapd code like that. We need a different solution.

That's reasonable, but the test for syncrepl is probably not needed if
this solution is to be kept.  The test was more or less a copy and
paste from syncrepl where it finds out which backend to write through.
To me it makes sense to have a single queue of CSN values in a glued
configuration, no matter if or where syncprov is used.

> At one point in the past, I had changed syncrepl.c to queue the CSNs in
> both places, but that seemed rather sloppy. Still, it may work best here.

I don't like duplicating information; sooner or later it tends to end up
with wrong info in one of the places.

Another approach could be to have syncprov look in the glue database if
it fails to find any queued CSN in a subordinate db.  I haven't tested
it, but that should work in both configurations.  It should also remove
the need to always look for the glue db which my patch requires.  Would
that be better?
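
Roughly what I have in mind, as an untested toy model.  The struct and
the function name below are made up for illustration and are not the
real slapd BackendDB or any existing syncprov code; the only thing it
is meant to show is the lookup order (subordinate db first, glue db as
the fallback):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a slapd database; NOT the real BackendDB. */
struct toy_db {
    const char *name;
    const char *queued_csn;   /* NULL when no CSN is queued in this db */
    struct toy_db *glue;      /* superior glue db; NULL for the glue db itself */
};

/* Return the CSN queued for db.  If db has none and it is a subordinate,
 * fall back to whatever is queued in its glue db.  Returns NULL when
 * nothing is queued in either place. */
const char *toy_find_queued_csn(const struct toy_db *db)
{
    assert(db != NULL);

    if (db->queued_csn != NULL)
        return db->queued_csn;        /* CSN queued where the change was made */

    if (db->glue != NULL)
        return db->glue->queued_csn;  /* fallback: CSN queued in the glue db */

    return NULL;                      /* nothing queued anywhere */
}
```

With this order a subordinate db that has its own queued CSN never
touches the glue db, so the extra lookup only happens in the failing
case.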

>> Btw, in syncprov_checkpoint() there is a similar SLAP_GLUE_SUBORDINATE
>> test.  Should that have included an overlay_is_inst() clause as well?
>
> Perhaps. You would have to use op->o_bd->bd_self instead of op->o_bd on
> that call.

The current test (introduced to fix ITS#5433) causes the contextCSN to
be written to the glue database when syncprov is used on a subordinate
db, which appears wrong to me.

Could you elaborate on when op->o_bd->bd_self must be used instead of
op->o_bd?  I understand that op->o_bd may be a copy of the original
structure that op->o_bd->bd_self refers to, but I'm not sure when it
must be used.  Btw, could op->o_bd->bd_self->bd_info be used to fetch
the BackendInfo that can be used to call the top-most bd_search (and
similar) also in overlays?
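
To make the question concrete, here is my mental model as a toy
example.  Only the field names bd_info and bd_self are taken from
slapd; the types and the copy function are hypothetical, and the
assumption that the copy runs with a different bd_info while the
original keeps the top-most one is exactly what I would like to have
confirmed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature types.  The field names mirror slapd's bd_info
 * and bd_self, but nothing else here is the real definition. */
struct toy_info {
    const char *type;
};

struct toy_backend {
    struct toy_info *bd_info;       /* handlers used when calling through this db */
    struct toy_backend *bd_self;    /* assumed to always point at the original */
};

/* Make a per-operation copy that runs with a different bd_info, which is
 * what I assume happens when an operation passes through an overlay. */
struct toy_backend toy_local_copy(const struct toy_backend *orig,
                                  struct toy_info *inner)
{
    struct toy_backend copy;

    assert(orig != NULL && orig->bd_self != NULL);

    copy = *orig;            /* bd_self is copied too, so it still names orig */
    copy.bd_info = inner;    /* only the copy sees the inner bd_info */
    return copy;
}
```

If that model is right, copy.bd_self->bd_info would still be the
top-most BackendInfo even though copy.bd_info is not.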

Rein