Re: syncrepl with accesslog database

On Thu, Apr 03, 2008 at 01:53:17PM -0700, Howard Chu wrote:
> John Morrissey wrote:
> >On Thu, Apr 03, 2008 at 12:10:49AM -0700, Howard Chu wrote:
> > >Yes, it seems a bit silly to explicitly replicate it; that means the
> > >producer has to send its contents twice to each delta-sync consumer.
> > >Just configure an accesslog overlay on each consumer and let them
> > >regenerate the log locally.
> >
> > Won't configuring a separate accesslog on the consumers result in
> > slightly different accesslogs (relative to each other and to the
> > provider) being generated, since there is inherent latency between the
> > writes on the provider and the writes on the slaves? Also, won't the
> > reqAuthzID in the consumers' accesslogs be different, since the syncrepl
> > engine uses the rootdn internally to make the changes to consumers'
> > databases?
> There is write latency no matter what. If you interrupt things before an 
> operation can be logged, then of course it will be different, and explicit 
> replication won't change that fact.

I'm not contesting that. The latency I'm referring to is the gap between a
write on the provider and the same write on a consumer: if the consumers
generate their own access logs, the relevant timestamp(s) in those logs will
differ from the provider's.

> Yes, the reqAuthzID will be different. But the replication mechanisms don't 
> care about that; the modifiersName opattr is recorded and replicated 
> explicitly. All of the other attributes will be identical.

Of course, delta-syncrepl will be just fine if one of the
independently-accesslogging consumers is promoted to provider.

But what if someone wants to use the accesslog data for something else,
something that *does* depend on reqAuthzID? Granted, it would take a perfect
storm: a consumer is promoted *and* someone uses the accesslog for something
other than delta-syncrepl *and* that other use cares about the differences.
Still, a discrepancy like this between the two databases could easily go
unnoticed.

I'd like to avoid that situation entirely if it's reasonably easy to do so,
and replicating the accesslog database from the master seems
straightforward. It doesn't matter to me that the data are transferred
twice; the bandwidth consumed by syncrepl in this situation is so small as
to be easily dwarfed by all manner of other traffic on our networks.
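
A rough sketch of that arrangement on a consumer, in slapd.conf terms, might
look like the following. The hostname, DNs, credentials, directories, backend
type, and rid values are placeholders chosen for illustration, not a tested
configuration. Plain syncrepl of cn=accesslog also assumes the suffix entry
carries an entryUUID, which is the issue described below.

    # Accesslog database, replicated from the provider with plain syncrepl
    # (all names and paths here are hypothetical)
    database        hdb
    suffix          cn=accesslog
    directory       /var/lib/ldap/accesslog
    rootdn          cn=accesslog
    index           default eq
    index           entryCSN,objectClass,reqEnd,reqResult,reqStart
    syncrepl        rid=001
                    provider=ldap://provider.example.com
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=com"
                    credentials=secret
                    searchbase="cn=accesslog"
                    type=refreshAndPersist
                    retry="60 +"

    # Main database, kept current with delta-syncrepl against the
    # provider's accesslog
    database        hdb
    suffix          dc=example,dc=com
    directory       /var/lib/ldap/main
    rootdn          cn=admin,dc=example,dc=com
    syncrepl        rid=002
                    provider=ldap://provider.example.com
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=com"
                    credentials=secret
                    searchbase="dc=example,dc=com"
                    logbase="cn=accesslog"
                    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
                    syncdata=accesslog
                    type=refreshAndPersist
                    retry="60 +"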

That aside, the accesslog overlay does indeed not set an entryUUID on the
database it creates. Only the contextCSN is carried over (as the entryCSN),
if I'm reading accesslog_db_open() in servers/slapd/overlays/accesslog.c
correctly:

    /* Get contextCSN from main DB */
    op->o_bd = be;
    op->o_bd->bd_info = on->on_info->oi_orig;
    rc = be_entry_get_rw( op, be->be_nsuffix, NULL,
        slap_schema.si_ad_contextCSN, 0, &e_ctx );

    if ( e_ctx ) {
        Attribute *a;

        a = attr_find( e_ctx->e_attrs, slap_schema.si_ad_contextCSN );
        if ( a ) {
            attr_merge( e, slap_schema.si_ad_entryCSN, a->a_vals, NULL );
            attr_merge( e, a->a_desc, a->a_vals, NULL );
        }
        be_entry_release_rw( op, e_ctx, 0 );
    }
    op->o_bd->bd_info = (BackendInfo *)on;
    op->o_bd = li->li_db;

entryUUID is NO-USER-MODIFICATION, so I made creative use of the updatedn
directive on the master to add an entryUUID to the accesslog database. Our
consumers then started replicating that database on their own.

If it's the appropriate place, could accesslog_db_open() be modified to add
an entryUUID when creating its database? I don't see a reason this shouldn't
Just Work.

John Morrissey          _o            /\         ----  __o
jwm@horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__