
RE: Testing the state of replicates



So it seems easy to do this monitoring via some external agent/program.
Can I do something (short of writing an overlay) to get this information
with an LDAP query?  That is, a query that would give me the difference
between the current contextCSN of the machine I'm talking to and that of
the master server.  AFAICT, the existing overlays won't let me create this
kind of synthesized value.
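
To illustrate the external-agent approach: once the contextCSN has been read from both servers, the difference is just a subtraction of the leading timestamps. The following is a minimal Python sketch, not anything slapd provides. The function names and sample values are hypothetical, and it assumes the first 14 characters of a CSN are a UTC yyyymmddhhmmss timestamp.

```python
from datetime import datetime, timezone


def csn_time(csn: str) -> datetime:
    """Return the timestamp part of a CSN as an aware UTC datetime.

    Assumption: the CSN starts with 14 digits (yyyymmddhhmmss); anything
    after them (fractional seconds, 'Z', '#...' fields) is ignored here.
    """
    stamp = csn.strip()[:14]
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def csn_lag_seconds(master_csn: str, replica_csn: str) -> float:
    """Seconds by which the replica's contextCSN trails the master's.

    A positive result means the replica is behind; zero means the
    timestamps match to one-second resolution.
    """
    return (csn_time(master_csn) - csn_time(replica_csn)).total_seconds()


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    master = "20080305082830Z#000000#00#000000"
    replica = "20080305082800Z#000001#00#000000"
    print(csn_lag_seconds(master, replica))
```

A monitoring script could alert when the lag stays above some threshold across several polls, since a momentary difference is expected while a write is still propagating.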

Alternatively, I think I'd be happy with a query to tell me if the
server thinks it is having trouble talking to the master.
Thanks for everything.
Roy

-----Original Message-----
From: Gavin Henry [mailto:ghenry@suretecsystems.com] 
Sent: Wednesday, March 05, 2008 3:28 AM
To: Aaron Richton
Cc: Marantz, Roy; openldap-software@openldap.org
Subject: RE: Testing the state of replicates

<quote who="Aaron Richton">
> [Gavin says]
>> Dig the main source. servers/slapd/syncrepl.c and
>> servers/slapd/overlays/syncprov.c
>
> Hmm, wrong source files. Try libraries/liblutil/csn.c, which sayeth:
>
>   * These routines are (loosly) based upon draft-ietf-ldup-model-03.txt,
>   * A WORK IN PROGRESS.  The format will likely change.
>   *
>   * The format of a CSN string is: yyyymmddhhmmssz#s#r#c
>   * where s is a counter of operations within a timeslice, r is
>   * the replica id (normally zero), and c is a counter of
>   * modifications within this operation.  s, r, and c are
>   * represented in hex and zero padded to lengths of 6, 3, and
>   * 6, respectively. (In previous implementations r was only 2 digits.)
>

Ah, many thanks.
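
For anyone scripting against this, the format described in that comment can be split into its fields along these lines. This is only a sketch in Python: the `CSN` type and `parse_csn` name are hypothetical, the sample value is made up, and the regex tolerates an optional fractional-seconds part as an assumption, since the comment itself warns that the format may change.

```python
import re
from typing import NamedTuple


class CSN(NamedTuple):
    timestamp: str   # e.g. "20080305082800Z", kept as text
    count: int       # s: counter of operations within a timeslice
    replica_id: int  # r: replica id (normally zero)
    mod: int         # c: counter of modifications within the operation


# timestamp#s#r#c, with s, r and c in hex. Field widths are not enforced
# because the comment notes that r was 2 digits in earlier implementations.
_CSN_RE = re.compile(
    r"^(\d{14}(?:\.\d+)?Z)#([0-9a-fA-F]+)#([0-9a-fA-F]+)#([0-9a-fA-F]+)$"
)


def parse_csn(text: str) -> CSN:
    """Split a CSN string into its four fields, decoding the hex counters."""
    match = _CSN_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognisable CSN: {text!r}")
    stamp, s, r, c = match.groups()
    return CSN(stamp, int(s, 16), int(r, 16), int(c, 16))


if __name__ == "__main__":
    # Made-up sample value, for illustration only.
    print(parse_csn("20080305082800Z#00000a#000#000001"))
```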

>
> We use
> http://www.openldap.org/lists/openldap-software/200602/msg00158.html,
> maybe with a small mod or two (I forget), to check that contextCSN
> isn't wedged. This only works when the syncrepl thread is completely
> borked. A better check would be something along the lines of the
> Net::LDAP ldifdiff to make sure that nothing's different. Of course
> this has race condition issues (not that we make writes all that
> often, but on paper at least). If anybody has something like that as
> a monitoring plugin, you'd erase one line off my perpetual todo
> list...

;-) Plugin for what?

>
> (Yes, that would be of great interest to me. ~93% of syncrepl bugs
> we've seen involve very very very slight errors that only result in
> an entry or two being wrong. contextCSN being wrong...we pretty much
> only see that in the field when tcp keepalives fail to indicate the
> need for a reconnection.)
>

So the entryCSN would be wrong?