Issue 6406 - back-ldif fails test043-delta-syncrepl
Summary: back-ldif fails test043-delta-syncrepl
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: documentation
Version: 2.4.20
Hardware: All All
Importance: --- normal
Target Milestone: 2.5.1
Assignee: Quanah Gibson-Mount
Reported: 2009-11-30 20:40 UTC by Hallvard Furuseth
Modified: 2021-02-08 17:51 UTC (History)
Description Hallvard Furuseth 2009-11-30 20:40:04 UTC
Full_Name: Hallvard B Furuseth
Version: 2.4.20
OS: Linux x86_64
URL: http://folk.uio.no/hbf/test043-ldif.tgz
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard


This often fails (producer and consumer databases differ):
    SLAPD_DEBUG=0x4105 ./run -b ldif test043-delta-syncrepl
when configured with --enable-accesslog.

The error has been the same several times:  server2's object
   cn=Some Staff,ou=Groups,dc=example,dc=com
has the extra attribute
   description: Everyone in the sample data

Test output + testrun/ URL enclosed
Comment 1 Hallvard Furuseth 2009-11-30 20:40:14 UTC
moved from Incoming to Software Bugs
Comment 2 Hallvard Furuseth 2009-12-05 17:11:36 UTC
I wrote:
> This often fails (producer and consumer databases differ):
>     SLAPD_DEBUG=0x4105 ./run -b ldif test043-delta-syncrepl
> when configured with --enable-accesslog.

Apparently syncprov sends changes in the order it gets them from an
accesslog search, so it needs accesslog to return change records ordered
by change CSN.  Otherwise the consumer discards changes, complaining:

do_syncrep2: rid=001 CSN too old, ignoring 20091130202958.564835Z#000000#000#000000
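
The effect can be pictured with a minimal Python sketch. This is not the
actual do_syncrep2 code: the Consumer class and its method names are
hypothetical, the "newer" CSN is made up for illustration, and the sketch
assumes CSNs are fixed-width strings so that plain string comparison
matches their order.

```python
class Consumer:
    """Toy model of a consumer that remembers the newest CSN it has applied."""

    def __init__(self):
        self.last_csn = ""   # empty string sorts before any real CSN
        self.applied = []
        self.ignored = []

    def receive(self, csn):
        # A change whose CSN is not newer than the one already seen is
        # dropped, which is what "CSN too old, ignoring" reports.
        if csn <= self.last_csn:
            self.ignored.append(csn)
            return False
        self.last_csn = csn
        self.applied.append(csn)
        return True


def replay(csns):
    """Feed a sequence of CSNs to a fresh consumer and return it."""
    consumer = Consumer()
    for csn in csns:
        consumer.receive(csn)
    return consumer


older = "20091130202958.564835Z#000000#000#000000"   # from the log line above
newer = "20091130202958.570000Z#000000#000#000000"   # invented for the example

in_order = replay([older, newer])       # both changes are applied
out_of_order = replay([newer, older])   # the older change is silently lost
```

With the records delivered out of CSN order, the older change never
reaches the consumer database, which would explain the differing entries.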

This works with hdb/bdb under normal operation:  They return entries
ordered by entry ID, which always increases as new entries are added.
It works out since something (syncrepl or accesslog) serializes writes
so change records are added to the log database in the same order as
the changes are applied to the main database.

With back-ndb, I imagine something like an SQL 'ORDER BY <reqStart or
entryCSN>' will do the trick, if it isn't already handled.

back-ldif does not support this: It returns changes memcmp-ordered
by normalized reqStart.  Normalization "<t>.210000Z" -> "<t>.21Z"
makes that value sort after "<t>.219999Z".
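
A small Python sketch of that sorting problem, under stated assumptions:
normalize_time is a hypothetical stand-in that only strips trailing zeros
from the fractional seconds (the rule implied by the example above, not
the real slapd normalizer), and Python's bytes comparison is used as a
stand-in for memcmp ordering.

```python
def normalize_time(value):
    """Strip trailing zeros from the fractional seconds (assumed rule)."""
    if not value.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    body = value[:-1]
    if "." in body:
        # Remove trailing zeros, and the dot itself if nothing is left.
        body = body.rstrip("0").rstrip(".")
    return body + "Z"


earlier = normalize_time("20091130202958.210000Z")
later = normalize_time("20091130202958.219999Z")

# bytes compare byte by byte: after the common prefix "...58.21",
# 'Z' (0x5A) is compared with '9' (0x39), so the earlier time sorts last.
by_bytes = sorted([earlier.encode(), later.encode()])
```

Here by_bytes lists the later timestamp first, so a backend that returns
log entries in this order hands the changes to syncprov backwards.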


Anyway, if I have understood this correctly, this requirement needs to
be documented, and it looks rather fragile anyway.  Only databases
supporting this ordering can be used for accesslog.  The admin must not
slapcat the accesslog, rearrange it (maybe he's looking for changes to
something specific), and then slapadd it back.  Not something which
would happen every day, but there's no reason people _wouldn't_ do it
either.  LDAP defines elements as unordered, after all.

What happens if a consumer does a full refresh from a provider without
accesslog, and is also running syncprov with accesslog?  Do these
changes end up in the consumer accesslog?  I don't suppose the refresh
can send entries by entryCSN order, since there may be children that
were changed before parents.

-- 
Hallvard

Comment 3 Howard Chu 2009-12-05 20:03:33 UTC
h.b.furuseth@usit.uio.no wrote:
> I wrote:
>> This often fails (producer and consumer databases differ):
>>      SLAPD_DEBUG=0x4105 ./run -b ldif test043-delta-syncrepl
>> when configured with --enable-accesslog.
>
> Apparently syncprov sends changes in the order it gets them from an
> accesslog search, so it needs accesslog to return change records ordered
> by change CSN.  Otherwise the consumer discards changes, complaining:
>
> do_syncrep2: rid=001 CSN too old, ignoring 20091130202958.564835Z#000000#000#000000
>
> This works with hdb/bdb under normal operation:  They return entries
> ordered by entry ID, which always increases as new entries are added.
> It works out since something (syncrepl or accesslog) serializes writes
> so change records are added to the log database in the same order as
> the changes are applied to the main database.
>
> With back-ndb, I imagine something like an SQL 'ORDER BY <reqStart or
> entryCSN>' will do the trick, if it isn't already handled.

back-ndb also orders by entryID.

> back-ldif does not support this: It returns changes memcmp-ordered
> by normalized reqStart.  Normalization "<t>.210000Z" ->  "<t>.21Z"
> makes that value sort after "<t>.219999Z".

> Anyway, if I have understood this correctly, this requirement needs to
> be documented, and it looks rather fragile anyway.  Only databases
> supporting this ordering can be used for accesslog.  The admin must not
> slapcat the accesslog, rearrange it (maybe he's looking for changes to
> something specific), and then slapadd it back.  Not something which
> would happen every day, but there's no reason people _wouldn't_ do it
> either.  LDAP defines elements as unordered, after all.

I thought it was obvious; a "log" is an ordered transcript of events. If you 
rearrange the sequence of events then you invalidate it.

> What happens if a consumer does a full refresh from a provider without
> accesslog, and is also running syncprov with accesslog?  Do these
> changes end up in the consumer accesslog?  I don't suppose the refresh
> can send entries by entryCSN order, since there may be children that
> were changed before parents.

True, the changes will be written out of order. But while a consumer is in 
refresh mode, it won't be updating its contextCSN. And while the contextCSN 
doesn't change, the provider won't be sending CSNs in the cookie it sends with 
any changes. (And the cookie CSN is what matters, not any entryCSN in the 
entry's attributes.) So any downstream consumer will accept these changes 
implicitly, since there is no CSN to check.
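
As a rough Python illustration of that last point (a toy model, not the
syncrepl implementation; accept_change and its parameters are
hypothetical names), a change whose cookie carries no CSN skips the age
check entirely:

```python
def accept_change(state_csn, cookie_csn):
    """Return True if the toy consumer would apply the change.

    state_csn  -- newest CSN the consumer has recorded so far
    cookie_csn -- CSN from the sync cookie, or None if the cookie has none
    """
    if cookie_csn is None:
        # No CSN to check, so the change is accepted implicitly.
        return True
    # Otherwise only changes newer than the recorded state are applied.
    return cookie_csn > state_csn


state = "20091130202958.564835Z#000000#000#000000"

without_csn = accept_change(state, None)    # refresh-time change, no CSN
with_old_csn = accept_change(state, state)  # not newer than the state
```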

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 4 Hallvard Furuseth 2009-12-05 20:52:06 UTC
Howard Chu writes:
>> With back-ndb, I imagine something like an SQL 'ORDER BY <reqStart or
>> entryCSN>' will do the trick, if it isn't already handled.
>
> back-ndb also orders by entryID.

Thanks, I'll note that for the test.

>> Anyway, if I have understood this correctly, this requirement needs to
>> be documented, and it looks rather fragile anyway.  Only databases
>> supporting this ordering can be used for accesslog.  The admin must not
>> slapcat the accesslog, rearrange it (maybe he's looking for changes to
>> something specific), and then slapadd it back.  Not something which
>> would happen every day, but there's no reason people _wouldn't_ do it
>> either.  LDAP defines elements as unordered, after all.
> 
> I thought it was obvious; a "log" is an ordered transcript of events.
> If you rearrange the sequence of events then you invalidate it.

No, it's not obvious that the order of slapadding entries has any
influence on their order returned from search.  The LDAP spec doesn't
say so, nor man 5 slapd-bdb, nor do the syncprov or accesslog manpages
say that they need it.

Obviously *something* needs to take care of ordering, but as far as
the user knows that could be accesslog or syncprov.  Actually, the
slapo-accesslog(5) suggestion to index reqStart may give a careful
reader a hint about how the ordering happens.

So anyway, syncprov needs to document that it doesn't ensure this
ordering but depends on the admin getting it right.

Long-term I'd also like some way for syncprov to catch errors if the
admin does get things like this wrong.  E.g. let accesslog require an
attribute to be set in the log database which promises ordering.  slapd
startup would fail if the database doesn't confirm that it can honor
this flag, and slapadd of entries out of CSN order would also fail.
Unless that breaks what you say about refresh below... anyway, all that
will have to wait for RE25 or later.
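
The kind of check such a slapadd could make might look like the Python
sketch below. Nothing like this exists in slapd as far as this report
says; first_out_of_order is a hypothetical helper, the second CSN is
invented for the example, and CSNs are assumed to be fixed-width strings
that compare correctly as plain strings.

```python
def first_out_of_order(csns):
    """Return the index of the first CSN older than its predecessor.

    Returns None when the whole sequence is in non-decreasing CSN order.
    """
    for i in range(1, len(csns)):
        if csns[i] < csns[i - 1]:
            return i
    return None


log_ok = [
    "20091130202958.564835Z#000000#000#000000",
    "20091130202958.570000Z#000000#000#000000",  # invented for the example
]
log_rearranged = list(reversed(log_ok))

ok_result = first_out_of_order(log_ok)               # nothing to report
bad_result = first_out_of_order(log_rearranged)      # index of the offender
```

A loader using a check like this could refuse the rearranged log at the
reported index instead of letting the consumer drop changes later.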

-- 
Hallvard

Comment 5 Howard Chu 2009-12-05 22:05:02 UTC
moved from Software Bugs to Documentation
Comment 6 Quanah Gibson-Mount 2021-01-14 18:07:31 UTC
Need to document in slapo-accesslog(5) man page that the backend used to store the database needs to support an ordered return of results.  back-ldif does not support this.
Comment 8 Quanah Gibson-Mount 2021-01-26 19:23:46 UTC
trunk:

  • e768dcd0 
by Quanah Gibson-Mount at 2021-01-26T18:06:05+00:00 
ITS#6406 - Note accesslog storage requirements

Update slapo-accesslog(5) man page to note that the database backend storing the data must support an ordered return of results.