Re: Persistent sessionlog

On Tue, Oct 24, 2017 at 06:45:42PM +0200, Ondřej Kuzník wrote:
> On Tue, Oct 24, 2017 at 04:52:57PM +0100, Howard Chu wrote:
>> Ondřej Kuzník wrote:
>>> ITS#8486 suggests we use a more efficient structure to maintain the
>>> sessionlog in. If we're messing with sessionlog already, we might as
>>> well see if we can address another issue - it is always empty on slapd
>>> startup leading to unnecessary full refreshes happening.
>>> slapo-accesslog has most of the data we need to support that and is
>>> already sorted in CSN order (much like sessionlog).
>>> AFAIK, we can't use the accesslog database directly, as we can't
>>> efficiently search on a single serverID to get the serverID set and
>>> the oldest CSN for each.
>> We could tweak the overlay to always maintain these in the parent entry
>> (auditContainer). Currently the logpurge always sets the container's
>> entryCSN to the oldest remaining CSN.
> I'll look into that again, what you say sounds feasible. I should be
> close to having the code that populates sessionlog from accesslog. When
> that works, it should be possible to reuse most of that to try and use
> accesslog directly.

The work to load the sessionlog from an accesslog database is here:

I did not manage to get the control for receiving entries in reverse
order working, however, so the above is only part-way to a full
solution. I haven't worked on the B+tree suggestion.

To use accesslog DB directly, the following are needed:
- maintain the mincsn inside accesslog[0]
  - if mincsn is not set on startup, take the lowest CSN recorded for
    each serverID
  - whenever we encounter a new serverID, record it both in contextCSN
    and mincsn
  - while purging entries, the mincsn should be updated with the purged
    entryCSN before each entry is removed. Without transactions this is
    the only safe way; with transactions, we can just remove entries in
    batches and record the new mincsn set just before we commit
- update syncprov_op_search to read the mincsn from the audit container,
  not sl_mincsn
- update syncprov_playlog to run a search on the accesslog database
  - we still need the list of entries that have disappeared between the
    last sync and when the persistent search starts, so we filter on:
    - objectclass auditWriteObject or auditExtended (we could ignore add
    - entryCSN in the range from lowest CSN in the cookie to highest CSN
      in the contextCSN at the time of the persistent search
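
To make the filter above concrete, here is a rough sketch in Python (not
the actual syncprov code). The function name and the use of plain string
comparison to pick the lowest/highest CSN are my assumptions; the latter
only holds while all CSNs share the same fixed-width layout.

```python
def playlog_filter(cookie_csns, context_csns):
    """Build an LDAP filter string for the accesslog search.

    Hypothetical helper: cookie_csns are the CSNs from the consumer's
    cookie, context_csns the provider's contextCSN values at the time
    the persistent search starts.
    """
    # Assumption: fixed-width CSN strings order correctly as plain strings.
    low = min(cookie_csns)
    high = max(context_csns)
    return (
        "(&"
        "(|(objectClass=auditWriteObject)(objectClass=auditExtended))"
        f"(entryCSN>={low})"
        f"(entryCSN<={high})"
        ")"
    )


if __name__ == "__main__":
    cookie = ["20171024100000.000000Z#000000#001#000000",
              "20171024100005.000000Z#000000#002#000000"]
    context = ["20171024110000.000000Z#000000#001#000000",
               "20171024105900.000000Z#000000#002#000000"]
    print(playlog_filter(cookie, context))
```

Whether a single range like this is good enough with several serverIDs
in play, or whether it pulls in too many entries, is something I have
not checked.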

I'm not sure how to prevent the accesslog purge from overtaking this
search, or how to detect that this has happened and switch to a full
refresh in that case, at least not without the overlays communicating
in some way.

Is it a concern that we run a search (for each entryUUID in our DB)
within a search (for the accesslog entries)? There is a note about
ITS#3456 in syncprov that sounds relevant.

[0]. mincsn is the oldest CSN set that can be safely served by the
     sessionlog: for each serverID, the last CSN expired from the log,
     the oldest CSN in the database, or the entry from contextCSN

Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP