
Re: syncrepl consumer is slow



On 03/02/15 05:11, Howard Chu wrote:
> Hallvard Breien Furuseth wrote:
>> On 29. jan. 2015 04:12, Howard Chu wrote:
>>> I'm considering adding an option to the consumer to write its entries
>>> with dbnosync during the refresh phase. The rationale being, there's
>>> nothing to lose anyway if the refresh is interrupted. I.e., the
>>> consumer can't update its contextCSN until the very end of the
>>> refresh, so any partial refresh that gets interrupted is wasted
>>> effort - the consumer will always have to start over from the
>>> beginning on its next refresh attempt.
>>
>> dbnosync loses consistency after a system crash, and it loses the
>> knowledge that the DB may be inconsistent.  At least with back-mdb.
>> The safe thing to do after such a crash is to throw away the DB and
>> fetch the entire thing from the provider.  Which I gather would need
>> to happen automatically with such an option.
>>
> Another option here is simply to perform batching. Now that we have
> the TXN api exposed in the backend interface, we could just batch up
> e.g. 500 entries per txn, much like slapadd -q already does.
> Ultimately we ought to be able to get syncrepl refresh to occur at
> nearly the same speed as slapadd -q.

Batching is ok, except that you never know how many entries you're going
to have, so you will have to actually write the data after a period of
time, even if you don't have the 500 entries.

This is where it would be cool to extend the cookie to carry the
expected number of updates you are going to receive (which will
obviously be 1 in a normal running R&P replication, but > 1 most of the
time when reconnecting). In this case, you can anticipate the batching
operation without having to take care of the time issue.
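Roughly, again only a sketch with invented helpers ("expected" being the
count taken from the extended cookie): the loop can commit exactly on
the last entry, so no flush timer is needed.

    /* hypothetical stand-ins for the real backend/protocol helpers */
    extern void *be_txn_begin( void );
    extern void  be_txn_commit( void *txn );
    extern void  be_entry_add( void *txn, void *entry );
    extern int   receive_refresh_entry( void **entry );

    #define BATCH_SIZE 500

    void
    consumer_refresh( unsigned long expected )
    {
        unsigned long received = 0;
        void *txn = be_txn_begin();
        void *entry;

        while ( receive_refresh_entry( &entry ) ) {
            be_entry_add( txn, entry );
            received++;

            /* commit on a full batch, or exactly on the last expected
             * entry - the tail of the refresh needs no timer */
            if ( received % BATCH_SIZE == 0 || received == expected ) {
                be_txn_commit( txn );
                if ( received < expected )
                    txn = be_txn_begin();
            }
        }
    }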

My 2 cts.