
Re: syncrepl: large datasets and expediting consumer's initialization



My DB_CONFIG:
set_cachesize 0 268435456 1
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir logs
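
For readers who don't think in bytes, here is a minimal Python sketch that converts the size directives above into plainer units. It assumes the usual Berkeley DB meaning of set_cachesize (gigabytes, then bytes, then number of cache segments); the function name parse_db_config is made up for this illustration and is not part of any OpenLDAP or Berkeley DB tool.

```python
# Sketch only: interprets the size-related DB_CONFIG directives quoted above.
# Assumes "set_cachesize <gbytes> <bytes> <ncache>" as in Berkeley DB.

DB_CONFIG = """\
set_cachesize 0 268435456 1
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir logs
"""

def parse_db_config(text):
    """Return {directive: value} for the size directives, in bytes.

    set_cachesize maps to a (total_bytes, ncache) tuple;
    set_lg_regionmax and set_lg_bsize map to a plain byte count.
    Other directives (such as set_lg_dir) are ignored.
    """
    sizes = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        name, args = parts[0], parts[1:]
        if name == "set_cachesize":
            gbytes, nbytes, ncache = (int(a) for a in args)
            sizes[name] = (gbytes * 1024**3 + nbytes, ncache)
        elif name in ("set_lg_regionmax", "set_lg_bsize"):
            sizes[name] = int(args[0])
    return sizes

sizes = parse_db_config(DB_CONFIG)
cache_bytes, ncache = sizes["set_cachesize"]
print("cache: %d MiB in %d segment(s)" % (cache_bytes // 1024**2, ncache))
print("log region max: %d KiB" % (sizes["set_lg_regionmax"] // 1024))
print("log buffer: %d MiB" % (sizes["set_lg_bsize"] // 1024**2))
```

In other words: a 256 MiB cache in one segment, a 256 KiB log region, and a 2 MiB log buffer.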


The filesystem is ext3 on RHEL5.

-q enable quick (fewer integrity checks) mode. Does fewer consistency checks on the input data, and no consistency checks when writing the database. Improves the load time but if any errors or interruptions occur the resulting database will be unusable.


That last sentence was enough to put me off using -q, but it did reduce the load time to 17 minutes.

The slapadd speedup is significant, but what about syncrepl? Why is the consumer reviewing every object? While reading up on "-q", I discovered:

-w write syncrepl context information. After all entries are added, the contextCSN will be updated with the greatest CSN in the database.

And that looks like an option that would prime my syncrepl info. So

	slapadd -q -w -l SLAPCAT.LDIF

took 14 minutes to build and then 3 minutes to close the databases. This consumer has the same hardware as the provider that took 35 minutes to rebuild the database.

That "slapadd -w" looks like the fix. Would someone confirm or reject that?
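
One way to sanity-check this would be to work out the greatest entryCSN in the LDIF and compare it with the contextCSN on the suffix entry after the load. Below is a minimal Python sketch of that idea. It assumes every CSN in the file has the same fixed-width format, so plain string comparison orders them correctly, and it ignores any per-server-ID handling of contextCSN. The function name greatest_entry_csn and the sample entries are invented for illustration.

```python
# Sketch only: find the greatest entryCSN value in slapcat-style LDIF text.
# Assumption: all CSNs share one fixed-width format, so comparing them as
# strings gives the same order as comparing them by time.

def greatest_entry_csn(ldif_text):
    """Return the largest entryCSN value found, or None if there is none."""
    greatest = None
    for line in ldif_text.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "entrycsn":
            continue
        if value.startswith(":"):
            continue  # base64-encoded value ("entryCSN:: ..."), not handled here
        csn = value.strip()
        if greatest is None or csn > greatest:
            greatest = csn
    return greatest

# Invented sample data, not taken from the real directory.
SAMPLE_LDIF = """\
dn: dc=example,dc=com
objectClass: domain
dc: example
entryCSN: 20090310153012.000000Z#000000#000#000000

dn: ou=people,dc=example,dc=com
objectClass: organizationalUnit
ou: people
entryCSN: 20090311090501.000000Z#000000#000#000000
"""

print(greatest_entry_csn(SAMPLE_LDIF))
```

For a real dump, the same function could be fed the contents of SLAPCAT.LDIF (for a very large file, iterating over the open file line by line instead of reading it whole would be kinder to memory).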

The provider's log file still shows it's reviewing many records. I guess it's not returning them. Will the log file show the DNs of results (as opposed to visited)?

I restarted the provider with less logging; the logs of full syncrepl scans were eating up disk space. Only 5 or 6 records would have changed.

Is it normal for the provider to visit many (all?) objects even when the consumer would have a very current CSN?

Thanks for your help,

Paul