
OpenLDAP 2.4.16 and BDB 4.7 do not sync when configured as provider/consumer



openldap software,

Some time ago I opened ITS#5860 about memory cache limits in the config 
files not being respected. Although that issue was solved, when I tried 
to configure OpenLDAP for replication (syncrepl) the system never gets 
into sync, and the behavior appears similar to the ITS#5860 bug.

The system starts to sync, and on the provider (master) I see the query 
for the DB sync. But memory consumption on the consumer (slave) starts 
to grow very fast, forcing me to constrain dncachesize much further, to 
1/10 of the value used on the provider (master), at which point the 
consumer at least doesn't crash.

Since changes were made in OpenLDAP 2.4.16, I downloaded that version 
and ran tests with it. I see the same behavior: the consumer (slave) 
never gets in sync with the provider (master).

The observed behavior is:

1) The consumer (slave) starts querying the provider (master) DB;
2) Memory allocation and the number of threads on the provider (master) 
start to increase, as expected;
3) The dncachesize directive on the provider (master) controls, as 
expected, the maximum memory allocated by the slapd process there;
4) The consumer (slave) consumes memory at a much faster pace. 
dncachesize is configured to 1/10 of the provider (master) value to 
avoid memory allocation problems;
5) After some time the consumer (slave) CPU usage stays at 200%. The 
provider (master) stays at low CPU usage, around 1 to 3%;
6) A new entry provisioned on the provider (master) isn't propagated to 
the consumer (slave);
7) The databases never get in sync and CPU usage on the consumer remains 
high. Queries to the provider (master) are answered very fast, and even 
multiple individual queries to the consumer (slave) are answered in 
reasonable time.
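
One way to put a number on point 7 is to compare the contextCSN values 
of the two servers. The sketch below is only an illustration: it assumes 
the values were already read from each server (for example with a 
base-scope search on the suffix requesting the contextCSN attribute), 
that each CSN has the form timestamp#count#sid#mod with fixed-width 
fields, and the function names are hypothetical helpers, not part of 
OpenLDAP.

```python
from typing import Dict, Iterable, List, Tuple


def parse_csn(csn: str) -> Tuple[str, str, str, str]:
    """Split a CSN (assumed form: timestamp#count#sid#mod) into fields."""
    parts = csn.strip().split("#")
    if len(parts) != 4:
        raise ValueError(f"unexpected CSN format: {csn!r}")
    timestamp, count, sid, mod = parts
    return timestamp, count, sid, mod


def order_key(csn: str) -> Tuple[str, str, str]:
    """Ordering key for CSNs of one SID; relies on fixed-width fields."""
    timestamp, count, _sid, mod = parse_csn(csn)
    return timestamp, count, mod


def newest_by_sid(csns: Iterable[str]) -> Dict[str, str]:
    """Keep only the newest CSN seen for each server ID (SID)."""
    newest: Dict[str, str] = {}
    for csn in csns:
        sid = parse_csn(csn)[2]
        if sid not in newest or order_key(csn) > order_key(newest[sid]):
            newest[sid] = csn
    return newest


def lagging_sids(provider_csns: Iterable[str],
                 consumer_csns: Iterable[str]) -> List[str]:
    """Return the SIDs for which the consumer is behind the provider."""
    provider = newest_by_sid(provider_csns)
    consumer = newest_by_sid(consumer_csns)
    lagging = []
    for sid, provider_csn in sorted(provider.items()):
        consumer_csn = consumer.get(sid)
        if consumer_csn is None or order_key(consumer_csn) < order_key(provider_csn):
            lagging.append(sid)
    return lagging


if __name__ == "__main__":
    # Made-up sample values, not taken from the servers described here.
    provider = ["20090401120500.000000Z#000000#000#000000"]
    consumer = ["20090401120000.000000Z#000000#000#000000"]
    print(lagging_sids(provider, consumer))
```

An empty result would suggest the consumer has caught up. A result that 
stays non-empty while the consumer CPU stays busy would match the 
behavior described above.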

It looks like there could be an issue in the replication logic, where 
the consumer (slave) side gets stuck in some kind of processing loop.

The newest OpenLDAP version and Berkeley DB 4.7 with all patches were 
compiled on the platform running the code.

Any idea about this behavior?

Thanks,

Rodrigo.