
RE: syncrepl_del_nonpresent leading to directory implosion

To: Howard Chu <hyc@symas.com>, "openldap-technical@openldap.org" <openldap-technical@openldap.org>
Subject: RE: syncrepl_del_nonpresent leading to directory implosion
From: Aaron Bennett <abennett@clarku.edu>
Date: Thu, 10 Jul 2014 13:16:51 +0000
In-reply-to: <53BC1744.3070903@symas.com>
References: <A3F98FF07EAAD840ADBD429BEA175590270EF5D1@PERCH.ad.clarku.edu> <53BC1744.3070903@symas.com>

-----Original Message-----
From: Howard Chu [mailto:hyc@symas.com] 
Sent: Tuesday, July 8, 2014 12:08 PM
To: Aaron Bennett; openldap-technical@openldap.org
Subject: Re: syncrepl_del_nonpresent leading to directory implosion

>You have a bad misconfig in your BDB, you need to configure a larger lock table.

>and the syncrepl code probably needs to be fixed to quit if the backend returns an LDAP_OTHER error code like this. Please submit an ITS for this.

Thank you -- I will submit an ITS once the dust has cleared.

I adjusted our lock table significantly upwards.  However, since then, I'm seeing extreme replication delays in just one direction.  I'm not sure if there's an issue with BDB or if I did something else wrong.  Here's the chain of events:

Two boxes are called animal and zoot.

1.  DB_CONFIG looks like this on both boxes:
	set_cachesize 0 524288000 1
	set_lg_regionmax 262144
	set_lg_bsize 2097152
	set_flags DB_LOG_AUTOREMOVE
	set_lk_max_lockers 9000
	set_lk_max_locks 9000
	set_lk_max_objects 9000
	set_lk_partitions 900

2.  To make sure I had a clean database, during a maintenance window I stopped slapd on both machines, deleted /var/lib/ldap/*, and reloaded an LDIF on animal with this command:
	sudo -u ldap /usr/local/sbin/slapadd -S 1 -w -l /tmp/animal-now.ldif

3.  Then I started slapd and watched as zoot pulled the replica over.  Everything seemed kosher at first: replication from animal -> zoot is nearly instant, as expected, but zoot -> animal is very slow, typically 15-30 seconds and often longer than 60, for a single trivial change.
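
One rough way to put a number on the delay described above is to compare the newest contextCSN each server reports. The sketch below is only an illustration: it assumes GNU date, anonymous read access to the suffix entry, and a placeholder base DN (dc=example,dc=com). Apart from the two host names, none of those details come from the setup described here.

```shell
# Sketch: estimate replication lag between two servers from contextCSN.
# Assumptions (not from the original setup): GNU date, anonymous read access
# to the suffix entry, and a placeholder base DN such as dc=example,dc=com.

# Print the newest contextCSN value held by a server.
# $1 = LDAP URI, $2 = base DN (hypothetical)
newest_csn() {
    ldapsearch -x -LLL -H "$1" -s base -b "$2" contextCSN |
        awk '/^contextCSN:/ { print $2 }' | sort | tail -n 1
}

# Convert the leading YYYYmmddHHMMSS of a CSN to seconds since the epoch (UTC).
csn_epoch() {
    d=$(printf '%s' "$1" |
        sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\).*/\1-\2-\3 \4:\5:\6/')
    date -u -d "$d" +%s
}

# Print how many whole seconds the second CSN is ahead of the first.
csn_lag() {
    echo $(( $(csn_epoch "$2") - $(csn_epoch "$1") ))
}
```

For example, right after a change is made on zoot, `csn_lag "$(newest_csn ldap://animal dc=example,dc=com)" "$(newest_csn ldap://zoot dc=example,dc=com)"` would print roughly how far animal trails zoot, and should fall to 0 once the change has replicated. This only compares the single newest CSN on each side, so it is a coarse estimate rather than a per-server-ID comparison.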

Any ideas what's wrong?  If you'd like I can send syncrepl logs or db_stat output.  
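
The db_stat output most relevant to the lock table is probably the lock-region statistics from `db_stat -c`, which should show whether the limits set in DB_CONFIG are in effect and how close peak usage comes to them. A minimal sketch, assuming the utility is installed under the name db_stat (some packages rename it) and that the summary lines contain the phrase "Maximum number of" (the exact wording differs between Berkeley DB versions):

```shell
# Sketch: summarise lock-table limits and peak usage from db_stat -c.
# Assumptions: the Berkeley DB utility is named "db_stat" on this system and
# its summary lines contain "Maximum number of"; labels vary by BDB version.

# Filter db_stat -c output (read from stdin) down to the limit/peak lines.
lock_summary() {
    grep -i 'maximum number of'
}

# Report on a live environment directory.
# $1 = BDB environment directory, e.g. /var/lib/ldap
lock_report() {
    db_stat -c -h "$1" | lock_summary
}
```

Running `lock_report /var/lib/ldap` as a user who can read the environment would then list the configured maximums next to the peaks seen so far.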

Thanks for your time,

Aaron Bennett

----
Manager of Systems Administration
Clark University ITS
