
Re: BDB corruption on windows port of 2.2.19



You mention below that a proper bdb configuration should prevent the
corruption incident from happening. Would you be able to recommend a
bdb configuration (via slapd.conf) that I could use? I tried adding
"checkpoint 0 1" in my slapd.conf file but the data corruption is
still reproducible reliably...

database	bdb
suffix		"dc=XYZ,dc=com"
rootdn		"cn=Manager,dc=XYZ,dc=com"
checkpoint 0 1


My usage pattern is as follows:
- At install time, the installer for our product sets up OpenLDAP and
imports some seed data into it (using slapadd), under a specific base
DN. The imported LDIF file could be pretty large. The same import
runs again every X hours (most likely every 24 hours). Each run
deletes the previously imported data (if any) using ldapdelete,
re-imports the updated LDIF file using slapadd, and then restarts
the OpenLDAP server.
- Apart from this import, there will be very rare
modification/addition of entries in a different subtree. This is
anticipated to involve very little data and the server will not be
restarted during this type of access.
- The rest of the time, the directory is only accessed for running
searches across all the data contained under it.
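
To make the import cycle above concrete, here is a rough sketch of the
command sequence, written as a small Python helper that only builds the
commands. The config path, the import subtree DN, and the Windows
service name are placeholders invented for illustration. Stopping slapd
before slapadd is also an assumption; the description above only
mentions a restart after the import.

```python
# Placeholder values for illustration only; adjust to the real installation.
SLAPD_CONF = r"C:\OpenLDAP\slapd.conf"      # hypothetical config path
ROOT_DN = "cn=Manager,dc=XYZ,dc=com"        # rootdn from the config above
IMPORT_BASE = "ou=imported,dc=XYZ,dc=com"   # hypothetical import subtree
SERVICE = "OpenLDAP-slapd"                  # hypothetical Windows service name


def import_cycle(ldif_file, base_dn=IMPORT_BASE, conf=SLAPD_CONF,
                 bind_dn=ROOT_DN, service=SERVICE):
    """Return the commands for one import cycle, in order, as argv lists."""
    return [
        # Online step: recursively delete the previously imported subtree.
        # -x = simple bind, -W = prompt for the password, -r = recursive.
        ["ldapdelete", "-x", "-D", bind_dn, "-W", "-r", base_dn],
        # Assumption: slapd is stopped while slapadd writes the database.
        ["net", "stop", service],
        # Offline step: load the updated LDIF file.
        ["slapadd", "-f", conf, "-l", ldif_file],
        # Bring the server back up.
        ["net", "start", service],
    ]


if __name__ == "__main__":
    for argv in import_cycle("seed.ldif"):
        print(" ".join(argv))
```

On the very first run there is nothing to delete yet, so the ldapdelete
step would need to be skipped or its failure tolerated.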

Given this usage pattern, would you or anyone else be able to
suggest some appropriate bdb configuration settings (inside
slapd.conf) that I could use to prevent data corruption from an
abnormal system shutdown?

Thanks in advance!

Cheers,
Safdar

On 7/18/05, Matthew Hardin <mhardin@symas.com> wrote:
> Hi Safdar,
> 
> > Hi,
> >
> > I saw several posts online about BDB corruption occurring on OpenLDAP
> > servers if there is an abnormal system shutdown etc. Based on what
> > I've gathered, it seems that version 2.2.19 should not be facing these
> > types of corruption issues. However, I am using the Windows port of
> > OpenLDAP 2.2.19 and I see these issues whenever the system goes into
> > standby/hibernation, or if there is an abnormal system shutdown.
> >
> > Sometimes I see data loss where an entire subtree of my directory
> > vanishes. At other times the server hangs with 100% CPU utilization by
> > the slapd.exe executable, while at other times the server fails to
> > start up altogether. All these problems are fixed when the database
> > files (__db.00n or *.bdb) are deleted and I reimport all the data from
> > my backup ldifs, without touching any other file in the OpenLDAP
> > server installation...
> 
> I'm sorry to hear you are having problems with back-bdb.
> 
> In our experience, correctly configured OpenLDAP databases based on back-bdb
> rarely become corrupted, even from abnormal system or application shutdowns.
> I believe that many of the posts you refer to are cases of improperly
> interpreting the behavior that is seen after an abnormal shutdown.
> 
> Back-bdb uses the transactional features of the Berkeley database. This
> means that operations on the database are written into
> transaction log files before they are applied to the db itself. Fine-grained
> locks are also used to provide a high degree of concurrency. When the
> database is shut down abnormally it is possible that there are uncommitted
> transactions left over, and it is certain that there are locks left over.
> Restarting slapd without properly cleaning up leftover locks and applying
> uncommitted transactions will result in the behaviors you described.
> 
> After a crash of either slapd or of the system itself the database must be
> properly cleaned up before it is used again. The Berkeley db utility
> db_recover takes care of this for you. For OpenLDAP 2.2 it is necessary to
> run db_recover on each database whenever the system was shut down improperly
> (i.e., slapd crashes or the system crashes). We (Symas) added code to
> OpenLDAP 2.3 to detect and correct these conditions automatically, so it is
> NOT necessary to worry about database recovery in OpenLDAP 2.3. I should
> note also that our binary distribution of OpenLDAP, CDS version 2 (based on
> OpenLDAP 2.2.x), has this feature as well.
> 
> If you run db_recover on your back-bdb databases after a system crash, I
> doubt you'll ever run into this problem again.
> 
> One other thing to look at is the "checkpoint" parameter, which determines
> how often the database transaction log buffers are flushed to disk. See
> slapd.conf(5) for additional information.
> 
> I'm sorry, but I can't comment on the problems you are seeing with
> hibernation/standby. I would guess, though, that since slapd is designed to
> run on servers, which do not hibernate or go into standby, there are no
> provisions to handle these events.
> 
> I hope this helps.
> 
> Matthew Hardin
> Symas Corporation
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
> http://www.symas.com
> 
> 
> 
> 
>
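
The recovery step described in the quoted reply (run db_recover on each
database after an unclean shutdown, before slapd uses it again) could
be wrapped in a small start-up helper. This is only a sketch: the
database directory and the service name are hypothetical, and it
assumes slapd is not running and nothing else has the database open
while db_recover runs.

```python
import subprocess


def recover_then_start(db_dirs, start_cmd, run=subprocess.check_call):
    """Run db_recover on every database directory, then start slapd.

    Assumes slapd is not running and nothing else has the databases open.
    """
    for db_dir in db_dirs:
        # -h names the Berkeley DB environment home, i.e. the back-bdb
        # "directory" configured for that database.
        run(["db_recover", "-h", db_dir])
    # Only reached if every recovery succeeded (check_call raises otherwise).
    run(start_cmd)


if __name__ == "__main__":
    recover_then_start(
        [r"C:\OpenLDAP\data"],                # hypothetical database directory
        ["net", "start", "OpenLDAP-slapd"],   # hypothetical service name
    )
```

The `run` parameter exists only so the sequence can be dry-run or
logged without touching a real database.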