
RE: BDB corruption on windows port of 2.2.19



Hi Safdar,

> Hi,
> 
> I saw several posts online about BDB corruption occurring on OpenLDAP
> servers if there is an abnormal system shutdown etc. Based on what
> I've gathered, it seems that version 2.2.19 should not be facing these
> types of corruption issues. However, I am using the windows port of
> OpenLDAP 2.2.19 and I see these issues whenever the system goes into
> standby/hibernation, or if there is an abnormal system shutdown.
> 
> Sometimes I see data loss where an entire subtree of my directory
> vanishes. At other times the server hangs with 100% CPU utilization by
> the slapd.exe executable, while at other times the server fails to
> start up altogether. All these problems are fixed when the database
> files (__db.00n or *.bdb) are deleted and I reimport all the data from
> my backup ldifs, without touching any other file in the OpenLDAP
> server installation...

I'm sorry to hear you are having problems with back-bdb.

In our experience, correctly configured OpenLDAP databases based on back-bdb
rarely become corrupted, even from abnormal system or application shutdowns.
I believe that many of the posts you refer to are cases of improperly
interpreting the behavior that is seen after an abnormal shutdown.

Back-bdb uses the transactional features of the Berkeley database. This
means that operations on the database are written into transaction log
files before they are applied to the db itself. Fine-grained
locks are also used to provide a high degree of concurrency. When the
database is shut down abnormally it is possible that there are uncommitted
transactions left over, and it is certain that there are locks left over.
Restarting slapd without first cleaning up the leftover locks and resolving
the incomplete transactions will result in the behaviors you described.

After a crash of either slapd or the system itself, the database must be
properly cleaned up before it is used again. The Berkeley DB utility
db_recover takes care of this for you. For OpenLDAP 2.2 it is necessary to
run db_recover on each database whenever the system was shut down improperly
(i.e., slapd crashes or the system crashes). We (Symas) added code to
OpenLDAP 2.3 to detect and correct these conditions automatically, so it is
NOT necessary to worry about database recovery in OpenLDAP 2.3. I should
note also that our binary distribution of OpenLDAP, CDS version 2 (based on
OpenLDAP 2.2.x), has this feature as well.

If you run db_recover on your back-bdb databases after a system crash, I
doubt you'll ever run into this problem again.
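
As a rough sketch only, the recovery step could be scripted so that it always
runs before slapd is started. The Python below assumes db_recover is on the
PATH and that slapd is stopped; the function names are hypothetical, and the
only db_recover options used are -h (database home directory) and -v
(verbose).

```python
import subprocess


def build_recover_command(db_home, db_recover="db_recover"):
    """Return the argument list for running db_recover on one database."""
    # -h names the Berkeley DB home, i.e. the back-bdb "directory" setting.
    # -v asks db_recover to report what it is doing.
    return [db_recover, "-v", "-h", str(db_home)]


def recover_all(db_homes, run=subprocess.run):
    """Run db_recover on each database directory. slapd must not be running."""
    for home in db_homes:
        # check=True raises CalledProcessError if db_recover exits non-zero,
        # so a failed recovery stops the script before slapd is started.
        run(build_recover_command(home), check=True)
```

Calling recover_all() with the "directory" value of each back-bdb database in
slapd.conf, and starting slapd only if it returns without an error, would
cover the case described above.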

One other thing to look at is the "checkpoint" parameter, which determines
how often the database transaction log buffers are flushed to disk. See
slapd.conf(5) for additional information.
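
For illustration, a back-bdb database section with a checkpoint setting might
look like the fragment below. The suffix and directory are placeholders, and
the values 1024 and 5 are only an example (checkpoint after 1024 KB of log
data has been written or 5 minutes have passed), not a recommendation for
your installation.

```
database    bdb
suffix      "dc=example,dc=com"
directory   /path/to/openldap-data
# checkpoint <kbyte> <min>
checkpoint  1024 5
```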

I'm sorry, but I can't comment on the problems you are seeing with
hibernation/standby. I would guess, though, that since slapd is designed to
run on servers, which do not normally hibernate or go into standby, there
are no provisions to handle these events.

I hope this helps.

Matthew Hardin
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
http://www.symas.com