
Re: BDB corruption on windows port of 2.2.19



Thanks for the information, Matthew.

I have a follow-up question. I obtained a Windows version of db_recover
online (since it did not come packaged with the Windows OpenLDAP
installation). The OpenLDAP folder contains a var\openldap-data folder,
which holds these db files:

07/21/2005  06:50 PM            81,920 dn2id.bdb
07/21/2005  06:50 PM         1,015,808 id2entry.bdb
07/21/2005  06:50 PM        10,485,760 log.0000000001
07/21/2005  06:50 PM            32,768 objectClass.bdb
07/21/2005  06:49 PM            16,384 __db.001
07/21/2005  06:49 PM           270,336 __db.002
07/21/2005  06:49 PM            98,304 __db.003
07/21/2005  06:49 PM           376,832 __db.004
07/21/2005  06:49 PM            24,576 __db.005
               9 File(s)     12,402,688 bytes
               2 Dir(s)  24,146,268,160 bytes free

I ran the db_recover utility as below:

C:\Program Files\OpenLDAP>db_recover -v -h var\openldap-data
db_recover: unable to join the environment
db_recover: unlink: var\openldap-data\__db.005: Permission denied
db_recover: unlink: var\openldap-data\__db.004: Permission denied
db_recover: unlink: var\openldap-data\__db.003: Permission denied
db_recover: unlink: var\openldap-data\__db.002: Permission denied
db_recover: unlink: var\openldap-data\__db.001: Permission denied
db_recover: Ignoring log file: var\openldap-data\log.0000000001:
unsupported log version 10
db_recover: Invalid log file: log.0000000001: Invalid argument
db_recover: PANIC: Invalid argument
db_recover: PANIC: DB_RUNRECOVERY: Fatal error, run database recovery
db_recover: fatal region error detected; run recovery
db_recover: unable to join the environment
db_recover: unlink: var\openldap-data\__db.005: Permission denied
db_recover: unlink: var\openldap-data\__db.004: Permission denied
db_recover: unlink: var\openldap-data\__db.003: Permission denied
db_recover: unlink: var\openldap-data\__db.002: Permission denied
db_recover: unlink: var\openldap-data\__db.001: Permission denied
db_recover: DB_ENV->open: DB_RUNRECOVERY: Fatal error, run database recovery

Would you know why I'm getting all these errors? Am I using the wrong
version of db_recover? Or is the home directory supposed to be
different when running db_recover under the OpenLDAP folder?

Thanks in advance,
Cheers,
Safdar


On 7/18/05, Matthew Hardin <mhardin@symas.com> wrote:
> Hi Safdar,
> 
> > Hi,
> >
> > I saw several posts online about BDB corruption occurring on OpenLDAP
> > servers if there is an abnormal system shutdown etc. Based on what
> > I've gathered, it seems that version 2.2.19 should not be facing these
> > types of corruption issues. However, I am using the windows port of
> > OpenLDAP 2.2.19 and I see these issues whenever the system goes into
> > standby/hibernation, or if there is an abnormal system shutdown.
> >
> > Sometimes I see data loss where an entire subtree of my directory
> > vanishes. At other times the server hangs with 100% CPU utilization by
> > the slapd.exe executable, while at other times the server fails to
> > start up altogether. All these problems are fixed when the database
> > files (__db.00n or *.bdb) are deleted and I reimport all the data from
> > my backup ldifs, without touching any other file in the OpenLDAP
> > server installation...
> 
> I'm sorry to hear you are having problems with back-bdb.
> 
> In our experience, correctly configured OpenLDAP databases based on back-bdb
> rarely become corrupted, even from abnormal system or application shutdowns.
> I believe that many of the posts you refer to are cases of improperly
> interpreting the behavior that is seen after an abnormal shutdown.
> 
> Back-bdb uses the transactional features of the Berkeley database. This
> means that operations on the database are written into
> transaction log files before they are applied to the db itself. Fine-grained
> locks are also used to provide a high degree of concurrency. When the
> database is shut down abnormally it is possible that there are uncommitted
> transactions left over, and it is certain that there are locks left over.
> Restarting slapd without properly cleaning up leftover locks and applying
> uncommitted transactions will result in the behaviors you described.
> 
> After a crash of either slapd or of the system itself the database must be
> properly cleaned up before it is used again. The Berkeley db utility
> db_recover takes care of this for you. For OpenLDAP 2.2 it is necessary to
> run db_recover on each database whenever the system was shut down improperly
> (i.e., slapd crashes or the system crashes). We (Symas) added code to
> OpenLDAP 2.3 to detect and correct these conditions automatically, so it is
> NOT necessary to worry about database recovery in OpenLDAP 2.3. I should
> note also that our binary distribution of OpenLDAP, CDS version 2 (based on
> OpenLDAP 2.2.x), has this feature as well.
> 
> If you run db_recover on your back-bdb databases after a system crash, I
> doubt you'll ever run into this problem again.
> 
> One other thing to look at is the "checkpoint" parameter, which determines
> how often the database transaction log buffers are flushed to disk. See
> slapd.conf(5) for additional information.
> 
> I'm sorry, but I can't comment on the problems you are seeing with
> hibernation/standby. I would guess, though, that since slapd is designed to
> run on servers, which do not hibernate or go into standby, there are no
> provisions to handle these events.
> 
> I hope this helps.
> 
> Matthew Hardin
> Symas Corporation
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
> http://www.symas.com
>
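
The recovery step described in the quoted reply (run db_recover on each
database directory after an unclean shutdown, before the database is used
again) could be scripted roughly as below. This is only a sketch: the helper
names build_recover_command and recover_all are made up for illustration, it
assumes db_recover is on the PATH and matches the Berkeley DB version slapd
was built against, and it assumes slapd has already been stopped. The -v and
-h options are the ones used in the command shown earlier in this message.

```python
import subprocess


def build_recover_command(db_home, verbose=True, db_recover="db_recover"):
    """Build the db_recover command line for one database home directory.

    Hypothetical helper: it only assembles the same arguments used by hand
    earlier in this message (db_recover -v -h <home>).
    """
    cmd = [db_recover]
    if verbose:
        cmd.append("-v")
    cmd += ["-h", str(db_home)]
    return cmd


def recover_all(db_homes, runner=subprocess.run):
    """Run db_recover once per database directory and collect exit codes.

    Assumption: slapd is not running while this executes. The runner
    argument exists only so the call can be replaced in a dry run or test.
    """
    results = {}
    for home in db_homes:
        completed = runner(
            build_recover_command(home), capture_output=True, text=True
        )
        results[str(home)] = completed.returncode
    return results


# Example call (not executed here), using the directory from this message:
#   recover_all([r"var\openldap-data"])
# A non-zero exit code for a directory means db_recover reported a failure
# and slapd should not be restarted on that database yet.
```

Keeping the command construction separate from the execution makes it easy
to print the commands first and check them before anything touches the
database files.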