
BDB recovery problem after abnormal system shutdown (ITS#3355)



Full_Name: Attila Szuts
Version: 2.2.13
OS: Debian Sarge
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (195.56.97.217)


BDB 4.2.52
LDAP: 2.2.13 (but I think the problem is also present in newer versions)

I don't know if anyone has had a similar problem, but it has happened to me
several times in the past.
I run OpenLDAP on a mail server as the user database server. I modify LDAP
entries with some PHP scripts.

slapd.conf:
-----------
# Features to permit
allow bind_v2

include   /etc/ldap/schema/core.schema
include   /etc/ldap/schema/cosine.schema
include   /etc/ldap/schema/nis.schema
include   /etc/ldap/schema/inetorgperson.schema

schemacheck     on
pidfile         /var/run/slapd.pid
argsfile        /var/run/slapd.args

backend   bdb
database  bdb
checkpoint 10 2
cachesize 1000

suffix          "dc=nowhere,dc=com"
directory       <path_to_db_dir>
index           objectClass pres,eq
lastmod         on
sizelimit -1

#And some ACLs...
-------------------------

BDB DB_CONFIG file: (resides in the "directory" set in slapd.conf)
-------------------
set_lk_max_locks   2000
set_lk_max_lockers 2000
set_lk_max_objects 2000

set_lg_bsize    1048576
set_lg_max      6291456
-------------------------

I thought that with my config settings (checkpoint 10 2) any changes would be
committed to the BDB database files automatically after 2 minutes or after
10 KB of log data.
Then came an unplanned system shutdown (caused by a power failure without
warning, but hours after the last LDAP entry was modified).
After booting, slapd was restarted normally by the init script.

Everything works fine, except that my last few changes are missing from LDAP.

I tried to repair the database with db_recover -v, but it didn't help.
(When I used db_recover -v -c I got this error message:
  db_recover: Finding last valid log LSN: file: 1 offset 1079544
  db_recover: Recovery starting from [1][28]
  db_recover: Log sequence error: page LSN 1 1052620; previous LSN 1 1095735
  db_recover: Recovery function for LSN 1 1052620 failed on forward pass
  db_recover: PANIC: Invalid argument
  db_recover: PANIC: fatal region error detected; run recovery
)

I moved the whole database to another place, opened the log.0000000001 file in
vi, and saw that my modifications (entries) are about 100 KB before EOF (10
times more than 10 KB!).
Then I opened dn2id.bdb and id2entry.bdb in vi but couldn't find these entries.
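
To make my expectation concrete, here is a minimal sketch of the rule I assumed
"checkpoint 10 2" follows: a checkpoint becomes due once either 10 KB of log
data has been written or 2 minutes have passed since the last one. The function
name and its parameters are hypothetical and only model my reading of the
setting, not how slapd or BDB actually implement it.

```python
# Hypothetical helper: models my reading of "checkpoint <kbyte> <min>".
# It is NOT taken from slapd or Berkeley DB source code.
def checkpoint_due(kbytes_logged, minutes_elapsed, kbyte_limit=10, minute_limit=2):
    """Return True if either threshold has been reached since the last checkpoint."""
    return kbytes_logged >= kbyte_limit or minutes_elapsed >= minute_limit


# My case: about 100 KB of log written after the changes, and hours
# (here assumed to be 120 minutes) between the last change and the shutdown.
print(checkpoint_due(kbytes_logged=100, minutes_elapsed=120))
```

Under this assumption both thresholds were exceeded many times over, so I
expected the changes to be in the database files already.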

Why can't db_recover *really* recover the data from the log file? I couldn't
find any software that can parse the log file or the bdb files to show the
records inside BDB.
(Something similar happened with previous OpenLDAP versions, but I can't
reproduce this error because doing so could easily kill my machines. Sorry.)

Have I misconfigured something, is Berkeley DB faulty, or is it something else?
What should I do in the same situation?
I don't know exactly how OpenLDAP with BDB, or standalone BDB, works
internally. If this looks like a BDB bug, then sorry for this bug report;
please give me some instructions on how/what to test, or what to post to the
BDB bug list.

Thanks
Attila

P.S.: Another BDB problem occurred when the disk ran out of space: slapd and
BDB crashed and could not recover.