Re: openldap hangs on bdb / database corruption
Ingo Steuwer wrote:
| On Fri, 11.06.2004 at 12:38, Ingo Steuwer wrote:
|>>| From time to time the database seems to hang, ldapsearch gets no
|>>| and even db4.2_stat hangs at some point (needs kill -9 then). In these
|>>| cases I need to stop slapd and do a db4.2_recover.
|>>Looks like you might be exceeding 2GB of transaction logs which are
|>>| relevant configuration parts:
|>>| sizelimit unlimited
|>>| modulepath /usr/lib/ldap
|>>| moduleload back_bdb.so
|>>| database bdb
|>>| cachesize 500000
|>>| objectClass,uidNumber,gidNumber,memberUid,ou,uniqueMember pres,eq
|>>| index uid,cn,sn,givenName,mail,description,displayName
|>>| index sambaSID,sambaPrimaryGroupSID,sambaDomainName eq
|>>| index default sub
|>>You don't appear to have a checkpoint setting, which would mean that all
|>>your transaction logs are open (or something to that effect, based on
|>>what I've seen).
|>That's true, I added "checkpoint 1024 1" now in slapd.conf. It may take
|>some days to see if it is the reason.
|>But is it true that the log files, which contain only 10MB each, are all
|>held open? I thought this would be done only for the latest (current)
|>one. I mean, for what reason should they all be open?
|>Well, as I have 211 log files now, together they are bigger than 2GB.
| It made it worse. As long as I have the checkpoint option set (I also
| tried checkpoint 2048 2) I had a CPU-eating slapd for _each_ ldapmodify.
| The modifications were done, but the corresponding slapd process goes
Yep, if you have no transaction log settings, you will see this. Your
checkpoint time is probably a bit too low (checkpointing too often).
| strace shows me the already mentioned sched_yield() calls, but what is
| more mysterious is that such a process finishes after I try "ltrace -p"
| on it.
| Well, and after all it corrupted my database again. Back to the defaults
| and after doing some changes there is one CPU-eating slapd which will
| not finish with ltrace (it gives me countless
| "ldap_pvt_thread_yield(0xbfffdf48, 0x40009e90, 0, 0x6044a510,
| 0x40117f60) = 0") but will also resist an
| "/etc/init.d/slapd stop" so I have to do a "kill -9". After a new start
| I get "Implementation Specific Error"s on an ldapdelete.
| So db_recover is needed again.
Yes, after any non-graceful shutdown of slapd you will likely need to
run database recovery.
But I found that database recovery would not finish if there were more
than 2GB of active transaction log files.
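
To check whether a database directory is near that 2GB mark, something
like the following Python sketch could be used. It assumes the Berkeley DB
default log naming (log.0000000001, log.0000000002, ...) and counts every
log file present in the directory, which may overstate the set that is
still active. The function names are made up for illustration.

```python
import os

TWO_GIB = 2 * 1024 ** 3  # the 2GB threshold discussed above


def total_log_bytes(db_home):
    """Sum the sizes of BDB transaction logs (log.NNNNNNNNNN) in db_home."""
    total = 0
    for name in os.listdir(db_home):
        # Only count files named "log." followed by digits; this skips
        # the database files and the __db.* region files.
        if name.startswith("log.") and name[4:].isdigit():
            total += os.path.getsize(os.path.join(db_home, name))
    return total


def exceeds_two_gib(db_home):
    """True if the transaction logs in db_home add up to more than 2GB."""
    return total_log_bytes(db_home) > TWO_GIB
```

With 211 log files of about 10MB each, as reported earlier in the thread,
the total comes to roughly 2.1GB, which is over the threshold.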
I ran automated tests with imports of a 250000 entry database, cluster
fail-over (approx 8 fail-overs) while running 12 simultaneous clients
for an extended period, removal of the database files, starting database
imports again, and succeeded with about 20 cycles like this without
needing catastrophic database recovery. But, I am running normal
database recovery on startup of openldap (thus during every failover).
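
Running normal recovery before each slapd start could be wired up roughly
as below. This is only a sketch: the binary name db4.2_recover is the one
used earlier in this thread, -h (database home) and -c (catastrophic
recovery) are the db_recover options, and the helper names and the
/var/lib/ldap path in the usage note are assumptions for illustration.

```python
import subprocess


def recover_command(db_home, catastrophic=False, binary="db4.2_recover"):
    """Build the db_recover command line for the given database directory."""
    cmd = [binary, "-h", db_home]
    if catastrophic:
        # -c requests catastrophic instead of normal recovery.
        cmd.append("-c")
    return cmd


def run_recovery(db_home, catastrophic=False):
    """Run recovery; slapd must not be running while this executes."""
    subprocess.run(recover_command(db_home, catastrophic), check=True)
```

A startup script would call run_recovery("/var/lib/ldap") before launching
slapd, and fall back to catastrophic=True only if normal recovery fails.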
You need to tune both the checkpoint settings and the transaction log
settings to achieve the database performance you need.
For reference, the settings I was running with for the tests above were:
checkpoint 1024 30
set_cachesize 0 536870912 1
But, you *must* do some testing in your environment.
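
To make the two values above easier to read, here is a small Python sketch
that decodes them. It assumes the usual argument order: "checkpoint
<kbyte> <min>" for the back-bdb directive in slapd.conf, and
"set_cachesize <gbytes> <bytes> <ncache>" for the Berkeley DB DB_CONFIG
file in the database directory. The function names are hypothetical.

```python
def parse_checkpoint(line):
    """Decode 'checkpoint <kbyte> <min>' into a dict of integers."""
    keyword, kbyte, minutes = line.split()
    if keyword != "checkpoint":
        raise ValueError("not a checkpoint directive: %r" % line)
    # A checkpoint happens once either limit is reached.
    return {"kbyte": int(kbyte), "minutes": int(minutes)}


def parse_cachesize(line):
    """Decode 'set_cachesize <gbytes> <bytes> <ncache>'.

    Returns (total_bytes, ncache).
    """
    keyword, gbytes, nbytes, ncache = line.split()
    if keyword != "set_cachesize":
        raise ValueError("not a set_cachesize directive: %r" % line)
    # The cache size is gbytes gigabytes plus the extra bytes.
    return int(gbytes) * 1024 ** 3 + int(nbytes), int(ncache)
```

Applied to the settings above, "checkpoint 1024 30" means a checkpoint
after 1024KB of log data or 30 minutes, and "set_cachesize 0 536870912 1"
is a 512MB cache in a single segment.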
The tests were run on RHEL2.1 with openldap-2.1.25/2.1.29/2.1.30 on
Buchan Milne Senior Support Technician
Obsidian Systems http://www.obsidian.co.za
B.Eng RHCE (803004789010797)