
Re: slapd hangs doing large ldap (add|modify|delete)



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thaths wrote:
> Hi,
> 
> I would really like to deploy OpenLDAP throughout my 700-user network.
> However, slapd stops working from time to time randomly. When I run
> slapcat nothing happens and the command hangs. This is really
> frustrating me.
> 
> I have OpenLDAP v 2.2.23-8 installed from Debian package for
> sarge/stable. The "crash" usually happens when I am doing a ldapdelete
> or ldapadd or ldapmodify. It is totally unpredictable. Not knowing how
> to recover from these crashes, I have been re-installing slapd from
> scratch and recreating my users.
> 
> 1. Why does these hangs happen?

You haven't tuned the BDB database environment sufficiently, and maybe
you have shut down slapd uncleanly.

> 2. How can I avoid them?

Tune the BDB database environment correctly, and set up checkpointing.
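
DB_CONFIG lives in the database directory (/var/lib/ldap in your case). As a
rough sketch of what generating one could look like: set_cachesize and
set_lg_bsize are real Berkeley DB parameters, but the function name and the
sizes below are placeholders to adapt, not recommendations.

```shell
#!/bin/sh
# Sketch: print a minimal DB_CONFIG for a given cache size in megabytes.
# ASSUMPTION: the 2 MB log buffer (set_lg_bsize) is a placeholder value.
db_config() {
    cache_mb="$1"
    # set_cachesize takes <gbytes> <bytes> <ncache>, so split the size
    # into whole gigabytes plus the remaining bytes.
    gbytes=$(( cache_mb / 1024 ))
    bytes=$(( (cache_mb % 1024) * 1048576 ))
    cat <<EOF
set_cachesize $gbytes $bytes 1
set_lg_bsize 2097152
EOF
}
```

Usage would be something like `db_config 64 > /var/lib/ldap/DB_CONFIG` with
slapd stopped. As far as I know, the new sizes only take effect once the
environment is recreated, for example by running database recovery.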

> 3. Can I recover from such a hang when it happens? How?
> 
> I tried db_recover and it didn't solve the crash/hang. Here is the output:
> 
> jupiter:~# db_recover -c -v -h /var/lib/ldap
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000004:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000003:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000002:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000001:
> unsupported log version 8

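The db_recover version issue has been addressed: the "unsupported log
version" messages mean the db_recover you ran belongs to a different
Berkeley DB release than the one slapd is linked against. However,
apparently Debian's init scripts can be configured to run database
recovery at startup, and one would hope the script would know which
db_recover to use.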
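
One way to find the matching tool by hand is to look at which libdb slapd is
linked against. A minimal sketch, assuming the versioned tool naming
(db4.2_recover and so on) that Debian's Berkeley DB utility packages use;
both helper names are made up for illustration.

```shell
#!/bin/sh
# Sketch: derive the db_recover name that matches slapd's Berkeley DB.
# ASSUMPTION: tools are installed as db<version>_recover (Debian-style).

# Reads ldd output on stdin and prints the first libdb version found,
# e.g. "4.2" for a line mentioning libdb-4.2.so.
bdb_version_from_ldd() {
    sed -n 's/.*libdb-\([0-9][0-9]*\.[0-9][0-9]*\)\.so.*/\1/p' | head -n 1
}

# Maps a version such as 4.2 to the hypothetical tool name db4.2_recover.
recover_tool_for() {
    printf 'db%s_recover\n' "$1"
}
```

For example, `ver=$(ldd /usr/sbin/slapd | bdb_version_from_ldd)` followed by
`"$(recover_tool_for "$ver")" -v -h /var/lib/ldap`, run with slapd stopped.
The slapd path is also an assumption; adjust it for your system.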

> db_recover: log_get: unable to find checkpoint record: no checkpoint set.

See the description of the checkpoint directive in slapd-bdb(5).
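
The directive has the form "checkpoint <kbyte> <min>" and goes in the bdb
database section of slapd.conf. A quick sketch for checking whether a config
already has one; the function name is hypothetical, and the config path and
the 512/30 values in the usage line are illustrative only.

```shell
#!/bin/sh
# Sketch: succeed if the given slapd.conf has an active checkpoint directive.
# Directives start in column 1 (indented lines are continuations), so a
# commented-out or indented "checkpoint" does not count.
has_checkpoint() {
    grep -Eq '^checkpoint[[:space:]]+[0-9]+[[:space:]]+[0-9]+' "$1"
}
```

For example: `has_checkpoint /etc/ldap/slapd.conf || echo "no checkpoint set,
consider something like: checkpoint 512 30"`.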

> db_recover: Ignoring log file: /var/lib/ldap/log.0000000004:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000003:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000002:
> unsupported log version 8
> db_recover: Ignoring log file: /var/lib/ldap/log.0000000001:
> unsupported log version 8
> db_recover: Recovery complete at Thu Jan  1 05:30:00 1970
> db_recover: Maximum transaction id 80000000 Recovery checkpoint [0][0]
> 
> If this is the stability level of OpenLDAP, I really hesitate to use
> it in a production environment.

If this is your level of experience with OpenLDAP, I would also
hesitate. Please review all the relevant sections of the FAQ-o-matic, at
minimum:

- do some DB_CONFIG tuning
- set up OpenLDAP caching (cachesize, idlcachesize)
- set up checkpointing, and consider running db_checkpoint (the correct
  version) as the user slapd is running as via cron or similar
- ensure database recovery will run if ever slapd dies unexpectedly
  (power failure, hardware failure, OS failure, PEBKAC)
- deal with your transaction logs sanely
- ensure you have some sane means of backups of your data in LDAP (either
  snapshots of the database and transaction logs, or slapcat the data as
  the user slapd runs as via cron or similar).
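
The checkpoint, transaction log and backup items could be combined into one
job run from the crontab of the user slapd runs as. This is a sketch only: the
backup directory, the file naming and the helper names are assumptions, and
on Debian the db_* tools may carry a version prefix. It defaults to a dry run
that just prints the commands.

```shell
#!/bin/sh
# Sketch of a periodic maintenance job. All paths are ASSUMPTIONS.
DB_HOME="${DB_HOME:-/var/lib/ldap}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/ldap}"
DB_CHECKPOINT="${DB_CHECKPOINT:-db_checkpoint}"   # may be versioned on Debian
DB_ARCHIVE="${DB_ARCHIVE:-db_archive}"            # may be versioned on Debian
DRY_RUN="${DRY_RUN:-1}"                           # 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 is a timestamp used in the (hypothetical) backup file name.
maintain() {
    stamp="$1"
    run "$DB_CHECKPOINT" -1 -h "$DB_HOME"            # force one checkpoint
    run slapcat -l "$BACKUP_DIR/ldap-$stamp.ldif"    # LDIF dump of the data
    # Removes log files Berkeley DB no longer needs. Skip this step if
    # you archive the logs together with database snapshots instead.
    run "$DB_ARCHIVE" -d -h "$DB_HOME"
}
```

To use it, add a call such as `maintain "$(date +%Y%m%d)"` at the end and set
DRY_RUN=0 once the printed commands look right.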

Also, I don't see the point of using ReiserFS for the filesystem holding
your LDAP data: there are no small files (which is where ReiserFS would
excel), and any possible performance difference is certainly not worth the
peculiarities of ReiserFS (IIRC flushing semantics differ, which could
be causing some of your problems). You would be better off using a
reliable FS and tuning your server correctly (and doing things like
setting logging via syslog to be asynchronous).

Regards,
Buchan

- --
Buchan Milne                              Systems Architect
Obsidian Systems                  http://www.obsidian.co.za
B.Eng          RHCE (803004789010797),LPIC-1 (LPI000074592)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFC+c/vrJK6UGDSBKcRAu2DAJ4xiwPLiE+UfgAZzcLC9vIJ9DLdjQCfZb8S
a8JpJ+DcBkjrpbf1xBzlMUI=
=ZWa/
-----END PGP SIGNATURE-----