OpenLDAP hangs occasionally


I am experiencing the following problem: occasionally but consistently (about once a month), one of my LDAP servers hangs, usually after an entry update. I only discover the hang through failures in the services that depend on LDAP (mail, DNS; see below).

The question is: Why does this happen, and how can I avoid it? All servers are production systems, and need maximum reliability.

When the problem occurs, I need to stop the ldap service. If I try to start it again, I get:

   Checking configuration files for slapd:  bdb_db_open: unclean
   shutdown detected; attempting recovery.
   bdb_db_open: Recovery skipped in read-only mode. Run manual recovery
   if errors are encountered.

Then (as I found while researching the problem), I must run the following as root:

   # service ldap stop
   # slapd_db_recover -v -h /var/lib/ldap
   # slapindex
   # chown ldap:ldap /var/lib/ldap/*
   # service ldap start

Then everything works fine again.
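For reference, the recovery steps above can be wrapped in a small script. This is only a minimal sketch of what I run by hand: the names `DB_DIR`, `DRY_RUN`, `run` and `recover_ldap` are my own illustrative choices, and the script defaults to printing the commands rather than executing them.

```shell
#!/bin/sh
# Sketch of the manual recovery steps listed above.
# DB_DIR and DRY_RUN are illustrative names, not part of any OpenLDAP tool.
DB_DIR="${DB_DIR:-/var/lib/ldap}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Same order as the manual procedure; stop at the first failing step.
recover_ldap() {
    run service ldap stop || return 1
    run slapd_db_recover -v -h "$DB_DIR" || return 1
    run slapindex || return 1
    run chown ldap:ldap "$DB_DIR"/* || return 1
    run service ldap start
}

recover_ldap
```

With `DRY_RUN=0` (as root) it executes the same five commands in the same order and aborts if any of them fails.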

My setup is: one master (provider) OpenLDAP server, where all modifications take place, with 3 slaves (consumers), one of which is used by Postfix/Dovecot and the other two by PowerDNS (LDAP backend). Replication is done with syncrepl, type=refreshAndPersist. Modifications are carried out using phpLDAPadmin (1.0.7), JXplorer, and custom PHP scripts. The DIT is small (fewer than 2000 entries in total) and changes are relatively rare (at most ~50 modifications per day; sometimes no modifications are made for several days).

The hang can occur on any of the LDAP servers (master or slave), with the same frequency (about once per month). Only one of the servers is affected each time.

All servers run CentOS 5.5 with the latest openldap packages available: 2.3.43-12.el5_5.3.

   # slapd_db_upgrade -V
   Sleepycat Software: Berkeley DB 4.4.20: (January 10, 2006)

I wonder: could this kind of database corruption occur if:

  1. An entry is being modified at the same time as it is being read by
     another process?
  2. An entry is being modified "concurrently" by two different
     connected clients?

Could the problem be resolved by upgrading to a more recent OpenLDAP/BDB version? Can anyone suggest RPM packages (RHEL 5 or CentOS 5) that are safe to use (with TLS, cyrus-sasl, slapi, overlays=mod, plugins, modules)?

