
Re: Slapd (OpenLDAP v2.3.11) Hangs Using 100% CPU Upon Start Up

--On Tuesday, November 15, 2005 7:25 AM -0800 Rik Herrin <rikherrin@yahoo.com> wrote:

   I had installed OpenLDAP and configured it using RHEL v4's OpenLDAP
packages, but then decided to use Buchan Milne's packages over at
http://anorien.csc.warwick.ac.uk/mirrors/buchan/openldap/ (OpenLDAP
v2.3.11 with RHEL v4's BDB package - 4.2.52-7.1) because I ran into
several database corruptions using RHEL v4's packages.  I successfully
installed the packages and migrated my data.  I then started OpenLDAP and
everything seemed to be fine.  A day later, I found the OpenLDAP service
taking 100% of the CPU.  Running strace on it showed that it was calling:
sched_yield() = 0

You are using Buchan's OpenLDAP build, but Red Hat's BDB build? That doesn't sound wise.

 infinitely.  I tried the following but to no avail:
     1) Stopping and restarting the server - as soon as I start the
server, top indicates that it's taking 99.9% of the CPU
     2) I enabled logging in /etc/sysconfig/ldap2.3 but slapd seems to
hang before it does anything with the database.  Here are the debugging
options that I used, but the log file remains empty:
         SLAPD_OPTIONS="-d 1 -d 32 -d 64 -d 256"

These options all log to stdout, not to syslog. If you want entries to go into syslog, look at the "loglevel" directive.
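
For reference, slapd debug levels are bit flags, so several levels can be combined into one number. The sketch below is only an illustration: the level descriptions are taken from the OpenLDAP Administrator's Guide, the helper names are made up for this example, and it assumes that repeated -d options are ORed together, which would make the four options above equivalent to a single "-d 353".

```python
# Debug-level bit flags used in the quoted SLAPD_OPTIONS line.
# Descriptions follow the OpenLDAP Administrator's Guide; the dict and
# helper names below are hypothetical, chosen for this illustration only.
DEBUG_LEVELS = {
    1: "trace function calls",
    32: "search filter processing",
    64: "configuration file processing",
    256: "stats: connections/operations/results",
}


def combine_levels(levels):
    """OR individual bit flags into one combined debug level."""
    combined = 0
    for level in levels:
        combined |= level
    return combined


def describe(level):
    """Return the descriptions of the known flags set in `level`."""
    return [name for bit, name in DEBUG_LEVELS.items() if level & bit]


if __name__ == "__main__":
    level = combine_levels([1, 32, 64, 256])
    print(level)  # 353
    for name in describe(level):
        print("  " + name)
```

A level of -1 has every bit set, so describe(-1) returns all four descriptions listed here.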

     3) I read on a previous thread that it might have to do with my
DB_CONFIG options, so I tried changing them and even removing them.
Here's my DB_CONFIG file (I have about 500 entries in my LDAP server):
         set_lg_bsize    2097152
         set_lg_max      10485760
         set_cachesize   0       1048576        0
     4) I tried changing some parameters such as cachesize and
checkpoint.  They are currently set to:
         checkpoint 64 15
         # check point whenever 64k data bytes written or
         # 15 minutes has elapsed whichever occurs first
         cachesize 6000
         I tried changing the cachesize and even removing it

Why would you set cachesize to 6000 when you only have 500 entries?
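
To put the numbers above in perspective: the DB_CONFIG values are sizes in bytes, whereas the slapd.conf "cachesize" directive counts entries. The sketch below converts the three quoted DB_CONFIG directives into byte sizes. It is a rough illustration only; the function names are hypothetical, and it assumes the Berkeley DB convention that set_cachesize takes gigabytes, bytes, and a cache count.

```python
# The DB_CONFIG contents quoted in the message above.
DB_CONFIG = """\
set_lg_bsize 2097152
set_lg_max 10485760
set_cachesize 0 1048576 0
"""

GIB = 1024 ** 3
MIB = 1024 ** 2


def parse_db_config(text):
    """Return {directive: size_in_bytes} for the three directives above.

    Hypothetical helper for illustration; other directives are ignored.
    """
    sizes = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue
        name, args = parts[0], parts[1:]
        if name == "set_cachesize":
            # Assumed argument order: <gbytes> <bytes> <ncache>
            gbytes, nbytes = int(args[0]), int(args[1])
            sizes[name] = gbytes * GIB + nbytes
        elif name in ("set_lg_bsize", "set_lg_max"):
            sizes[name] = int(args[0])
    return sizes


if __name__ == "__main__":
    for name, size in parse_db_config(DB_CONFIG).items():
        print(f"{name}: {size / MIB:.1f} MiB")
```

With the quoted file this gives a 2 MiB log buffer, a 10 MiB maximum log file size, and a 1 MiB BDB cache, while "cachesize 6000" asks slapd to cache up to 6000 entries for a directory that holds about 500.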

     5) I checked the permissions on all the files and directories
accessed by LDAP
     6) I tried removing all my ACLs in case it was an ACL problem
 Unfortunately, none of the above changed this behavior.  Any ideas?  My
current deployment is on RHEL v4 ES.  I found other people referring to
this issue, but their posts were old (2003) and the issue was fixed in
CVS.
 Thank you for your time.

I suggest starting slapd from the command line with -d -1.


Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html