Re: Slapd (OpenLDAP v2.3.11) Hangs Using 100% CPU Upon Start Up

Aaron, Quanah,
     Thanks for your input.  It turned out to be a BDB issue.  However, I was under the impression that the OpenLDAP v2.3.x packages ran db_recover automatically from /etc/init.d/ldap2.3.  When I ran slapcat2.3 to check whether the DB was okay (I know that there are other ways, but slapcat2.3 was the quickest since I thought that db_recover had already been run), it seemed to work at first, and then it reported that it was trying to fix the database because it wasn't in a good state.  CPU utilization went up to 100%, and stracing it showed the same infinite loop of sched_yield() = 0.
     The problem was a corrupt database :( I thought that the initialization scripts were supposed to fix it.  Neither slapcat2.3 nor slapd2.3 recovered my database, although a simple db_recover -vv in /var/lib/ldap2.3 fixed it.  Looking at the /etc/init.d/ldap2.3 file, I see that it tries to call /usr/bin/slapd_db_recover2.3 before calling db_recover.  Perhaps this is the source of the problem.  What are the differences between the two?  Also, is this considered a bug, since the startup script should recover the database if it needs recovering?
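
     A minimal sketch of the manual recovery described above, written as a dry run: it only prints the steps unless RUN=1 is set.  The paths are the ones from this thread.  Stopping slapd first and passing the directory with -h (instead of changing into it) are assumptions of this sketch, not something the init script is known to do.

```shell
#!/bin/sh
# Sketch only: prints the manual recovery steps instead of running them,
# unless RUN=1 is set. Paths come from this thread; adjust for your system.
DB_DIR="${DB_DIR:-/var/lib/ldap2.3}"
INIT_SCRIPT="${INIT_SCRIPT:-/etc/init.d/ldap2.3}"

run() {
    # Echo the command; execute it only when RUN=1.
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

recover_db() {
    run "$INIT_SCRIPT" stop          # nothing else should have the environment open
    run db_recover -v -h "$DB_DIR"   # -h: environment home directory, -v: verbose
    run "$INIT_SCRIPT" start
}

recover_db
```

     Running it with RUN=1 as a user allowed to write to the database directory performs the steps for real; file ownership should be checked afterwards if recovery was run as root.
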
     I set the cachesize to 6000 because the directory will likely grow.  Are there any bad side effects?  I thought that it was better to set it larger than the number of entries rather than to the exact number of entries.
     Finally, I'd love to use Buchan's BDB build.  But it wasn't included with the Mandriva packages that he posted.  Is it posted elsewhere?
 Thanks for your valuable input.

Quanah Gibson-Mount <quanah@stanford.edu> wrote: 

--On Tuesday, November 15, 2005 7:25 AM -0800 Rik Herrin 

>  Hi,
>    I had installed OpenLDAP and configured it using RHEL v4's Openldap
> packages but then decided to use Buchan Milne's packages over at
> http://anorien.csc.warwick.ac.uk/mirrors/buchan/openldap/ (OpenLDAP
> v2.3.11 with RHEL v4's BDB package - 4.2.52-7.1) because I ran into
> several database corruptions using RHEL v4's packages.  I successfully
> installed the packages and migrated my data.  I then started OpenLDAP and
> everything seemed to be fine.  A day later, I found the OpenLDAP service
> taking 100% of the CPU.  Stracing it returned that it was calling:
> sched_yield() = 0

You are using buchan's OL build, but RedHat's BDB build?  That doesn't 
sound wise.

>  infinitely.  I tried the following but to no avail:
>      1) Stopping and restarting the server - as soon as I start the server,
> top indicates that it's taking 99.9% of the CPU
>      2) I enabled logging in /etc/sysconfig/ldap2.3 but slapd seems to hang
> before it does anything with the database.  Here are the debugging options
> that I used, but the log file remains empty:
>          SLAPD_OPTIONS="-d 1 -d 32 -d 64 -d 256"

These options all log to stdout, not to syslog.  If you want entries to go 
into syslog, look at the "loglevel" directive.
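
As a rough sketch of what that could look like: the levels below mirror the -d values quoted above, and the log file path is only an assumption.  slapd logs to the LOCAL4 syslog facility by default.

```
# slapd.conf
# 353 = 1 (trace) + 32 (filter) + 64 (config) + 256 (stats)
loglevel 353

# /etc/syslog.conf  (restart syslogd after editing)
# /var/log/ldap.log is a hypothetical destination
local4.*    /var/log/ldap.log
```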

>      3) I read on a previous thread that it might have to do with my
> DB_CONFIG options, so I tried changing them and even removing them.
> Here's my DB_CONFIG file (I have about 500 entries in my LDAP server):
> /var/lib/ldap2.3/DB_CONFIG
>          set_lg_bsize 2097152
>          set_lg_max              10485760
>          set_cachesize   0       1048576        0
>      4) I tried changing some parameters such as cachesize and
> checkpoint.  They are currently set to:
>          checkpoint 64 15
>          # check point whenever 64k data bytes written or
>          # 15 minutes has elapsed whichever occurs first
>          cachesize 6000
>          I tried changing the cachesize and even removing it

Why would you set cachesize to 6000 when you only have 500 entries?

>      5) I checked the permissions on all the files and directories
> accessed by LDAP
>      6) I tried removing all my ACLs in case it was ACL related
>  Unfortunately, none of the above changed this behavior.  Any ideas?  My
> current deployment is on RHEL v4 ES.  I found other people referring to
> this issue
> (http://www.openldap.org/lists/openldap-software/200302/msg00213.html)
> but their posts were old (2003) and the issue was fixed in CVS.
>  Thank you for your time.

I suggest starting slapd from the command line with -d -1.
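
A small sketch of that invocation.  It only prints the command line so it can be reviewed first; the binary name slapd2.3 is taken from this thread, and the config path is a hypothetical placeholder to replace with the real one.

```shell
#!/bin/sh
# Sketch: build the foreground debug command line; nothing is started here.
slapd_debug_cmd() {
    # $1: slapd binary, $2: path to slapd.conf
    # -d -1 turns on every debug level; with -d, slapd stays in the foreground.
    printf '%s -d -1 -f %s\n' "$1" "$2"
}

# /etc/openldap2.3/slapd.conf is a hypothetical path, not confirmed anywhere above.
slapd_debug_cmd slapd2.3 /etc/openldap2.3/slapd.conf
```

When running the printed command for real, appending `2>&1 | tee slapd-debug.log` keeps a copy of everything it prints up to the point where it hangs.
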


Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html

