Re: slapd hangs up and uses 100% CPU (v2.1.12 release)

To: Jehan PROCACCIA <Jehan.Procaccia@int-evry.fr>
Subject: Re: slapd hangs up and uses 100% CPU (v2.1.12 release)
From: Igor Brezac <igor@ipass.net>
Date: Fri, 7 Feb 2003 12:13:39 -0500 (EST)
Cc: OpenLDAP-software@OpenLDAP.org, Kirill Ponazdyr <lists@codeangels.com>
In-reply-to: <3E43E4E2.8060506@int-evry.fr>
This solves the groups problem.  Other than with group ACLs, I have not
experienced slapd 100% CPU usage.

-Igor

On Fri, 7 Feb 2003, Jehan PROCACCIA wrote:

> Does this solve only the problems originating from group.c, i.e. the
> problem with groups, I suppose? Or is it a general correction for all
> slapd 100% CPU usage problems?
>
>
> Igor Brezac wrote:
> > For those of you interested, this issue is fixed in CVS.  Check out
> > http://www.OpenLDAP.org/its/index.cgi?findid=2195
> >
> > Thanks to Jonghyuk Choi!
> >
> > -Igor
> >
> > On Wed, 5 Feb 2003, Igor Brezac wrote:
> >
> >
> >>On Wed, 5 Feb 2003, Kirill Ponazdyr wrote:
> >>
> >>
> >>>Greetings,
> >>>
> >>>When I truss (Solaris's strace) the hanging process I see a loop made of:
> >>>
> >>>/4:     yield()                                         = 0
> >>>
> >>>lines.
> >>>
> >>>I have tried recompiling bdb in its newest version and compiling with -O
> >>>instead of -O3 as I usually do; no change at all.
> >>>
> >>>I also tried to change file descriptor soft limit to 1024, systemwide. No
> >>>change again.
> >>>
> >>>Now I gave up and went to ldbm based on gdbm; it works flawlessly.
> >>>It really seems to be an "exclusive" bdb backend problem.
> >>>
> >>
> >>I can reproduce this on Solaris 9.
> >>Check out http://www.OpenLDAP.org/its/index.cgi?findid=2195
> >>
> >>Do you use 'group' in your acls?
> >>
> >>-Igor
> >>
> >>
> >>>Regards
> >>>
> >>>Kirill
> >>>
> >>>
> >>>>I have the same problem, but cannot reproduce it at will ...
> >>>>However, I noticed that when I stopped playing with bdb tuning it worked
> >>>>better ... By playing with bdb, I mean using the cachesize and
> >>>>checkpoint directives in slapd.conf; if you put in silly values, as I might
> >>>>have done, this may trash slapd ... ? Since I put in reasonable
> >>>>values, it now seems to work fine.
> >>>>
> >>>>my slapd.conf
> >>>>
> >>>>#cachesize      6000
> >>>>checkpoint      100000 360
> >>>>#dbnosync
> >>>>
> >>>>and the DB_CONFIG file for my database
> >>>>
> >>>>$ cat /var/lib/ldap/int/DB_CONFIG
> >>>>#set the logfile size to 100MB.
> >>>>#set_lg_max 104857600
> >>>>#set the in-memory log buffer size
> >>>>set_lg_bsize 204800
> >>>>#temporary while we're slapadding the database
> >>>>set_flags DB_TXN_NOSYNC
> >>>>#set the (per db?) cachesize to 0GB + X bytes, split into N pieces of memory
> >>>>set_cachesize 0 5120000 2
> >>>>
> >>>>
> >>>>I still don't know which ones are actually used, though: the slapd.conf
> >>>>directives or the DB_CONFIG ones?
> >>>>
> >>>>When slapd takes 100%, could you run strace -p pid (pid = pid of
> >>>>slapd at 100%) to check what it is actually doing? For me it was looping
> >>>>on something, I can't remember what, but it's somewhere in the list.
> >>>>
> >>>>Let us know if you find an explanation.
> >>>>
> >>>>Thanks.
> >>>>
> >>>>Kirill Ponazdyr wrote:
> >>>>
> >>>>>Greetings,
> >>>>>
> >>>>>We have a problem with slapd hanging up and using 100% CPU time on our
> >>>>>machine when we try to do operations on a tree, it happens in random
> >>>>>places but we could find one where it happens every time, when we try
> >>>>>to delete a certain object in the tree. We can repro the problem as
> >>>>>many times as we wish. Unfortunately the slapd has to be killed by
> >>>>>kill -9 and this corrupts our databases, so we have to reload a
> >>>>>directory (PITA).
> >>>>>
> >>>>>Thus two questions: why is this happening, and is there a way
> >>>>>to run a consistency check on BDB databases, so that a full reload
> >>>>>is not required?
> >>>>>
> >>>>>Here are release infos, configs and debug output:
> >>>>>
> >>>>>Releases:
> >>>>>-----------------------------------------
> >>>>>Openldap v2.1.12 release
> >>>>>Bdb libraries 4.1.24
> >>>>>Solaris 9 Sparc with latest patch cluster
> >>>>>
> >>>>>HW:
> >>>>>-----------------------------------------
> >>>>>Sun Netra T1125 with 1 Gig RAM.
> >>>>>
> >>>>>
> >>>>>DB_CONFIG
> >>>>>-------------------------------
> >>>>>set_lg_bsize 2097152
> >>>>>set_cachesize 0 209715200 2
> >>>>>
> >>>>>
> >>>>>slapd.conf:
> >>>>>--------------------------------------------------------------
> >>>>>include                 /etc/openldap/schema/core.schema
> >>>>>include                 /etc/openldap/schema/cosine.schema
> >>>>>include                 /etc/openldap/schema/nis.schema
> >>>>>include                 /etc/openldap/schema/qmail.schema
> >>>>>include                 /etc/openldap/schema/inetorgperson.schema
> >>>>>include                 /etc/openldap/schema/qmailControl.schema
> >>>>>pidfile                 /var/run/slapd.pid
> >>>>>argsfile                /var/run/slapd.args
> >>>>>disallow                bind_anon
> >>>>>allow                   bind_v2
> >>>>>
> >>>>>database                bdb
> >>>>>suffix                  "o=Codeangels, c=CH"
> >>>>>directory               /export/ldap-databases/codeangels
> >>>>>rootdn                  ** censored **
> >>>>>rootpw                  ** censored **
> >>>>>index                   cn,sn,uid pres,eq,approx,sub
> >>>>>index                   objectClass eq
> >>>>>... snip ....
> >>>>>
> >>>>>Debug:
> >>>>>---------------- snip -------------------
> >>>>>=> access_allowed: write access granted by write(=wrscx)
> >>>>>====> bdb_unlocked_cache_return_entry_r( 526 ): returned (0)
> >>>>>bdb_dn2entry_rw("cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch")
> >>>>>=> bdb_dn2id_matched(
> >>>>>"cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch"
> >>>>>) ====>
> >>>>>bdb_cache_find_entry_dn2id("cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch"):
> >>>>>542 (1 tries)
> >>>>>bdb_cache_entry_db_lock: entry
> >>>>>cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch,
> >>>>>rw 1, rc -30995 ====> bdb_cache_find_entry_id( 542 ): 542 (busy) 2
> >>>>>locker = -2147483031
> >>>>>bdb_cache_entry_db_lock: entry
> >>>>>cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch,
> >>>>>rw 1, rc -30995 ====> bdb_cache_find_entry_id( 542 ): 542 (busy) 2
> >>>>>locker = -2147483031
> >>>>>bdb_cache_entry_db_lock: entry
> >>>>>cn=managers,ou=codeangels.com,ou=mail,ou=itaccounts,o=codeangels,c=ch,
> >>>>>rw 1, rc -30995 ====> bdb_cache_find_entry_id( 542 ): 542 (busy) 2
> >>>>>locker = -2147483031
> >>>>>.... repeat above 2 lines until killed ....
> >>>>>---------------- snip -------------------
> >>>>>
> >>>>>---
> >>>>>Kirill Ponazdyr
> >>>>>Technical Director
> >>>>>Codeangels Solutions
> >>>>>Tel: +41 (0)43 844 90 10
> >>>>>Fax: +41 (0)43 844 90 12
> >>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>
> >
>
>
>
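
For anyone comparing the two DB_CONFIG files quoted above: Berkeley DB's
set_cachesize line takes three values (gigabytes, additional bytes, number
of cache regions). Below is a minimal Python sketch that turns such a line
into a total size. The function name parse_set_cachesize is made up for
this illustration, and treating a gigabyte as 1024**3 bytes is an
assumption (it does not matter here, since both configs use 0 gigabytes).

```python
def parse_set_cachesize(line):
    """Parse a DB_CONFIG 'set_cachesize <gbytes> <bytes> <ncache>' line.

    Returns (total_bytes, ncache). Hypothetical helper, not part of
    OpenLDAP or Berkeley DB.
    """
    fields = line.split()
    if len(fields) != 4 or fields[0] != "set_cachesize":
        raise ValueError(f"not a set_cachesize line: {line!r}")
    gbytes, nbytes, ncache = (int(f) for f in fields[1:])
    # Assumption: one gigabyte is counted as 1024**3 bytes.
    total_bytes = gbytes * 1024**3 + nbytes
    return total_bytes, ncache


# The two set_cachesize lines quoted in this thread.
for line in ("set_cachesize 0 209715200 2", "set_cachesize 0 5120000 2"):
    total_bytes, ncache = parse_set_cachesize(line)
    print(f"{line}: {total_bytes / 1024**2:.1f} MiB in {ncache} region(s)")
```

Under these assumptions, Kirill's setting works out to 200 MiB and Jehan's
to a little under 5 MiB, each split into 2 regions.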
>
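
The debug output quoted above ends in the same few lines repeated until
slapd is killed. A rough way to spot that kind of busy loop in a saved
debug log is to look for a block of lines that repeats back-to-back at the
end of the file. This is only a sketch: the function name, the thresholds
and the abbreviated sample lines are all made up for illustration, and it
says nothing about why the lock is never granted.

```python
def repeating_tail(lines, max_period=10, min_repeats=3):
    """Return the shortest block of lines that repeats back-to-back at the
    end of the log at least min_repeats times, or None if there is none."""
    n = len(lines)
    for period in range(1, max_period + 1):
        if n < period * min_repeats:
            break  # not enough lines left to see min_repeats copies
        block = lines[-period:]
        # Compare the last min_repeats consecutive chunks of this length.
        if all(lines[n - (k + 1) * period : n - k * period] == block
               for k in range(min_repeats)):
            return block
    return None


# Abbreviated stand-in for the wrapped debug lines quoted in this thread.
loop = [
    "bdb_cache_entry_db_lock: entry",
    "cn=managers,...,",
    "rw 1, rc -30995 ====> bdb_cache_find_entry_id( 542 ): 542 (busy) 2",
    "locker = -2147483031",
]
sample = ["=> access_allowed: write access granted by write(=wrscx)"] + loop * 3

block = repeating_tail(sample)
if block is None:
    print("no repeating block at end of log")
else:
    print(f"log ends in a repeating block of {len(block)} line(s):")
    for entry in block:
        print("  " + entry)
```

On a real log you would read the lines from the file (stripped of trailing
newlines) instead of using the inline sample.
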

-- 
Igor