
Re: SLAP - memory leak ? (ITS#250)



Forwarded to the ITS...

Martin Hofbauer Bacher Systems EDV wrote:
> 
> The problem escalated last night. The server had to be rebooted,
>  so I have to continue checking/analysing this problem today!
> 
>  What I found out is that I use the "virtusertable" of sendmail as an LDAP search,
> "(|(mail=sn.gn@xxx.at)(mailalternateaddress=sn.gn@xxx.at))",
> and that filter produces heavy load.
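
[Editorial sketch: the two filter shapes compared in this message, the OR
filter over both attributes and the simple single-attribute filter, can be
built as below. The function names are hypothetical, and the escaping
follows the string representation of filters in RFC 4515. This only shows
how the filter strings are formed; it says nothing about why slapd handles
one more cheaply than the other.]

```python
def escape_filter_value(value: str) -> str:
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def virtuser_filter(address: str, attrs=("mail", "mailalternateaddress")) -> str:
    """Build an equality filter for address over one or more attributes.

    With a single attribute the result is a simple filter; with several,
    the terms are combined with the OR operator "|".
    """
    value = escape_filter_value(address)
    terms = "".join(f"({attr}={value})" for attr in attrs)
    return terms if len(attrs) == 1 else f"(|{terms})"


if __name__ == "__main__":
    print(virtuser_filter("sn.gn@xxx.at"))
    print(virtuser_filter("sn.gn@xxx.at", attrs=("mail",)))
```
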
> 
> Checked with ldapsearch, it also increases the heap, but
> not every time!
> 
> After about 2 hours of "normal" running after the reboot, the problem started
> again, but I was logged in and saw the following errors:
> 
> Aug 14 13:44:49 mail sendmail[16788]: NAA16788: SYSERR(root): 421 Error in
> ldap_search_st using (|(mail=sn.gn@xxx.at)(mailalternateaddress=sn.gn@xxx.at))
> in map virtuser: Interrupted system call
> 
> After this error, no such filter worked with "ldapsearch" any more;
> only simple filters were OK, like "mail=sn.gn@xxx.at"
> 
> slapd restart did not help !
> 
> What I have changed now:
> Now I am running sendmail with this simple filter and the
> system needs a lot fewer resources ( cpu : 0,3% - 0,5% instead of 5-10% )
> 
> Seems stable now!
> 
> Any new ideas ( I have not installed the nothread version yet! ) ?
> 
> Thank you
> 
> martin
> 
> On Fri, 13 Aug 1999, Kurt D. Zeilenga wrote:
> 
> > Martin Hofbauer Bacher Systems EDV wrote:
> > >
> > > On Fri, 13 Aug 1999, Kurt D. Zeilenga wrote:
> > >
> > > > At 08:43 AM 8/13/99 GMT, you wrote:
> > > > >slapd has to be restarted every few hours, because it grows very heavy in
> > > > >memory
> > > > >
> > > > >2800 entries
> > > >
> > > > Is it just cache growth?
> > >
> > > No, the size ( shown by "top" ) is stable for about 20 min.
> > > then "SIZE" grows by about 6MB in one step!
> >
> > It could be that the cache grew because a larger set of entries was
> > searched for, or with a filter that required additional indices to be loaded
> > (and cached).  It could be that some additional entries were added as well.
> > It could be that a temporarily larger demand was placed upon the server
> > (by more clients and/or more concurrent operations), causing the server to
> > demand more resources from the operating system.
> >
> > > > >What can I do to determine, why it grows ?
> >
> > First, you need a controlled environment (i.e., control over exactly how
> > clients interact with the server).  It is extremely difficult to diagnose
> > problems when the server is in general use.   Second, you need appropriate
> > tools to track heap usage.  These can be expensive.
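
[Editorial sketch: short of a real heap tracker, the process size read from
"top" after each step can at least be logged and scanned for sudden steps
like the 6MB jump described in this thread. The helper names are
hypothetical and the readings below are made up for illustration; they are
not measurements from this report.]

```python
from typing import List, Tuple

Sample = Tuple[str, int]  # (label, process size in KB as read from top or ps)


def size_deltas(samples: List[Sample]) -> List[Tuple[str, int]]:
    """Return (label, change in KB since the previous sample) for each step."""
    return [
        (label, size - prev_size)
        for (_, prev_size), (label, size) in zip(samples, samples[1:])
    ]


def find_jumps(samples: List[Sample], threshold_kb: int) -> List[Tuple[str, int]]:
    """Return only the steps whose growth is at least threshold_kb."""
    return [(label, d) for label, d in size_deltas(samples) if d >= threshold_kb]


# Made-up readings for illustration only.
readings = [
    ("start", 10000),
    ("first full search", 14000),
    ("repeat search", 14000),
    ("repeat search", 14000),
    ("20 minutes later", 20144),
]

if __name__ == "__main__":
    for label, delta in find_jumps(readings, threshold_kb=1024):
        print(f"{label}: +{delta} KB")
```

[Growth on the first full search is the expected cache fill; a later jump
with no matching change in client activity is the one worth chasing.]
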
> >
> > > > Start slapd, check size.
> > > > Do a search that matches all entries in your database, check size.
> > > > Repeat search, check size.
> > > > Repeat search, check size.
> > > > Do other operations, check size.
> > >
> > > No change of Mem. SIZE ( checked via "top" )
> >
> > That's a good sign.  Also, I noticed that your top shows what I assume
> > is a thread count.  You should watch this count as well.  We had huge
> > problems with LWP leaking resources (whole threads).  Pthreads (which,
> > I believe, sit on top of LWP) seem to behave better, but...
> >
> > You might try --without-threads and see if you suffer similar heap
> > growth.  If --without-threads is stable, you might try using an alternative
> > pthread implementation (GNU pth, FSU threads, or the like).
> >
> > > > You should see one large dump after doing the first search
> > > > as the cache is being prep'ed.
> >
> > You might try prepping the index caches as well.  That is, do a search that
> > matches all objects followed by searches that use each and every index
> > you have.  Include a one level search (for id2children) and a search
> > with "dn" filter.  This will give you a baseline to measure from.
> >
> > Note, you can easily grow beyond this baseline without adding/modifying
> > entries... just kick off a couple dozen concurrent searches.
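
[Editorial sketch: the warm-up sequence described above (a search matching
all objects, one search per index, then a one-level search) could be
written down as a list of ldapsearch invocations. The function name, the
base DN and the per-index filters are hypothetical placeholders; which
filter actually exercises a given index, including the "dn" filter, depends
on the local slapd configuration and is left to the caller. Only the
standard ldapsearch options -b (search base) and -s (scope) are used.]

```python
from typing import Dict, List


def warmup_searches(base: str, index_filters: Dict[str, str]) -> List[List[str]]:
    """Build ldapsearch argument lists meant to fill entry and index caches.

    index_filters maps an indexed attribute name to a filter the caller
    believes will use that index (an assumption to verify locally).
    """
    match_all = "(objectclass=*)"
    # 1. Subtree search matching every entry.
    plan = [["ldapsearch", "-b", base, "-s", "sub", match_all]]
    # 2. One search per configured index, in a stable order.
    for attr in sorted(index_filters):
        plan.append(["ldapsearch", "-b", base, "-s", "sub", index_filters[attr]])
    # 3. One-level search directly below the base.
    plan.append(["ldapsearch", "-b", base, "-s", "one", match_all])
    return plan


if __name__ == "__main__":
    # Placeholder base DN and filter, for illustration only.
    for argv in warmup_searches("o=example", {"mail": "(mail=a@example.org)"}):
        print(" ".join(argv))
```

[Running the plan once and then recording the process size gives the
baseline; the commands are only printed here, not executed.]
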
> >
> > > 2.) After my initial request for help with this issue
> > >         I got the "common errors" mail, also explaining
> > >         how to "./configure" under Solaris 2.x.
> > >         That is different from what I used ( see my init. bug report)
> >
> > There are a number of different approaches to getting configure to
> > accept pthreads under Solaris.  But they all generally lead to the
> > same set of libraries being linked in.
> >
> > > 3.) "sendmail" is the client that uses this daemon most.
> > >         Due to the new release (OpenLDAP 1.2.6) I decided
> > >         to use this version, but sendmail was compiled 2 weeks
> > >         before, linked against OpenLDAP 1.2.4
> > >         ( the same with "pam_ldap" and "nss_ldap" ).
> > >         Could this be a problem?
> >
> > The client SDK hasn't changed in quite some time.  Your clients should
> > be fine.
> >
> > > 4.) I have seen ( cannot remember where ) information about a memory leak bug
> > >      after the DBM backend has been modified.
> >
> >
> > >         It seems to me that we have a similar case here.
> > >         After we built the DBM backend from scratch ( "ldif2ldbm" )
> > >         the daemon ran without problems for about 2 days.
> > >         ( But I didn't really check the daemon very often )
> > >
> > >         should I switch to "gdbm" ?
> >
> > I don't recommend GDBM.   I do recommend Berkeley DB 2.7.5 (and nothing less).
> >
> > Kurt
> >
> 
> -------------------------------------------------------------------
> Martin Hofbauer                                       IT-Consulting
> phone : +43 (1) 60 126-34                   Bacher Systems EDV GmbH
> fax   : +43 (1) 60 126-4                         Wienerbergstr. 11B
> e-mail: mh@bacher.at                         A-1101 Vienna, Austria
> --