
Re: openldap hangs on bdb / database corruption



On Fri, 11.06.2004 at 12:38, Ingo Steuwer wrote:
[..]
> > | From time to time the database seems to hang: ldapsearch gets no answers
> > | and even db4.2_stat hangs at some point (it then needs kill -9). In these
> > | cases I need to stop slapd and run db4.2_recover.
> > |
> > 
> > Looks like you might be exceeding 2GB of transaction logs which are in use.
> [..]
> > | relevant configuration parts:
> > |
> > | slapd.conf:
> > | ---------------------------------------------------
> > | sizelimit               unlimited
> > | modulepath      /usr/lib/ldap
> > | moduleload      back_bdb.so
> > |
> > | database        bdb
> > |
> > | cachesize       500000
> > | index           objectClass,uidNumber,gidNumber,memberUid,ou,uniqueMember pres,eq
> > | index           uid,cn,sn,givenName,mail,description,displayName pres,eq,sub
> > | index           sambaSID,sambaPrimaryGroupSID,sambaDomainName eq
> > | index           default sub
> > | ---------------------------------------------------
> > |
> > 
> > You don't appear to have a checkpoint setting, which would mean that all
> > your transaction logs are open (or something to that effect, based on
> > what I've seen).
> 
> That's true, I added "checkpoint 1024 1" now in slapd.conf. It may take
> some days to see if it is the reason.
> 
> But is it true that all the log files, which are only 10 MB each, are held
> open? I thought this would be done only for the latest (current) one. For
> what reason should they all be open?
> 
> Well, as I have 211 log files now, together they are bigger than 2 GB.
> 
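
A quick sanity check of that 2 GB figure. This is only a sketch: it assumes
every log file is exactly the nominal 10 MB (taken here as 10 MiB); the
actual sizes on disk were not measured, and the variable names are made up
for illustration.

```python
# Assumption: each transaction log file is exactly 10 MiB (the nominal size
# mentioned above); real files on disk may differ slightly.
LOG_FILE_BYTES = 10 * 1024 * 1024
LOG_FILE_COUNT = 211          # number of log files reported above
TWO_GIB = 2 * 1024 ** 3       # the suspected 2 GB limit

total_bytes = LOG_FILE_COUNT * LOG_FILE_BYTES
over_limit = total_bytes > TWO_GIB

print(f"total log size: {total_bytes} bytes, over 2 GiB: {over_limit}")
```

Under that assumption the logs come to roughly 2.06 GiB, so they are only
slightly past the limit.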

It made it worse. As long as I had the checkpoint option set (I also tried
checkpoint 2048 2), I got a CPU-eating slapd for _each_ ldapmodify.
The modifications were applied, but the corresponding slapd process goes
mad. strace shows me the already mentioned sched_yield() calls, but what is
more mysterious is that such a process finishes once I run "ltrace -p"
on it.

Well, and after all that it corrupted my database again. I went back to the
defaults, and after making some changes there is one CPU-eating slapd which
does not finish under ltrace (it gives me countless
"ldap_pvt_thread_yield(0xbfffdf48, 0x40009e90, 0, 0x6044a510,
0x40117f60)              = 0") and which also resists an
"/etc/init.d/slapd stop", so I have to do a "kill -9". After a fresh start
I get "Implementation Specific Error"s on an ldapdelete.

So db_recover is needed again.

I'd be glad for any other hint or correction.

Thanks 
Ingo Steuwer