
Re: slapd stability problems with add/change operations



Hello Adrian,

A number of patches were published for BDB 4.2.52. Without these
patches, the Berkeley DB environment starts to hang if the cache size
(the set_cachesize value in DB_CONFIG) is too small.

Although people here do not recommend BDB 4.3.x, I use it and have not
had any problems.

Try the following: increase the value of set_cachesize in DB_CONFIG to
about half the combined size of all your *.bdb files, or otherwise to
whatever your system allows, but don't leave it at the default.
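
As a rough sketch of that rule of thumb, the small Python script below
adds up the sizes of the *.bdb files in a database directory and prints
a set_cachesize line for half of that total. The function names and the
example path are my own and only illustrative. The DB_CONFIG line takes
three fields (gigabytes, additional bytes, number of cache regions), and
the sketch assumes a single cache region.

```python
import glob
import os
import sys

GIB = 1024 ** 3  # bytes per gigabyte, as used by set_cachesize


def total_bdb_bytes(db_dir):
    """Return the combined size in bytes of all *.bdb files in db_dir."""
    pattern = os.path.join(db_dir, "*.bdb")
    return sum(os.path.getsize(path) for path in glob.glob(pattern))


def cachesize_line(total_bytes):
    """Build a DB_CONFIG line sized to half of total_bytes.

    Format: set_cachesize <gbytes> <bytes> <ncache>
    Assumption: one cache region (ncache = 1).
    """
    half = total_bytes // 2
    gbytes, rest = divmod(half, GIB)
    return "set_cachesize {} {} 1".format(gbytes, rest)


if __name__ == "__main__":
    # Hypothetical default path; pass your real database directory instead.
    db_dir = sys.argv[1] if len(sys.argv) > 1 else "/var/db/openldap-data"
    print(cachesize_line(total_bdb_bytes(db_dir)))
```

Treat the printed value as a starting point only and cap it at whatever
memory your machine can actually spare.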

Best regards, vadim tarassov.

On Thu, 2005-08-11 at 11:37 +0200, Adrian Gschwend wrote:
> Hi all,
> 
> This posting refers to an older posting this June by Steffen Hansen
> <http://article.gmane.org/gmane.network.openldap.general/29440>
> 
>  > We use OpenLDAP in the Kolab project, but after switching to the bdb
>  > backend there have been several reports about stability problems. Slapd
>  > sometimes seems to hang when someone tries to write to the database
>  > (for example with ldapadd).
> [...]
> 
> I spent the last three days debugging exactly the same problem:
> 
> - Sunday night slapd hung during an add operation (according to the log)
> from the meta-db. I love to start a week like this Monday morning :)
> 
> - killed with -9, redid the db with db_recover -c -v (-c was necessary
> obviously)
> 
> - I also asked in the bdb group because I first thought it was bdb (I didn't
> know the -o option of db_verify until then); see my thread here:
> <http://groups.google.com/group/comp.databases.berkeley-db/browse_thread/thread/3d70acda54c7d3c6/fd31c234910588a1#fd31c234910588a1>
> 
> - slapd came back but it didn't take long until the next lockup.
> 
> So I started to debug a bit more, I tried this:
> 
> - exported the bdb files with db_dump, reimported the stuff with 
> db_load. This doesn't work at all, half of the OUs were missing and I 
> couldn't find a single user anymore, even though the bdb files themselves were 
> about right in size (well, instead of 5.4MB the biggest file was 4.6MB). 
> I am a bit confused that this doesn't work at all.
> 
> - slapcat the db to a file, slapadd it to a brand new db. Works for some
> time but locked up quite fast again
> 
> - same game but I killed all entry*, creat* and modif* entries to be 
> sure that we have a clean base. I almost thought it works like this 
> because it was much more stable than it was before. We could do quite 
> some add operations from the meta database like this, but it still locks 
> from time to time. Overall it is definitely more stable however.
> 
> BTW our database is not that big, we have around 3000 entries which
> shouldn't be a real problem for OpenLDAP I suppose.
> 
> Then I discovered this thread :)
> 
> We started with:
> - OpenLDAP 2.2.15
> - BDB 4.2.52
> - FreeBSD 5.3
> 
> and I upgraded to:
> - OpenLDAP 2.2.27
> 
> but same game.
> 
> This is getting nasty, as our whole directory depends on OpenLDAP. So I 
> am more than happy to help to debug this stuff. But I'm not that skilled 
> with gdb so if I should try to trace some stuff I need a bit more 
> details about how to do that (or a link with some samples). Can I check 
> where it hangs without debug version of slapd at all? Is it a good idea 
> to start it with strace once (well, not performance wise for sure :)?
> 
> Is there hope that this works better with OpenLDAP 2.3.x? Or should I
> try another backend than bdb? What would make sense to try?
> 
> thanks
> 
> Adrian
> 
-- 
vadim <vadim.tarassov@swissonline.ch>