
Re: slapd stability problems with add/change operations

Adrian Gschwend wrote:
- I also asked in the bdb group because I first thought it was a bdb problem (I didn't
know the -o option of db_verify until then, see my thread here:

- slapd came back but it didn't take long until the next lockup.

So I started to debug a bit more, I tried this:

- exported the bdb files with db_dump, reimported the stuff with db_load. This didn't work at all: half of the OUs were missing and I couldn't find a single user anymore, even though the bdb files themselves were about the right size (well, instead of 5.4MB the biggest file was 4.6MB). I am a bit confused that this doesn't work at all.

The documentation for OpenLDAP 2.2 states that slapcat/slapadd must be used for backup/restore. Since OpenLDAP 2.2 uses custom sort functions in its databases, the stock db_dump/db_load will not work on little-endian machines; using them will result in a corrupted database. Which is exactly what you got.

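To illustrate the byte-order point, here is a minimal Python sketch. It assumes keys are 4-byte integers stored in native byte order, and that a tool without the custom sort function falls back to plain byte-by-byte key comparison (the Berkeley DB default). The sample IDs and the helper name bytewise_order are made up for illustration and are not taken from the back-bdb source.

```python
import struct

# Hypothetical entry IDs; assumed to be stored as 4-byte integer keys.
ids = [1, 2, 255, 256, 257, 65536]


def bytewise_order(ids, fmt):
    """Return the IDs in the order a plain byte-by-byte key comparison sorts them."""
    keyed = [(struct.pack(fmt, i), i) for i in ids]
    return [i for _, i in sorted(keyed)]


little = bytewise_order(ids, "<I")  # keys packed little-endian
big = bytewise_order(ids, ">I")     # keys packed big-endian

print("little-endian keys sort as:", little)  # [65536, 256, 1, 257, 2, 255]
print("big-endian keys sort as:   ", big)     # [1, 2, 255, 256, 257, 65536]
```

With the little-endian packing the byte-wise order no longer matches the numeric order, so a tree rebuilt without the custom comparison would be ordered differently from what the application expects. With big-endian packing the two orders happen to agree.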
- slapcat the db to a file, slapadd it to a brand new db. This worked for
some time but locked up again quite fast

- same game, but I removed all entry*, creat* and modif* attribute lines to be sure that we have a clean base. I almost thought this had fixed it, because it was much more stable than before. We could do quite a few add operations from the meta database like this, but it still locks up from time to time. Overall it is definitely more stable, however.

BTW our database is not that big, we have around 3000 entries which
shouldn't be a real problem for OpenLDAP I suppose.

Then I discovered this thread :)

We started with:
- OpenLDAP 2.2.15
- BDB 4.2.52
- FreeBSD 5.3

and I upgraded to:
- OpenLDAP 2.2.27

but same game.

This is getting nasty, as our whole directory depends on OpenLDAP. So I am more than happy to help debug this stuff. But I'm not that skilled with gdb, so if I should try to trace some stuff I need a bit more detail about how to do that (or a link with some samples). Can I check where it hangs without a debug version of slapd at all? Is it a good idea to start it with strace once (well, not performance wise for sure :)?

strace is mostly useless for debugging. Build slapd with debugging enabled, run with debugging enabled (-d4 may be enough) and see what's going on when it hangs. Most likely the BDB library has run out of resources, and the debug log will show if that's true. Since you didn't mention any DB_CONFIG settings, this is indeed the most likely cause.

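As a rough sketch of what such settings look like, a DB_CONFIG file in the database directory might contain lines like the following. The directive names are standard Berkeley DB ones, but every number here is an illustrative placeholder, not a tuned or recommended value; the right sizes depend on the data and the load.

```
# DB_CONFIG (placed in the back-bdb database directory)
# Cache: 0 GB + 64 MB, in a single segment (placeholder size)
set_cachesize 0 67108864 1
# Lock table limits (placeholder values)
set_lk_max_locks 3000
set_lk_max_objects 1500
set_lk_max_lockers 1500
```

Running db_stat -c (lock statistics) and db_stat -m (cache statistics) against the database directory should show whether limits like these are actually being reached.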
Is there hope that this works better with OpenLDAP 2.3.x? Or should I
try another backend than bdb? What would make sense to try?

What would make sense is to read the documentation and then configure things properly.

 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun        http://highlandsun.com/hyc
 OpenLDAP Core Team            http://www.openldap.org/project/