
Re: Mysterious slow-down of slapd with hdb

The information you provided still isn't specific enough, but I think you will need to use some lower-level tools like oprofile to identify the problem. It could be your kernel, glibc, the malloc library, the BerkeleyDB library, etc.

Johan Jönemo wrote:

I am trying to use OpenLDAP to hold a small but continuously rebuilt
database with an hdb backend. Basically, I build a directory under a
temporary node and move it into place when it's ready (hence the hdb: I
want to move the whole tree into place in one go). I build a new
directory, move the current one out to another temporary node, and move
in the new one. Lastly, I delete the defunct tree and start over building
a new tree. In short, the idea is to always have a complete tree in place
(except, I suppose, between moving the old tree out and the new one in,
but I don't see any solution for that) while continuously trying to have it
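
The swap sequence described above can be written down as an ordered plan. This is only a sketch of the ordering: it talks to no server, and the DNs, the function name plan_swap and the step labels are all hypothetical, not taken from the original script (which uses Net::LDAP in Perl).

```python
from typing import List, Tuple

# One step: (operation, target DN, new RDN, new superior DN).
# For deletes, the last two fields are left empty.
Step = Tuple[str, str, str, str]


def plan_swap(base: str, live_rdn: str, staging_dn: str, retired_dn: str) -> List[Step]:
    """Return the ordered operations for one build-and-swap cycle.

    Assumes the freshly built tree already sits under staging_dn with the
    same RDN as the live tree, and that retired_dn is an empty holding node.
    """
    live_dn = f"{live_rdn},{base}"
    staged_tree = f"{live_rdn},{staging_dn}"
    retired_tree = f"{live_rdn},{retired_dn}"
    return [
        # 1. Move the current tree out of the way (subtree rename).
        ("moddn", live_dn, live_rdn, retired_dn),
        # 2. Move the new tree into place in one operation.
        ("moddn", staged_tree, live_rdn, base),
        # 3. Delete the defunct tree, children first.
        ("delete_subtree", retired_tree, "", ""),
    ]


if __name__ == "__main__":
    # Hypothetical DNs, for illustration only.
    for step in plan_swap(
        "dc=example,dc=com",
        "ou=data",
        "ou=staging,dc=example,dc=com",
        "ou=retired,dc=example,dc=com",
    ):
        print(step)
```

Between steps 1 and 2 no live tree exists, which is the window the message mentions.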

This works well for a couple of hours on the machine I am running it on
(PIII, 1 CPU, 1000 MHz, 512 MB RAM, Linux 2.6 kernel), but then it slows
down by a factor of 10-20.

Why is this and what can I do to stop it? (easy to ask...)

free shows:
# free
             total       used       free     shared    buffers     cached
Mem:        508104     501992       6112          0      31268     332556
-/+ buffers/cache:     138168     369936
Swap:      2104504       2904    2101600

This isn't brilliant of course, but AFAIU not catastrophic either. I see
about the same figures when it isn't slowed down. vmstat with a sampling
interval of a few seconds shows no swapping before or after slapd slows down.
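
The "-/+ buffers/cache" row in the free output can be re-derived from the "Mem:" row, which is why the numbers look less alarming than "6112 free" suggests. A minimal sketch of that arithmetic follows, using the figures from the output above (in kB); the function name free_summary is just an illustrative choice.

```python
def free_summary(total, used, free, buffers, cached):
    """Re-derive the '-/+ buffers/cache' row from the 'Mem:' row (values in kB)."""
    # Sanity check: used + free should add up to total.
    assert used + free == total
    # Memory held by applications, excluding reclaimable buffers and page cache.
    used_by_apps = used - buffers - cached
    # Memory the kernel could hand out if buffers and cache were reclaimed.
    available = free + buffers + cached
    return used_by_apps, available


# Figures copied from the 'Mem:' row of the free output.
used_by_apps, available = free_summary(
    total=508104, used=501992, free=6112, buffers=31268, cached=332556
)
print(used_by_apps, available)  # 138168 369936
```

Both results match the "-/+ buffers/cache" row, so most of the "used" memory is page cache rather than application memory.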

top shows that slapd and the script populating it each run at about 2-3%
CPU, with not much CPU consumption apart from that (consistent with a
system that has slowed down by a factor of 20, I guess). The script uses
Net::LDAP in Perl (over a local socket), so no external clients are invoked.

The really puzzling bit is that if I shut down slapd and the "directory
builder" - thus returning memory, file descriptors and such to the
system - and then restart them, I almost immediately get the same
slowdown. In fact, the time it takes for the slowdown to reappear after
firing up slapd seems proportional to how long I let it "rest".

I have tried all sorts of things to analyze this and finally decided to
profile slapd. I rebuilt it with -g -pg in CFLAGS and --enable-debug to
configure (actually I have used that switch all along). I also
discovered that I had to replace 'strip = -s' with 'strip =' in all
makefiles even though --enable-debug was given (is this intentional or a
bug in configure?). Finally, I had to get the gprof-helper by
Hocevar/Jönsson (and confirm that it was used) to be able to profile
threaded applications. The result, however, doesn't tell me much. The slapd
process seems to spend most (70-80%) of its time in the "at_next" routine.

Info about system:
I am running v2.3.24 of slapd.
built with:
./configure --program-prefix=jj4 --with-threads=yes --enable-dynamic
--enable-debug --enable-crypt --enable-lmpasswd --enable-spasswd
--enable-modules --enable-backends=mod --enable-sql=no --enable-ldap=mod
--enable-meta=mod --enable-monitor=mod --enable-null=mod
--enable-perl=no --enable-relay=mod --enable-shell=mod
--enable-overlays=mod --enable-denyop=mod --enable-dyngroup=mod
--enable-dynlist=mod --enable-lastmod=mod --enable-proxycache=mod
--enable-retcode=mod --enable-rwm=mod --enable-dependency-tracking

(lots of modules are built, but only the hdb backend is actually loaded
when I'm running)

These are the relevant and nonsensitive parts of the slapd.conf:

sizelimit 1000000
moduleload      back_hdb.la

database        hdb

suffix          *removed*
rootdn          *removed*
rootpw          *removed*
directory       /usr/local/lis/var/db
checkpoint      512 5
dbconfig set_cachesize 0 16777216 8
dbconfig set_lg_regionmax 262144
dbconfig set_lg_bsize 2097152
dbconfig set_lg_max 16777216
dbconfig set_flags DB_LOG_AUTOREMOVE
index objectClass eq
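
For readers less used to raw byte counts, the BerkeleyDB settings above can be converted to friendlier units. The sketch below assumes the usual argument order of set_cachesize (gigabytes, additional bytes, number of cache regions); the helper name cache_layout is hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def cache_layout(gbytes, nbytes, ncache):
    """Return (total cache bytes, bytes per region) for set_cachesize arguments."""
    total = gbytes * GIB + nbytes
    # A region count of 0 is treated like 1 here (assumption for this sketch).
    regions = max(ncache, 1)
    return total, total // regions


# dbconfig set_cachesize 0 16777216 8
total, per_region = cache_layout(0, 16777216, 8)
print(total // MIB, per_region // MIB)  # 16 2

# dbconfig set_lg_regionmax 262144 / set_lg_bsize 2097152 / set_lg_max 16777216
print(262144 // KIB, 2097152 // MIB, 16777216 // MIB)  # 256 2 16
```

So the configuration asks for a 16 MiB cache split into 8 regions of 2 MiB each, a 256 KiB log region, a 2 MiB log buffer and 16 MiB log files. The "checkpoint 512 5" line requests a checkpoint after 512 kB have been written or 5 minutes have passed. Whether 16 MiB is adequate depends on the size of the database, which the message does not state.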

I use dirtyread because I want to be able to read while I'm writing
(which is almost always) while an occasional bad read is acceptable (it
should anyway be very rare since I don't do the modifications in the
"current" tree where I read).

Thanks in advance

Johan Jönemo

  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  Chief Architect, OpenLDAP     http://www.openldap.org/project/