
Re: 8 hours tests ends with inconsistent DB.



You are of course right, Quanah. ldapadd is surely not the right tool for adding so many entries.
But this way has two advantages. First, every application we use from day to day uses
the same mechanism, and second, I can "simulate" what happens to OpenLDAP when
it runs for days or months without restarting. I think adding an entry is surely a "heavy"
operation compared with the other ones.
I'm now relatively sure that it is a memory leak. I ran the same test again today, and
after about 1 1/2 hours the SIZE of slapd was nearly 3 GB and the RSS nearly 1 GB (which was
expected because of the kernel resources configured) according to "top". This time I only
changed the kernel settings:


/proc/sys/kernel/msgmax
8192
/proc/sys/kernel/msgmnb
16384
/proc/sys/kernel/msgmni
1024
/proc/sys/kernel/shmall
1073741824
/proc/sys/kernel/shmmax
1073741824
/proc/sys/kernel/shmmni
4096
/proc/sys/fs/file-max
209715

These are good settings for an Oracle 10g DB, which can handle quite a lot of
queries on an FSC RX300, so I'm sure they must be enough for a simple ldapadd. Granted,
it adds a lot of entries, but nothing impossible, since it's the only command running.
So I will use these kernel settings for my further tests.
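
When reading the shared memory values listed above, it may help to keep the units apart: on Linux, shmmax is given in bytes, while shmall is counted in pages. The following is only a minimal sketch of that arithmetic. The helper name describe_shm is made up for illustration, and whether slapd or BDB uses System V shared memory at all depends on the configuration, which this post does not show.

```python
import os

# Values copied from the list above (entries under /proc/sys/kernel).
settings = {"shmmax": 1073741824, "shmall": 1073741824, "shmmni": 4096}

def describe_shm(settings, page_size):
    """Hypothetical helper: return (max bytes per segment, total bytes system-wide)."""
    max_segment_bytes = settings["shmmax"]        # shmmax is expressed in bytes
    total_bytes = settings["shmall"] * page_size  # shmall is expressed in pages
    return max_segment_bytes, total_bytes

# Page size of the machine running this sketch; the test machine may differ.
page_size = os.sysconf("SC_PAGE_SIZE")
segment, total = describe_shm(settings, page_size)
print(f"shmmax: {segment / 2**30:.1f} GiB per segment")
print(f"shmall: {total / 2**40:.1f} TiB in total (page size {page_size} bytes)")
```

So the same number means 1 GiB for shmmax but a far larger total for shmall.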


What's interesting to note is that after sending slapd a kill -HUP, the SIZE (according to
top) went down slowly from 3 GB to 2.5 GB, but the I/O was very high. When it had not gone
down any further after 30 minutes, I killed slapd with kill -6.
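
To follow the SIZE and RSS figures over a long run without watching top, the same numbers can be read from /proc/<PID>/status on Linux. This is only a sketch: the function names are made up for illustration, the sample text is invented (it is not output from the test machine), and it assumes that top's SIZE corresponds to VmSize.

```python
import re

def parse_vm_kib(status_text):
    """Extract VmSize and VmRSS (in KiB) from the text of /proc/<pid>/status."""
    found = {}
    for line in status_text.splitlines():
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            found[match.group(1)] = int(match.group(2))
    return found

def read_vm_kib(pid):
    """Read the current values for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_kib(status_file.read())

# Invented sample roughly matching "SIZE nearly 3 GB, RSS nearly 1 GB".
sample = "Name:\tslapd\nVmSize:\t 3145728 kB\nVmRSS:\t 1048576 kB\n"
usage = parse_vm_kib(sample)
print(usage)
```

Calling read_vm_kib(<PID>) once a minute and logging the result next to the number of entries added so far would show whether the growth is steady per entry or comes in jumps.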


I must verify tomorrow whether the same happens when using slapadd. With the help of one
of our programmers we tried to profile/debug slapd with memprof and valgrind, but
OpenLDAP didn't like that. It started but didn't open port 389. When I try to
attach to slapd with strace -p <PID>, I get only one line of output (while there is still
a lot of I/O). Btw., before profiling we built a debug-enabled version of BDB 4.2.52.2
and OpenLDAP 2.2.13.


I've never used gdb, so I don't know how to set breakpoints. Our developer thinks
that this wouldn't help very much, since it's only a malloc() that fails and we don't know
whether the malloc that fails is responsible for the leak. Maybe we will try to profile slapd
again tomorrow. There wasn't much time today.


Maybe one more thing that could be interesting: with "ulimit -v" I forced slapd to
use only 400 MB of virtual memory. I also reduced the cache size in DB_CONFIG:
set_cachesize 0 104857600 0
and the cachesize in slapd.conf from 10000 to 100. The result was the same, except that
the memory errors described appeared earlier.
That's all for now.
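
For reference, the sizes in this last test can be worked out as follows. The set_cachesize line takes gigabytes, additional bytes, and a number of cache regions; "ulimit -v" in bash takes its argument in KiB. This is only a sketch: the function name is made up for illustration, and it assumes that "400 MB" means 400 MiB.

```python
def cachesize_bytes(directive):
    """Total cache bytes from a 'set_cachesize <gbytes> <bytes> <ncache>' line."""
    name, gbytes, nbytes, _ncache = directive.split()
    if name != "set_cachesize":
        raise ValueError(f"not a set_cachesize line: {directive!r}")
    return int(gbytes) * 2**30 + int(nbytes)

cache = cachesize_bytes("set_cachesize 0 104857600 0")
limit = 400 * 2**20  # assumption: the 400 MB limit is 400 MiB

print(f"BDB cache: {cache // 2**20} MiB")
print(f"ulimit -v argument: {limit // 1024} (KiB)")
print(f"address space left for everything else: {(limit - cache) // 2**20} MiB")
```

Note that the cachesize directive in slapd.conf counts entries, not bytes, so it is not part of this calculation.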


Cheers,
Robert


Quanah Gibson-Mount wrote:



--On Sunday, June 13, 2004 3:43 PM +0200 RW <openldap@tauceti.net> wrote:


For the problems I mentioned above, it now really seems to be my own
fault. As for the DB corruption, I can now reproduce it. In
this case I'm loading 500,000 entries with ldapadd into the directory. An
entry consists of about 23 attributes. After about 440,000 entries I
get the following messages:


If you load the 500,000 entries via slapadd, do you see the same thing? ldapadd as a process to add large amounts of entries isn't generally advisable. Perhaps there is a memory leak somewhere. Have you used gdb to set a breakpoint when this error occurs, and do a backtrace? Some profiling of the running process to see its memory usage? It is hard to tell here if the issue is in OpenLDAP or BDB. Although I could see a memory leak causing a loss of resources...

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html