
Re: strange performance problem Update



Hello,

Did you apply the tpool.c patch for 2.1.16 that Howard Chu discussed on this list recently?

--Quanah

--On Tuesday, April 01, 2003 10:01 AM +0200 xoror@infuse.org wrote:

Hi all,

Over the last 2 weeks I've tried various things to solve the strange
behaviour that I'm seeing. Here is a short summary of what I tried and
the results.

1. Tried the suggestions from the mailing list:
- Disk problem -> can't be; I have 5 GB of free disk space left.
- Cache / memory problem -> I have 200 MB of free RAM. (How does one set
the cache parameters for bdb4, like the cache parameters for ldbm?)
- Locking problem:
the db_* utilities provided some useful information. I've made a log of
the locking overviews during the test runs, but I don't see an
explanation (yet) for the weird behaviour.
- Concurrency problems in my test programs:
I've double-checked the code and I'm pretty confident that there are no
concurrency problems. There's only 1 shared resource, and access to it is
controlled by a mutex.

2. Upgraded from bdb 4.1.24 to 4.1.25.
This helps a bit. With 4.1.25 I get the weird behaviour after 10 test
runs instead of 4 or 5, so the new bdb seems to 'fix' the problem
partially.

3. Tried to get ldbm working with openldap 2.1.x.
This failed. I don't know exactly why, but my guess is that it's a
library problem (I have bdb 1, bdb 3 and bdb 4 installed).

4. Installed openldap 2.0.27 on a new machine to test it with ldbm. The
results were very surprising: 2.0.27 with ldbm doesn't show the weird
behaviour. It also used at most 20% CPU for 32 clients. I've run the
tests about 80 times now, and I'm pretty confident that this version
doesn't have the problem described below. The performance is quite good:
the average test run time was about 6 seconds for 1000 tasks.
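
One way to pin down when the slowdown sets in (after 4 or 5 runs with
4.1.24, after about 10 with 4.1.25) would be to record the wall-clock
time of every test run and flag the first run that is much slower than
the early ones. The following is only a minimal sketch in Python; the
function name, the number of baseline runs and the factor of 5 are
assumptions made for illustration and are not taken from the actual
test program.

```python
from statistics import median


def first_slow_run(durations, baseline_runs=3, factor=5.0):
    """Return the 1-based number of the first run that is slower than
    `factor` times the median of the first `baseline_runs` runs,
    or None if no such run exists."""
    if len(durations) <= baseline_runs:
        return None  # not enough runs to compare against a baseline
    baseline = median(durations[:baseline_runs])
    later_runs = durations[baseline_runs:]
    for run_no, seconds in enumerate(later_runs, start=baseline_runs + 1):
        if seconds > factor * baseline:
            return run_no
    return None
```

With made-up durations of 9, 11, 13, 10 and 300 seconds (illustrative
values only, not measurements), the function returns 5.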

I suspect that there's something wrong in either 2.1.x or in the bdb4
implementation. I'm doing this research as part of my master's thesis.
I really want to get 2.1.x working correctly, since it should have
transactional capabilities (which are important to the application I
have in mind). Any hints on possible problems are welcome; I'll be glad
to look into them.

For now I'm digging into the bdb documentation (and the changelog, to
see what changed from .24 to .25 that could be related to my problem).

Thank you for your time,
Cuong

On Sun, 23 Mar 2003, Peter Marschall wrote:

Hi,

have you only tried it with back-bdb, or with back-ldbm too?
Just to get an idea if it is more related to the core of slapd or to a
specific backend.

Peter



On Sunday 23 March 2003 14:47, Cuong bui wrote:
> Hello all,
>
> I'm doing some research on openldap to see if it's suitable for a
> certain kind of application. This application requires more write
> operations from the ldap server than an 'average' usage of ldap.
>
> For this purpose I've written a multithreaded application to simulate
> simultaneous access to the ldap server. This application creates a
> variable number of threads that each simulate a client. There's a
> central queue of tasks that have to be performed. The queue has 3
> kinds of tasks (for now).
>
> 1. filter task. Simple search on a certain key
> 2. modify task. modification of a 'record'.
> 3. authentication task. (simple bind and unbind, simple authentication
> for now)
>
> The task queue is built in such a way that the types of tasks are
> mixed with each other.
>
> I've generated a test dataset of 50,000 records, and the .bdb files
> have been copied to another location (of course while the server was
> offline) in order to restore the data (instead of regenerating the
> test dataset, it can now be copied back).
>
> The test runs were very promising. For instance, a run with 1000 tasks
> (300 filter, 600 modification, 100 authentication) and 32 threads took
> only 9 seconds to complete. However, repeating the same test a few
> times gives a rather surprising effect. The first 4 runs all took
> about 9-13 seconds to complete. The 5th run and later ones take up to
> 5 minutes (!) or more. Simple searches with ldapsearch take a few
> minutes (after 4 test runs); before that they took only a fraction of
> a second.
>
> During the first 4 test runs the openldap server used 98% CPU (as
> seen in top). After the initial 4 runs, slapd only consumes 40%-75%
> CPU. Sometimes it even drops to a few %.
>
> The only way to get slapd working 'normally' again is to shut down the
> server and restore the dataset. I have defined an index on the field
> that I'm filtering on (without it, searches take a very long time to
> complete).
>
> The test server consists of a Pentium 4 2.53 GHz with 512 MB RAM and
> an ultra DMA (enabled) hdd, running Gentoo Linux with kernel 2.4.20.
> Openldap 2.1.16 was running on this machine (2.1.12 and 2.1.15 were
> also tested; they all gave this result). BDB was chosen as the backend
> for this test.
>
> The test client consists of a Pentium 4 1.8 GHz with Gentoo Linux
> running kernel 2.4.19. The 2 systems were connected by a 100 Mbit
> (switched) network. All the software was compiled with g++ 3.2.1.
>
> Does anyone have an explanation for this behaviour? Is this a known
> problem? All hints are welcome.
>
>
> Thanks in advance (and for your time)
> Cuong
>
> btw: is there any documentation on the implementation details of
> openldap? If yes, where can I find it? :) Other documentation that
> might explain what I'm seeing is also (very) welcome.
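
For readers following the thread, the test harness described in the
quoted message (a central queue of mixed tasks, drained by a variable
number of client threads, with a single mutex guarding the shared
queue) might look roughly like the sketch below. It is written in
Python for brevity and is not the original test program. The function
names, the use of a seeded shuffle to mix the task types, and the
`handlers` mapping are all assumptions; the actual LDAP search, modify
and bind calls are left to caller-supplied functions.

```python
import random
import threading
from collections import Counter, deque


def build_mixed_queue(n_filter, n_modify, n_auth, seed=0):
    """Return a deque of task-type names with the three kinds interleaved."""
    tasks = ["filter"] * n_filter + ["modify"] * n_modify + ["auth"] * n_auth
    # Assumption: a seeded shuffle is one simple way to mix the task types.
    random.Random(seed).shuffle(tasks)
    return deque(tasks)


def run_clients(tasks, handlers, n_threads):
    """Drain `tasks` with `n_threads` workers.

    `handlers` maps a task-type name to a zero-argument callable that
    performs the operation. Returns a Counter of completed tasks per type.
    """
    lock = threading.Lock()  # the single mutex guarding the shared state
    done = Counter()

    def worker():
        while True:
            with lock:
                if not tasks:
                    return  # queue is empty, this client is finished
                kind = tasks.popleft()
            handlers[kind]()  # the operation itself runs outside the lock
            with lock:
                done[kind] += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done
```

A run matching the mix described above would be
`run_clients(build_mixed_queue(300, 600, 100), handlers, 32)`, where
`handlers` supplies one function each for "filter", "modify" and "auth".
Keeping the server call outside the lock means the mutex only serialises
access to the queue and the counters, not the requests themselves.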

--
Peter Marschall
eMail: peter@adpm.de






--
Quanah Gibson-Mount
Senior Systems Administrator
ITSS/TSS/Computing Systems
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html