Re: Benchmarking

On Thu, 23 Mar 2000, Neil Hunter wrote:

> Thanks to all those contributing to the benchmarking topic, I thought I'd share
> my experiences as well. There's still some distance to go on our testing yet,
> including playing with a new solid-state disk system which should improve the
> build times a bit :)
> All the systems involved in our tests are PII or PIII's running RedHat 6.1 or
> 6.2 beta. The full dataset consists of around 3M entries and even though there
> isn't much data per entry it still amounts to over 1/2 Gb of LDIF.
> The tests so far have centred around a subset of this data (10K entries) and
> looking at the effects of caching and query types on performance. Xin's paper
> on LDAP performance covers similar ground to our testing and produces similar
> findings.

The number of entries does not matter as such; what matters is the space the
directory occupies, which is constrained by the available disk space.
Some of our tests used a larger directory, but most of our experiments ran on
10K entries, since those experiments do not depend on the directory size.

> For the current project, we only need a single equality index and build time
> for the small dataset is around 15s. (As yet I've no idea on the build time for
> the full dataset - it's been running for 2 days so far!)

Well, that is one of the reasons we did not use a very large directory for
most of the experiments, except for the directory-size experiment. Imagine
that one experiment needs a whole day just to populate the directory ;-)
And that we need 20 different directories, and that whenever an experiment
fails we have to re-populate the directory.
Honestly, our measurement experiments took a couple of months
because of overheads like these.
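
To put rough numbers on this, here is a small back-of-the-envelope sketch in
Python. It uses only the figures quoted in this thread (15 s to build 10K
entries, about 3M entries in the full dataset, and the imagined 20 directories
at one day each). The function names are made up for illustration, and the
linear scaling is an assumption, not a measurement.

```python
# Figures quoted in this thread.
SMALL_ENTRIES = 10_000       # entries in the small test dataset
SMALL_BUILD_SECONDS = 15     # reported build time for the small dataset
FULL_ENTRIES = 3_000_000     # approximate size of the full dataset


def linear_build_estimate(entries,
                          ref_entries=SMALL_ENTRIES,
                          ref_seconds=SMALL_BUILD_SECONDS):
    """Build time in seconds IF the cost per entry stayed constant.

    Constant per-entry cost is an assumption made only for comparison.
    """
    return entries * ref_seconds / ref_entries


def population_days(directories, days_per_load, reloads=0):
    """Total days spent loading directories, including any re-populations."""
    return (directories + reloads) * days_per_load


estimate = linear_build_estimate(FULL_ENTRIES)
print(f"linear estimate for the full build: {estimate:.0f} s "
      f"(~{estimate / 3600:.2f} h)")
print(f"20 directories at one day each: {population_days(20, 1)} days")
```

If the build scaled linearly, the full dataset would load in a little over an
hour. A build that is still running after two days is therefore far from
linear, which fits the observation that it slows down as it goes along.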

> We're more interested in the query performance of the SLAPD than the connection
> overheads at the moment, so the test script opens a connection to the SLAPD,
> binds and then carries out 5000 different queries before closing the
> connection. The time to execute this script is measured and an approximate
> number of queries/s can be calculated. (The scripts themselves are in Perl and
> use Net::LDAPapi to talk to the SLAPD. Through informal testing I've found
> Net::LDAPapi to perform better than the other Perl LDAP modules).

We ran similar experiments. Issuing multiple queries over one connection
saves the connection time, but does little for the search time itself.
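
For anyone who wants to reproduce this comparison, a minimal timing harness
could look like the Python sketch below. It is not the Perl script described
above. The `connect`, `search` and `close` callables are placeholders for
whatever LDAP client you use, and `FakeServer` is a stand-in I invented so
that the sketch runs without a server.

```python
import time


def run_batch(connect, search, close, queries, reuse_connection=True):
    """Run all queries and return the measured queries per second.

    connect(), search(conn, query) and close(conn) are caller-supplied
    placeholders, not the API of any particular LDAP library.
    """
    start = time.perf_counter()
    if reuse_connection:
        # One connect/bind for the whole batch.
        conn = connect()
        for query in queries:
            search(conn, query)
        close(conn)
    else:
        # Pay the connect/bind/disconnect cost on every query.
        for query in queries:
            conn = connect()
            search(conn, query)
            close(conn)
    elapsed = time.perf_counter() - start
    return len(queries) / max(elapsed, 1e-9)


class FakeServer:
    """Hypothetical stand-in that only counts calls."""

    def __init__(self):
        self.connects = 0
        self.searches = 0

    def connect(self):
        self.connects += 1
        return self

    def search(self, conn, query):
        self.searches += 1

    def close(self, conn):
        pass


queries = [f"(uid=user{i})" for i in range(5000)]
for reuse in (True, False):
    server = FakeServer()
    rate = run_batch(server.connect, server.search, server.close,
                     queries, reuse_connection=reuse)
    print(f"reuse={reuse}: {server.connects} connects, {rate:.0f} q/s")
```

With a real client in place of `FakeServer`, the difference between the two
rates is the connection overhead. The search time per query should stay
roughly the same in both modes.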

> Like Xin, we've looked at the effect of various cachesize and dbcachesize
> values vs query times. Performance definitely peaks if you can get the entire
> index into memory, but early tests have so far shown less dependence on the
> cachesize value. In fact larger values for cachesize appeared to adversely
> affect the performance.

Indeed. It is the memory that really helps, not the cachesize setting.
Cachesize is more helpful when populating the directory and when
caching the directory in memory.
Once it is larger than the physical memory, the system has to fall back on
swap space, which adversely affects performance.

> I've also informally tested GDBM vs the Berkeley DB libraries. With the same
> dataset, SLAPD configuration and queries, GDBM appears to be slightly faster.
> It also seems to produce slightly smaller index files.

One of the reasons is that GDBM uses less memory.

> With the SLAPD running on a P3 500MHz with 384Mb RAM and a RAID disk array and
> using P2 400MHz, 192Mb RAM clients I've had a sustained query rate of 155q/s
> per client. Remember that these queries are pretty lightweight operations -
> there's no connect/bind/disconnect operations per query which makes a huge
> difference.
> The interesting tests will be with the full dataset once it's built! At the
> moment it looks like the build actually slows down as it goes along - if anyone
> could explain the ldif2ldbm process in detail I'd be very interested!

That depends on where you put the directory.

> Performance for the full dataset is going to be largely IO limited due to its
> size. The ldbm files currently occupy ~2Gb of space - and they aren't complete!
> I look forward to seeing what effect the solid-state drive has on performance :)
> Hope you find this information interesting/useful - if you have any questions,
> fire away.
> Cheers,
>   Neil
> -------------------------------------------------------------------------------
>   Neil Hunter                                     Tel:    +44 (0)113 234 6073
>   Internet Systems Developer                      Fax:    +44 (0)113 234 6065
>   Planet Online Limited                           Mobile: +44 (0)7787 100 649
> -------------------------------------------------------------------------------