
Benchmarking



Thanks to all those contributing to the benchmarking topic; I thought I'd share
my experiences as well. We still have some distance to go on our testing,
including playing with a new solid-state disk system which should improve the
build times a bit :)

All the systems involved in our tests are PIIs or PIIIs running RedHat 6.1 or
6.2 beta. The full dataset consists of around 3M entries and even though there
isn't much data per entry it still amounts to over 1/2 GB of LDIF.
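
As a rough sanity check on "not much data per entry", the two figures above
can be divided out. This is only a back-of-the-envelope sketch: it treats
"over 1/2 GB" as a lower bound of exactly 0.5 GB, assumes binary gigabytes, and
uses the rounded figure of 3M entries, so the result (roughly 180 bytes) is a
floor rather than a measurement.

```python
def bytes_per_entry(total_bytes: float, entries: int) -> float:
    """Average LDIF size per entry, in bytes."""
    return total_bytes / entries


# Assumptions: 0.5 GB taken as a lower bound, 1 GB = 1024**3 bytes,
# and "around 3M entries" rounded to exactly 3,000,000.
ldif_bytes = 0.5 * 1024**3
entries = 3_000_000

print(f"at least ~{bytes_per_entry(ldif_bytes, entries):.0f} bytes per entry")
```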

The tests so far have centred around a subset of this data (10K entries) and
looking at the effects of caching and query types on performance. Xin's paper
on LDAP performance covers similar ground to our testing and produces similar
findings.

For the current project, we only need a single equality index and build time
for the small dataset is around 15s. (As yet I've no idea on the build time for
the full dataset - it's been running for 2 days so far!)

We're more interested in the query performance of the SLAPD than the connection
overheads at the moment, so the test script opens a connection to the SLAPD,
binds and then carries out 5000 different queries before closing the
connection. The time to execute this script is measured and an approximate
number of queries/s can be calculated. (The scripts themselves are in Perl and
use Net::LDAPapi to talk to the SLAPD. Through informal testing I've found
Net::LDAPapi to perform better than the other Perl LDAP modules.)
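
The shape of that measurement can be sketched as follows. This is not the
actual test script (which is Perl using Net::LDAPapi); it is a minimal Python
illustration of timing a batch of queries over one already-open connection.
The names measure_query_rate and fake_search, and the uid filter attribute,
are hypothetical; fake_search stands in for whatever search call a real LDAP
client would make.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_query_rate(
    run_query: Callable[[str], object], filters: Iterable[str]
) -> Tuple[int, float, float]:
    """Run each filter once and return (count, elapsed seconds, queries/s).

    Connect and bind are assumed to have happened before this call, so
    only the query work itself is inside the timed region.
    """
    count = 0
    start = time.perf_counter()
    for ldap_filter in filters:
        run_query(ldap_filter)
        count += 1
    elapsed = time.perf_counter() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


def fake_search(ldap_filter: str) -> list:
    # Hypothetical stand-in for a real equality search against the SLAPD.
    return [ldap_filter]


# 5000 different equality filters; the attribute name is an assumption.
filters = [f"(uid=user{i})" for i in range(5000)]
count, elapsed, rate = measure_query_rate(fake_search, filters)
print(f"{count} queries in {elapsed:.3f}s = {rate:.0f} q/s")
```

Swapping fake_search for a function that issues a real search would give a
figure comparable in kind to the ones quoted here, though not in value.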

Like Xin, we've looked at the effect of various cachesize and dbcachesize
values vs query times. Performance definitely peaks if you can get the entire
index into memory, but early tests have so far shown less dependence on the
cachesize value. In fact larger values for cachesize appeared to adversely
affect the performance.
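
One way to check whether an index could fit in memory is to compare file
sizes in the database directory against the configured dbcachesize. The sketch
below assumes that dbcachesize is a byte count applied per open index file, so
the largest file is the one that matters; the helper names and the temporary
directory standing in for the real ldbm directory are hypothetical.

```python
import os
import tempfile


def largest_file_bytes(directory: str) -> int:
    """Size in bytes of the largest regular file in directory (0 if none)."""
    sizes = [
        entry.stat().st_size
        for entry in os.scandir(directory)
        if entry.is_file()
    ]
    return max(sizes, default=0)


def fits_in_cache(directory: str, dbcachesize: int) -> bool:
    # Assumption: dbcachesize is in bytes and applies to each open file.
    return largest_file_bytes(directory) <= dbcachesize


# Demonstration on a throwaway directory rather than a real database.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, size in (("small.idx", 10), ("large.idx", 300)):
        with open(os.path.join(demo_dir, name), "wb") as handle:
            handle.write(b"\0" * size)
    demo_largest = largest_file_bytes(demo_dir)
    demo_fits_big = fits_in_cache(demo_dir, 100000)
    demo_fits_small = fits_in_cache(demo_dir, 100)

print(demo_largest, demo_fits_big, demo_fits_small)
```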

I've also informally tested GDBM vs the Berkeley DB libraries. With the same
dataset, SLAPD configuration and queries, GDBM appears to be slightly faster.
It also seems to produce slightly smaller index files.

With the SLAPD running on a P3 500MHz with 384MB RAM and a RAID disk array and
using P2 400MHz, 192MB RAM clients I've had a sustained query rate of 155 q/s
per client. Remember that these queries are pretty lightweight operations -
there are no connect/bind/disconnect operations per query which makes a huge
difference.
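
To put that rate in more familiar terms, it can be turned into a per-query
time and a run time for the 5000-query script. This is plain arithmetic on the
quoted figure and assumes the 155 q/s rate holds across a whole run; the
function names are made up for illustration.

```python
def seconds_per_query(queries_per_second: float) -> float:
    """Average wall-clock time per query at a given sustained rate."""
    return 1.0 / queries_per_second


def run_duration(num_queries: int, queries_per_second: float) -> float:
    """Time in seconds to issue num_queries at a given sustained rate."""
    return num_queries / queries_per_second


rate = 155.0   # sustained q/s per client, as quoted
batch = 5000   # queries per test run

print(f"{seconds_per_query(rate) * 1000:.2f} ms per query")   # 6.45 ms
print(f"{run_duration(batch, rate):.1f} s per {batch}-query run")  # 32.3 s
```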

The interesting tests will be with the full dataset once it's built! At the
moment it looks like the build actually slows down as it goes along - if anyone
could explain the ldif2ldbm process in detail I'd be very interested!

Performance for the full dataset is going to be largely IO limited due to its
size. The ldbm files currently occupy ~2GB of space - and they aren't complete!
I look forward to seeing what effect the solid-state drive has on performance :)

Hope you find this information interesting/useful - if you have any questions,
fire away.

Cheers,
  Neil

-------------------------------------------------------------------------------
  Neil Hunter                                     Tel:    +44 (0)113 234 6073
  Internet Systems Developer                      Fax:    +44 (0)113 234 6065
  Planet Online Limited                           Mobile: +44 (0)7787 100 649
-------------------------------------------------------------------------------