indexed search performance
I compared the back-bdb backend with the back-ldbm backend
using the DirectoryMark messaging scenario.
1) back-bdb with transactional data store
2) back-ldbm with a full entry cache (entries in cache = entries in the directory)
We ran the benchmark for 5 minutes after an initial ALLID search for cache warming.
- Test Environment
64011 entries generated by DirectoryMark dbgen
objectClass : eq
sn, description, seeAlso : eq
cn : eq, sub
Pentium III 1GHz 1CPU, 512MB RAM
simulation of 8 clients each having 2 search threads
messaging scenario : mimics MTAs searching for user IDs
slapd backends under comparison : HEAD as of Dec 20, 2001.
back-bdb : with txn support
back-ldbm : with a full entry cache
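The attribute indexing listed above corresponds to slapd.conf index directives along the following lines (a sketch of the configuration, not the exact file used in the test):

```
# equality index on objectClass
index   objectClass             eq
# equality indices on the ID-bearing attributes
index   sn,description,seeAlso  eq
# equality and substring indices on cn
index   cn                      eq,sub
```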
The DirectoryMark messaging scenario assumes MTAs such as sendmail search for
user IDs in the directory. The scenario consists of exact-match searches over
ID fields (e.g. the description attribute) of inetOrgPerson
(organizationalPerson) entries. A total of 8 clients were set up in this test,
each of them having 2 concurrent search threads. The directory has 64011 entries.
During the test, slapd automatically spawned 32 threads for search tasks.
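Concretely, the queries in this scenario are simple equality filters; the attribute value below is made up for illustration:

```
(description=ID0001234)
```

Each such filter is served by the equality index on description listed in the test environment above.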
The following table shows the dissection of average execution time
(real time) of one of the 32 threads for search operations.
                                     back-bdb   back-ldbm
Entire Search                            9887         986
 +ASN.1 Decoding                          364         118
  -dn_normalize                           148          45
  -get_filter                             128          21
  -ber_scanf                               34          22
  -select_backend                          20           1
  -misc                                    36          29
 +Backend Search                         9423         847
  +base entry retrieval                  3024          24
   -dn2id(_matched)                      2716          11
   -id2entry                              283          10
  +candidate selection                   1073         267
   +filter_candidate                     1063         260
    +list_candidate                      1051         254
     +equality_candidate                 1014         105
      -key_read / intersection            922          69
  +id2entry retrieval                     939          15
   -db read                               879
  -test_filter                            144          91
  -entry transmission                     855         270
  -thread_yield                          3255         105
 -status transmission                      32          23
The above data represents the average time to perform search operations.
Each number is the time spent in the corresponding sub-operation, in microseconds.
The first column lists the sub-operations in a hierarchical manner.
From the numbers, we can see that the effect of caches is significant.
The average search time of back-ldbm-cache is 10 times lower than that of back-bdb.
The low base entry retrieval time of back-ldbm comes from the entry cache
keyed by DN, while the low id2entry retrieval time comes from the entry cache
keyed by entry ID.
The performance advantage of back-ldbm in key_read seems to come from
caching in the underlying database layer.
From the above observations, it seems worthwhile to implement entry caches
for back-bdb as well, even in the case that entries are appropriately indexed.
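The two caches credited above can be sketched as a single structure that indexes entries both by DN and by entry ID, with LRU eviction. This is a minimal illustration in Python, not slapd's actual C implementation; all names and the eviction policy here are assumptions:

```python
from collections import OrderedDict

class EntryCache:
    """Sketch of an entry cache keyed both by DN and by entry ID.

    by_id holds the entries in LRU order; by_dn maps a DN to its
    entry ID so a base-entry lookup by DN avoids dn2id/id2entry
    database reads, mirroring the back-ldbm behavior in the table.
    """

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self.by_id = OrderedDict()   # entry ID -> entry (dict), LRU-ordered
        self.by_dn = {}              # DN -> entry ID

    def put(self, entry_id, dn, entry):
        # Insert or refresh an entry under both keys.
        if entry_id in self.by_id:
            self.by_id.move_to_end(entry_id)
        self.by_id[entry_id] = entry
        self.by_dn[dn] = entry_id
        if len(self.by_id) > self.max_entries:
            # Evict the least recently used entry from both maps.
            _, old_entry = self.by_id.popitem(last=False)
            self.by_dn.pop(old_entry["dn"], None)

    def get_by_id(self, entry_id):
        entry = self.by_id.get(entry_id)
        if entry is not None:
            self.by_id.move_to_end(entry_id)   # mark as recently used
        return entry

    def get_by_dn(self, dn):
        entry_id = self.by_dn.get(dn)
        return None if entry_id is None else self.get_by_id(entry_id)
```

A base-entry retrieval then becomes a dictionary lookup instead of the dn2id plus id2entry database reads that dominate the back-bdb column above.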
I'll get back with more data from the DirectoryMark addressing scenario.
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
(phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979