indexed search performance
I compared the back-bdb backend with the back-ldbm backend
using the DirectoryMark messaging scenario.
1) back-bdb with transactional data store
2) back-ldbm with a full entry cache (entries in cache = entries in the directory)
We ran the benchmark for 5 minutes after an initial ALLID search for cache warming.
- Test Environment
64011 entries generated by DirectoryMark dbgen
objectClass : eq
sn, description, seeAlso : eq
cn : eq, sub
Pentium III 1GHz 1CPU, 512MB RAM
simulation of 8 clients each having 2 search threads
messaging scenario : mimics MTAs searching for user IDs
slapd backends under comparison : HEAD as of Dec 20, 2001.
back-bdb : with txn support
back-ldbm : with a full entry cache
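The attribute indexing listed above corresponds to slapd.conf index directives along the following lines (a sketch of the configuration, not the exact file used in the test):

```
# equality index on objectClass
index   objectClass             eq
# equality indices on the ID-bearing attributes
index   sn,description,seeAlso  eq
# equality and substring indices on cn
index   cn                      eq,sub
```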
The DirectoryMark messaging scenario assumes MTAs such as sendmail search for
user IDs in the directory. The scenario consists of exact-match searches over
ID fields (e.g. the description attribute) of inetOrgPerson
(organizationalPerson) entries. A total of 8 clients were set up in this test,
each of them having 2 concurrent search threads. The directory has 64011 entries.
During the test, slapd automatically spawned 32 threads for search tasks.
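Concretely, the queries in this scenario are simple equality filters; the attribute value below is made up for illustration:

```
(description=ID0001234)
```

Each such filter is served by the equality index on description listed in the test environment above.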
The following table shows the dissection of average execution time
(real time) of one of the 32 threads for search operations.
                                     back-bdb   back-ldbm
Entire Search                            9887         986
 +ASN.1 Decoding                          364         118
  -dn_normalize                           148          45
  -get_filter                             128          21
  -ber_scanf                               34          22
  -select_backend                          20           1
  -misc                                    36          29
 +Backend Search                         9423         847
  +base entry retrieval                  3024          24
   -dn2id(_matched)                      2716          11
   -id2entry                              283          10
  +candidate selection                   1073         267
   +filter_candidate                     1063         260
    +list_candidate                      1051         254
     +equality_candidate                 1014         105
      -key_read / intersection            922          69
  +id2entry retrieval                     939          15
   -db read                               879
  -test_filter                            144          91
  -entry transmission                     855         270
  -thread_yield                          3255         105
 -status transmission                      32          23
The above data represents the average time to perform search operations.
Each number is the time spent in the corresponding sub-operation, in microseconds.
The first column lists the sub-operations in a hierarchical manner.
From the numbers, we can see that the effect of caches is significant.
The average search time of back-ldbm-cache is 10 times lower than that of back-bdb.
The low base entry retrieval time of back-ldbm comes from the entry cache
keyed by DN, while the low id2entry retrieval time comes from the entry cache
keyed by entry ID.
The performance advantage of back-ldbm in key_read seems to come from
caching in the underlying database layer.
From the above observations, it seems worthwhile to implement entry caches
for back-bdb as well, even in the case that entries are appropriately indexed.
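The two caches credited above can be sketched as a single structure that indexes entries both by DN and by entry ID, with LRU eviction. This is a minimal illustration in Python, not slapd's actual C implementation; all names and the eviction policy here are assumptions:

```python
from collections import OrderedDict

class EntryCache:
    """Sketch of an entry cache keyed both by DN and by entry ID.

    by_id holds the entries in LRU order; by_dn maps a DN to its
    entry ID so a base-entry lookup by DN avoids dn2id/id2entry
    database reads, mirroring the back-ldbm behavior in the table.
    """

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self.by_id = OrderedDict()   # entry ID -> entry (dict), LRU-ordered
        self.by_dn = {}              # DN -> entry ID

    def put(self, entry_id, dn, entry):
        # Insert or refresh an entry under both keys.
        if entry_id in self.by_id:
            self.by_id.move_to_end(entry_id)
        self.by_id[entry_id] = entry
        self.by_dn[dn] = entry_id
        if len(self.by_id) > self.max_entries:
            # Evict the least recently used entry from both maps.
            _, old_entry = self.by_id.popitem(last=False)
            self.by_dn.pop(old_entry["dn"], None)

    def get_by_id(self, entry_id):
        entry = self.by_id.get(entry_id)
        if entry is not None:
            self.by_id.move_to_end(entry_id)   # mark as recently used
        return entry

    def get_by_dn(self, dn):
        entry_id = self.by_dn.get(dn)
        return None if entry_id is None else self.get_by_id(entry_id)
```

A base-entry retrieval then becomes a dictionary lookup instead of the dn2id plus id2entry database reads that dominate the back-bdb column above.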
I'll get back with more data from the DirectoryMark addressing scenario.
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
(phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979