
back-bdb performance



> -----Original Message-----
> From: owner-openldap-devel@OpenLDAP.org
> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Ganesan R

> I have been watching your posts on performance comparisons with back-ldbm
> with interest. I think you mentioned that the last numbers you posted were
> bogus; any updates on them yet? Our requirement for OpenLDAP in production
> is at least six months away, so I am keen on using back-bdb with
> transactions.

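I'm still tweaking code and gathering data. Here are some (accurate) times
for using ldapadd to load 10000 entries into slapd (with no attribute
indexing). Each line is csh "time" output: user CPU seconds, system CPU
seconds, elapsed wall-clock time, and CPU utilization, followed by memory,
I/O and page-fault counters: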

back-ldbm, a couple days ago
ldadd 6.260u 0.870s 0:44.09 16.1%     0+0k 0+0io 975pf+0w
slapd 30.740u 6.100s 0:48.54 75.8%    0+0k 0+0io 4858pf+0w

back-bdb, a couple days ago
ldadd 6.400u 1.230s 6:34.97 1.9%      0+0k 0+0io 1561pf+0w
slapd 136.410u 223.560s 6:38.83 90.2% 0+0k 0+0io 3193pf+0w

back-bdb, with the newer entry_encode/decode routines
ldadd 6.600u 1.040s 7:02.39 1.8%      0+0k 0+0io 2354pf+0w
slapd 153.930u 245.990s 7:06.63 93.7% 0+0k 0+0io 3172pf+0w

back-bdb, with no root DN_SUBTREE index
ldadd 6.410u 1.020s 2:36.57 4.7%      0+0k 0+0io 2050pf+0w
slapd 59.720u 50.030s 2:39.74 68.7%   0+0k 0+0io 3107pf+0w

back-ldbm, with no root DN_SUBTREE index
ldadd 6.370u 0.750s 0:36.34 19.5%     0+0k 0+0io 265pf+0w
slapd 24.110u 5.400s 0:41.21 71.6%    0+0k 0+0io 4852pf+0w
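
To compare these runs numerically, the elapsed fields can be converted to
seconds. The following is only a sketch: the helper name elapsed_seconds is
made up for illustration, and the slapd elapsed times are copied from the
runs above.

```python
def elapsed_seconds(field: str) -> float:
    """Convert an elapsed-time field such as '6:34.97' or '1:02:03' to seconds."""
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


ENTRIES = 10000

# slapd elapsed times, copied verbatim from the runs above
runs = {
    "back-ldbm": "0:48.54",
    "back-bdb": "6:38.83",
    "back-bdb, newer entry_encode/decode": "7:06.63",
    "back-bdb, no root DN_SUBTREE index": "2:39.74",
    "back-ldbm, no root DN_SUBTREE index": "0:41.21",
}

baseline = elapsed_seconds(runs["back-ldbm"])
for name, field in runs.items():
    secs = elapsed_seconds(field)
    # slowdown relative to the first back-ldbm run, and load rate
    print(f"{name:38s} {secs:7.2f}s {secs / baseline:5.2f}x "
          f"{ENTRIES / secs:6.1f} entries/s")
```

By this measure the first back-bdb run takes roughly 8 times as long as
back-ldbm, and dropping the root DN_SUBTREE index brings that down to a bit
over 3 times.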

For write operations, it's quite obvious that all of those index updates are
costing a lot in terms of CPU time and I/O operations. The cost arises from
the transaction logging. The actual volume of data that slapd is managing is
reasonably small; with my first entry_encode/decode routines the id2entry
database was around 20MB for the 10000 entries. The transaction logs
generated from loading those entries were over 1.5GB. With my current
entry_encode/decode routines, the id2entry database is now down to about
10MB, but the transaction logs were still over 1.2GB. After I eliminated the
DN_SUBTREE index for the backend's suffix, as you can see there was a
dramatic saving in time, and the transaction logs totalled only 520MB.
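
To put those sizes in proportion, here is a small arithmetic sketch. It
assumes decimal units (1GB = 1000MB) and treats the rounded "about" and
"over" figures above as exact, so the results are only approximate. The
function names are made up for illustration.

```python
ENTRIES = 10000


def amplification(log_mb: float, data_mb: float) -> float:
    """Transaction log volume per unit of id2entry data."""
    return log_mb / data_mb


def log_kb_per_entry(log_mb: float, entries: int = ENTRIES) -> float:
    """Average transaction log volume per loaded entry, in KB."""
    return log_mb * 1000 / entries


# (id2entry size in MB, transaction log size in MB), from the text above.
# The id2entry size for the last case is not stated, so it is left as None.
cases = {
    "first entry_encode/decode": (20, 1500),
    "current entry_encode/decode": (10, 1200),
    "no root DN_SUBTREE index": (None, 520),
}

for name, (data_mb, log_mb) in cases.items():
    line = f"{name:28s} {log_kb_per_entry(log_mb):6.0f} KB of log per entry"
    if data_mb is not None:
        line += f", log is {amplification(log_mb, data_mb):.0f}x the data"
    print(line)
```

On these figures the log is roughly 75 times the size of the id2entry data
with the first routines and roughly 120 times with the current ones, or on
the order of 120-150KB of log for each entry added.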

I'm currently investigating an alternate indexing layout in the hopes that I
can reduce the transaction cost even further. Ultimately I would like to
find a way to make the transaction cost effectively disappear. I have
another variant of back-bdb that uses a hierarchical data structure, thus
completely eliminating the dn2id database. I got it into working order just
today and ran it successfully through the test suite. For the same 10000
entries, this backend loads them in only 27 seconds.

back-hdb
ldadd 5.960u 0.800s 0:27.44 24.6%     0+0k 0+0io 265pf+0w
slapd 16.030u 2.270s 0:29.94 61.1%    0+0k 0+0io 2995pf+0w

The transaction logs for loading these 10000 entries amount to only 20.8MB.
The id2entry database and tree structure consume only 9.8MB.
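
The difference between a flat DN index with subtree lists and a hierarchical
layout can be illustrated with a toy model. This is not the actual back-bdb
or back-hdb code or on-disk format: the class names, the example suffix, and
the "one write per index record touched" cost model are all assumptions made
for the sketch, and it leaves out one-level indexing and Berkeley DB paging
and logging details.

```python
from collections import defaultdict

SUFFIX = "dc=example,dc=com"  # hypothetical suffix, used only for this sketch


def ancestors(dn, suffix=SUFFIX):
    """Yield each proper ancestor of dn, nearest first, ending at the suffix.

    Naive: splits on ',' and ignores escaped commas in RDN values.
    """
    parent = dn
    while parent and parent != suffix:
        parent = parent.partition(",")[2]
        if parent:
            yield parent


class FlatIndex:
    """Toy flat layout: full DN -> ID, plus a subtree ID list per ancestor."""

    def __init__(self, index_suffix_subtree=True):
        self.base = {}                    # full DN -> entry ID
        self.subtree = defaultdict(list)  # ancestor DN -> IDs beneath it
        self.index_suffix_subtree = index_suffix_subtree
        self.writes = 0                   # index records written (toy cost)

    def add(self, dn, eid):
        self.base[dn] = eid
        self.writes += 1
        for anc in ancestors(dn):
            if anc == SUFFIX and not self.index_suffix_subtree:
                continue  # skip the suffix's subtree list entirely
            self.subtree[anc].append(eid)
            self.writes += 1


class TreeIndex:
    """Toy hierarchical layout: (parent ID, RDN) -> ID, with no full-DN keys."""

    def __init__(self):
        self.children = {}  # (parent ID, RDN) -> entry ID; the suffix is ID 0
        self.writes = 0

    def lookup(self, dn):
        """Resolve a DN to an ID by walking down from the suffix."""
        if dn == SUFFIX:
            return 0
        relative = dn[: -len(SUFFIX) - 1]  # strip the trailing ",<suffix>"
        node = 0
        for rdn in reversed(relative.split(",")):
            node = self.children[(node, rdn)]
        return node

    def add(self, dn, eid):
        rdn, _, parent = dn.partition(",")
        self.children[(self.lookup(parent), rdn)] = eid
        self.writes += 1


def load(index, users=100):
    """Add one container and `users` entries beneath it; return total writes."""
    index.add("ou=people," + SUFFIX, 1)
    for n in range(users):
        index.add(f"cn=user{n},ou=people,{SUFFIX}", n + 2)
    return index.writes


print("flat, suffix subtree indexed:", load(FlatIndex()))
print("flat, no suffix subtree list:", load(FlatIndex(index_suffix_subtree=False)))
print("hierarchical:                ", load(TreeIndex()))
```

In this model every add under the suffix touches the suffix's subtree list,
which ends up holding every entry ID in the database, while the hierarchical
layout writes a single record per entry. The price is that resolving a DN
means walking down from the suffix one RDN at a time.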

Of course, this backend still uses the existing indexing layout, so as soon
as you turn on attribute indices the transaction overhead skyrockets again.
So I'm returning my attention to the index management for now...

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support