Relative merits of slapd backends
I'm curious whether anyone has experience with the different hashing
backends (e.g. GDBM, various versions of Berkeley DB, the new mdbm) and can
provide comparisons or anecdotes in terms of memory usage, ability to handle
concurrent access, indexing speed, locking caveats, reliability, etc.
Reliability is obviously important and has been a problem for Schlumberger
(and now Omnes/me ;-). We have OpenLDAP v1.0.2 on Solaris 2.5.1 with DB 1.85
deployed over several machines around the world serving a directory of about
60,000 entries. Corrupted indices seem to be a fact of life and typically
happen during data cleaning, which involves a script running over a large
chunk of the directory and performing many updates. Is this more likely a
flaky backend or a bug in slapd itself?
Regenerating indices is very slow with the above setup: it can take
several hours on a Sparc Ultra3000 to create 30 or so indexes (each
typically eq,sub,pres). Would a different backend help (beyond better
I/O and more RAM)? Several hours becomes expensive when an index becomes
corrupted and must be rebuilt.
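For concreteness, the rebuild procedure I have in mind is the usual offline
dump-and-reload for the 1.x ldbm backend. This is only a sketch: the paths
are assumptions, and the tool names and flags (ldbmcat, ldif2ldbm -i/-f,
the id2entry.dbb filename) are from memory of the 1.x distribution, so
check them against your build's man pages before running anything.

```shell
#!/bin/sh
# Hedged sketch of an offline index rebuild for an OpenLDAP 1.x ldbm
# database. The ldbm commands are left commented out; edit the paths,
# verify the flags against your man pages, then uncomment.
set -e

CONF=/usr/local/etc/openldap/slapd.conf   # assumed location of slapd.conf
DBDIR=/usr/local/var/openldap-ldbm        # assumed database directory

# 1. Stop slapd so no writers touch the files during the rebuild.
# kill `cat /usr/local/var/slapd.pid`

# 2. Dump the current entries to LDIF from the id2entry file.
# ldbmcat $DBDIR/id2entry.dbb > /tmp/backup.ldif

# 3. Move the old (possibly corrupt) files aside and rebuild everything,
#    entries and all indexes, from the LDIF dump.
# mv $DBDIR $DBDIR.bad && mkdir $DBDIR
# ldif2ldbm -i /tmp/backup.ldif -f $CONF

echo "rebuild sketch: edit the paths above and uncomment to run"
```

The full dump/reload is what dominates those multi-hour rebuild times, which
is why the number of indexed attributes matters so much.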
The hard-nosed question at the moment for me is: is there a significant
benefit (or evil consequence) to going with Berkeley DB versus GDBM (or some
other hashing package)?
Many thanks, Paul.
This is actually a legacy phenomenon: apparently the old UofM code used
to dump core when generating indexes on the fly, so it was operationally less
error-prone to pre-generate far more than was typically needed. I intend to
prune these back at some stage.
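For what it's worth, pruning in a 1.x-style slapd.conf just means trimming
the index directives, e.g. dropping sub/pres for attributes that only ever
see equality searches. The attribute names below are purely illustrative,
not our actual configuration:

```
# slapd.conf, ldbm backend -- attribute names are illustrative
index   cn,sn,mail      eq,sub,pres
index   uid             eq
```

Each attribute-and-type combination kept here is another index file that has
to be rebuilt after corruption, so cutting the list directly cuts the rebuild
time.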