
RE: Backend databases -- what are the differences?



Most of the backends offer very diverse sets of features, so it's not
meaningful to directly compare them to each other.

Of the available databases, back-bdb and back-ldbm are the most full-featured
as primary database backends. These backends directly manage database files
for storing directory entries.

Back-ldap and back-meta are special purpose backends designed to proxy
requests between clients and other remote servers.

back-monitor is a status monitoring backend that gives operating statistics
on slapd itself.

back-null does nothing.

back-passwd is a piece of demonstration code whose main purpose is to
illustrate the backend interface. It happens to do this by mapping queries
onto /etc/passwd, somewhat like an LDAP version of finger.

back-perl, back-shell, and back-tcl are interfaces to external scripts
written in their respective languages. Since each of these languages offers
the ability to spawn external programs, these backends are essentially
interfaces to any kind of code you'd care to write in any language of your
choice.

back-dnssrv is a special purpose backend that maps search queries with DNs of
the form dc=foo,dc=com into DNS queries to return a URL for the LDAP server
that handles the specified DNS domain. It is essentially an LDAP server
locator.
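
As a rough illustration of that mapping, here is a minimal Python sketch. It
is not the back-dnssrv code. It assumes the lookup follows the usual
_ldap._tcp SRV naming convention, and the function names are made up for this
example. The comma split is naive and ignores escaped commas in a DN.

```python
def dn_to_srv_name(dn: str) -> str:
    """Turn a DN made only of dc= components into a DNS SRV query name.

    Assumption: the locator queries _ldap._tcp.<domain>, the conventional
    SRV name for LDAP servers.
    """
    labels = []
    for rdn in dn.split(","):  # naive split: no support for escaped commas
        attr, sep, value = rdn.strip().partition("=")
        if sep != "=" or attr.strip().lower() != "dc" or not value.strip():
            raise ValueError(f"not a domain-component DN: {dn!r}")
        labels.append(value.strip())
    return "_ldap._tcp." + ".".join(labels)


def srv_to_ldap_url(host: str, port: int) -> str:
    """Format one SRV target (host, port) as an LDAP URL."""
    return f"ldap://{host.rstrip('.')}:{port}"


if __name__ == "__main__":
    print(dn_to_srv_name("dc=foo,dc=com"))
    # The host and port below are placeholders, not real lookup results.
    print(srv_to_ldap_url("ldap1.foo.com.", 389))
```

The actual DNS query is left out on purpose; the point is only the
DN-to-domain step described above.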

back-perl, back-shell, and back-tcl are directly comparable in purpose and
function. The main difference is what language you prefer.

back-ldap and back-meta are directly comparable, as back-meta is a proper
superset of back-ldap and back-ldap code is shared with back-meta.

back-bdb and back-ldbm are comparable in purpose and back-bdb evolved from
experience gained from back-ldbm, but the two are quite distinct today. They
both store entries based on a 32-bit entry ID key, and they use a dn2id table
to map from DNs to entry IDs. They both perform attribute indexing using the
same code, and store index data as lists of entry IDs. As such, the
LDAP-specific features they offer are nearly identical. The differences are
in the APIs used to implement the databases. back-ldbm uses a generic
database API that can plug into ndbm, gdbm, BDB, mdbm, or any other database
package that supports the (key,data) pair style of access from the original
Unix dbm library. While BerkeleyDB supports this generic interface, it also
offers a much richer API that has a lot more power and a lot more complexity.
back-bdb is written specifically for the full BDB API, and uses some of BDB's
more advanced features to offer transaction processing, fine grained locking,
and other features that offer improved concurrency and reliability.
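
The layout shared by both backends (entries keyed by a 32-bit entry ID, a
dn2id table, and indexes stored as lists of entry IDs) can be sketched in a
few lines of Python. This is a toy model, not slapd code: plain dicts stand in
for the (key,data) database files, and the class name, the big-endian key
packing, and the lowercase DN normalization are all assumptions made for the
example.

```python
import struct


class TinyStore:
    """Toy model of the dn2id / id2entry / index layout."""

    def __init__(self):
        self.dn2id = {}     # normalized DN -> packed 32-bit entry ID
        self.id2entry = {}  # packed 32-bit entry ID -> (dn, attrs)
        self.index = {}     # (attr, value) -> list of entry IDs
        self.next_id = 1

    @staticmethod
    def _key(entry_id):
        # Pack the ID as 4 bytes; the byte order here is an assumption.
        return struct.pack(">I", entry_id)

    def add(self, dn, attrs, indexed=("cn",)):
        ndn = dn.lower()  # simplistic normalization, for illustration only
        if ndn in self.dn2id:
            raise KeyError(f"entry already exists: {dn}")
        entry_id = self.next_id
        self.next_id += 1
        key = self._key(entry_id)
        self.dn2id[ndn] = key
        self.id2entry[key] = (dn, attrs)
        # Each index record is a list of the entry IDs that match.
        for attr in indexed:
            for value in attrs.get(attr, []):
                self.index.setdefault((attr, value.lower()), []).append(entry_id)
        return entry_id

    def lookup_dn(self, dn):
        """DN -> entry ID -> entry, via the two tables."""
        return self.id2entry[self.dn2id[dn.lower()]]

    def search_eq(self, attr, value):
        """Equality search answered from the index, returning matching DNs."""
        ids = self.index.get((attr, value.lower()), [])
        return [self.id2entry[self._key(i)][0] for i in ids]
```

Because every lookup goes through simple key fetches, any dbm-style library
could back these tables, which is the property back-ldbm relies on.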

With back-ldbm, there is no fine-grained control over the record locks that
the database uses. It pretty much uses whole-file locks on each database
file, which is why you cannot use slapcat/slapadd/slapindex while slapd is
running. (And if you somehow manage to make the tools bypass the lock
mechanism, you will assuredly corrupt the database.)

With back-bdb, databases are locked on a page level, which means that
multiple threads (and processes) can operate on the databases concurrently.
In OpenLDAP 2.1.4 we completely lifted the restriction against using the slap
tools while slapd is running *on back-bdb*. You can perform online backups
using slapcat or BDB's db_dump utility without interrupting your LDAP
service. You can bulk add new entries using slapadd while the server is
running. (You can run slapindex too, but it generally will be a no-op, since
slapadd already does indexing.)

Using BDB's transaction logging means that every modification request is
logged in a separate log file before any database files are modified. If the
server crashes in the middle of an update, you can recover easily with no
data loss or corruption. Barring catastrophic disk hardware failures, when
the database returns "success" for an update operation, you know that the
update was completed cleanly on disk.
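
The log-before-modify idea can be shown with a small Python sketch. This is a
toy write-ahead log, not BDB's log format or API: the class name and the
JSON-lines record layout are invented for the example, and the "database" is
just an in-memory dict rebuilt from the log.

```python
import json
import os


class LoggedStore:
    """Toy key/value store that logs each update before applying it."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()

    def _replay(self):
        # Recovery: re-apply every complete log record, in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, "r", encoding="utf-8") as log:
            for line in log:
                if not line.endswith("\n"):
                    break  # torn final record from a crash mid-write
                try:
                    record = json.loads(line)
                except ValueError:
                    break  # corrupt record: stop at the last good one
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        record = json.dumps({"key": key, "value": value})
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(record + "\n")
            log.flush()
            os.fsync(log.fileno())  # record is on disk before we proceed
        # Only now is the store itself modified and success reported.
        self.data[key] = value
```

If the process dies after the fsync but before the store is updated, reopening
the store replays the log and the update is not lost. If it dies while the
record is being written, the incomplete record is ignored and the caller never
saw "success" for it.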

There are other differences between the two that are really only visible in
the code itself. back-ldbm stores entries in LDIF format, while back-bdb
stores them in a binary format that is 3-4 times faster to read and write.
back-ldbm's index management is reminiscent of filesystem inodes, with direct
blocks and indirect blocks, and individual index blocks are malloc'd and
free'd on demand. back-bdb's index management is much simpler, and blocks are
malloc'd and free'd much less frequently, which again yields better
performance. (Unfortunately, not all of these optimizations are visible
at the user level, because the BDB transaction subsystem adds considerable
overhead to back-bdb. For the most part the fanatical tuning efforts in
back-bdb only bring it on par with back-ldbm in terms of performance.)

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

> -----Original Message-----
> From: owner-openldap-software@OpenLDAP.org
> [mailto:owner-openldap-software@OpenLDAP.org]On Behalf Of Banzaitron
> Sent: Friday, September 06, 2002 3:27 PM
> To: openldap-software@OpenLDAP.org
> Subject: Backend databases -- what are the differences?
>
>
> I was wondering what advantages/disadvantages (if any) there are
> between the
> various LDAP backends (BDB, LDBM, LDAP, etc).  I am using BDB, but only
> because that is the default with the build files.  I didn't really see
> anything in the openldap website documentation that explained the
> differences between them, only how to specify which one you wanted.  Which
> one is being used by most people?
>
> Thanks,
> Andy
>
>
>