OpenLDAP Faq-O-Matic: What are the different backends? What are their differences?

OpenLDAP Software includes a number of backends which may be used with `slapd`(8), and it may not be obvious which ones to use. Most of the backends offer very different sets of features, so it is not meaningful to compare them directly to each other.

back-mdb is the "primary" storage database backend. It manages directory objects in an embedded database and is more fully featured than other backends. back-mdb is superior to the deprecated back-hdb and back-bdb backends.

back-bdb and back-hdb are deprecated backends based on BerkeleyDB. They generally should no longer be used. A license change by Oracle has made BDB 6 and later releases incompatible with OpenLDAP Software.

back-ldap and back-meta are special purpose backends designed to forward (proxy) requests to other remote servers.

back-ldif is a basic storage backend that stores entries in text files in LDIF format and uses the filesystem to create the tree structure of the database. It is intended as a cheap, low-performance, easy-to-use backend, and higher-level internal structures use it to provide permanent storage. It is primarily used for the cn=config online configuration database.

back-monitor is a status monitoring backend that gives operating statistics on `slapd`(8) itself.

back-null does nothing. It is the LDAP equivalent of `/dev/null`.

back-passwd is a piece of demonstration code whose main purpose is to illustrate the backend interface. It happens to do this by mapping queries onto `/etc/passwd`, somewhat like an LDAP version of finger.

back-perl and back-shell are interfaces to external scripts written in their respective languages. Since each of these languages offers the ability to spawn external programs, these backends are essentially interfaces to code written in any language of your choice. back-shell is generally viewed as deprecated in favor of back-perl.
back-perl is generally viewed as deprecated in favor of back-sock.

back-relay maps a naming context defined in a database running in the same `slapd` instance into a virtual naming context, with attributeType and objectClass manipulation if required. It requires the slapo-rwm overlay.

back-sock uses an external program to handle queries, similar to back-shell. In this case, however, the external program listens on a Unix domain socket. This makes it possible to have a pool of processes that persist between requests, which allows multithreaded operation and a higher level of efficiency. This module may also be used as an overlay on top of some other database, which allows external actions to be triggered in response to operations on the main database.

back-dnssrv is a special purpose backend that maps search queries with DNs of the form `dc=foo,dc=com` into DNS queries to return a URL for the LDAP server that handles the specified DNS domain. It is essentially an LDAP server locator. It is experimental in nature. See OpenLDAP LDAP Root Service for more information.

back-sql is an RDBMS backend, mapping LDAP queries into SQL queries. It is considered experimental.

HISTORICAL INFORMATION

back-bdb, back-hdb, and back-ldbm are the "primary" storage database backends. These backends manage directory objects in an embedded database and are more fully featured than other backends. back-hdb is generally superior to back-bdb (especially as back-hdb supports subtree renames) but tends to require larger caches than back-bdb. back-ldbm is obsolete and should not be used.
back-perl and back-shell are directly comparable in purpose and function. However, as back-shell suffers from a number of limitations (it doesn't support threads, is not extensible, etc.), back-perl is generally recommended over back-shell.

back-ldap and back-meta are directly comparable: back-meta is a proper superset of back-ldap, and back-ldap code is shared with back-meta.

back-bdb, back-hdb and back-ldbm are comparable in purpose. back-bdb evolved from experience gained from back-ldbm, but the two are quite distinct today. back-hdb is a further refinement of back-bdb, and most considerations for back-bdb apply equally to back-hdb.

back-bdb and back-ldbm both store entries based on a 32-bit entry ID key, and they use a dn2id table to map from DNs to entry IDs. They both perform attribute indexing using the same code, and store index data as lists of entry IDs. As such, the LDAP-specific features they offer are nearly identical.
The differences are in the APIs used to implement the databases. back-ldbm uses a generic database API that can plug into GDBM, MDBM, BerkeleyDB (BDB), or any other database package that supports the (key,data) pair style of access. While BerkeleyDB supports this generic interface, it also offers a much richer API that has a lot more power and a lot more complexity. back-bdb is written specifically for the Berkeley DB Transactional Data Store API. That is, back-bdb uses BDB's most advanced features to offer transactional consistency, durability, fine-grained locking, and other features that improve concurrency, reliability, and usability.

With back-ldbm, there is no fine-grained database locking. This means write operations are serialized. And while multiple read operations may be performed concurrently, they cannot be performed concurrently with any write operation. Additionally, LDBM databases can be accessed by only one program at a time (generally at the file level). You may be able to bypass the locking mechanism, but doing so will likely corrupt the database or return bogus information.

With back-bdb, databases are locked on a page level, which means that multiple threads (and processes) can operate on the databases concurrently. In OpenLDAP 2.1.4 we lifted the restriction against using the slap tools while `slapd` is running on back-bdb. You can perform online backups using `slapcat` or BDB's `db_dump` utility without interrupting your LDAP service. You still must not use `slapadd` or `slapindex` while `slapd` is running (due to application-level caching in `slapd`(8)). Note that the alock feature added in OpenLDAP 2.3 automatically prevents `slapadd` or `slapindex` from being used while `slapd` is running.

Using BDB's transaction logging means that every modification request is logged in a separate log file before any database files are modified. If the server crashes in the middle of an update, you can recover easily with no data loss or corruption.
Barring catastrophic disk hardware failures, when the database returns "success" for an update operation, you know that the update was completed cleanly on disk.

There are many other differences between the two that are really only visible in the code itself. For example, back-ldbm stores entries in LDIF format, and back-bdb stores them in a binary format that is 3-4 times faster to read and write. back-ldbm's index management is reminiscent of filesystem inodes, with direct blocks and indirect blocks, and individual index blocks are malloc'd and freed on demand. back-bdb's index management is much simpler, and blocks are malloc'd and freed much less frequently, which again yields better performance.

As a historical note, the back-ldbm code is a direct descendant of the original University of Michigan code. Because of its age and byzantine data structures, the code was becoming unmaintainable, and since back-bdb had proven itself to be more reliable, the decision was made to delete back-ldbm from the code base.

hyc@openldap.org, Kurt@OpenLDAP.org, quanah@openldap.org
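The online backup with `slapcat` mentioned above might look like the following minimal sketch. The suffix, the backup directory, the `DRY_RUN` switch and the two helper functions are assumptions made for illustration, not part of the original answer; only the `slapcat` options `-b` (suffix) and `-l` (output LDIF file) are standard. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Sketch: online LDIF backup of a running slapd with slapcat.
# Hypothetical names (not from the FAQ): SUFFIX, BACKUP_DIR, DRY_RUN,
# backup_cmd and backup_now.
SUFFIX="${SUFFIX:-dc=example,dc=com}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/ldap}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = run it

# Print the slapcat command line for a suffix ($1) and an output file ($2).
backup_cmd() {
    echo "slapcat -b $1 -l $2"
}

# Dump the database that serves $SUFFIX into a dated LDIF file.
backup_now() {
    out="$BACKUP_DIR/backup-$(date +%Y%m%d).ldif"
    if [ "$DRY_RUN" = "1" ]; then
        backup_cmd "$SUFFIX" "$out"
    else
        slapcat -b "$SUFFIX" -l "$out"
    fi
}

backup_now
```

Run it with `DRY_RUN=0` to perform the dump. The resulting LDIF file is the kind of input that `slapadd` expects when a database has to be rebuilt.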
HISTORICAL

Perhaps some anecdotal information may help people see the difference between ldbm and bdb. We were happily using ldbm as a backend at Columbia, and getting search responses in about 0.03 seconds. However, at various points in the day a program would make a large number (300+) of adds and modifies to our OpenLDAP server, and search times would suddenly jump to 3 or 4 seconds. After switching to bdb, things improved drastically. Even with the same large number of adds and modifies occurring, search times only increased to about 0.04 seconds.

phr2101@columbia.edu, quanah@openldap.org
HISTORICAL

As a further testimony to the limitations of ldbm/gdbm, the ldbm/gdbm combination also has a 2 GB file size limitation that can leave one with a corrupted directory! If gdbm attempts to write any file past the 2 GB file size (2 * 1024 * 1024 * 1024 = 2147483648 bytes), it will abort and die, which will then cause slapd to die. The symptoms are that a file is 2147483647 (2 GB - 1) bytes and slapd runs for a little while but dies when a write is attempted. There is nothing in the log file and nothing prints out unless you run slapd with the -d option so that it doesn't fork. Then you see the gdbm error saying that it was unable to write to the file (but it doesn't tell you which one). If you look at the end of the 2 GB file, for example with 'tail id2entry.gdbm', you'll see that it was probably interrupted in the middle of the write, so now it's corrupted as well.

Any directory with reads and writes will typically have gdbm files much larger than the amount of data they contain, due to the "sparse files" design of gdbm. In my specific case, a 2 GB id2entry.gdbm shrank to 32 MB when it was restored.

Since my corrupted file was id2entry.gdbm, the damage was most severe: this file is the main data store, and all of the other .gdbm files are indexes. Since that main file was corrupted, I could not recover my data with 100% certainty.

How to restore these files: In my specific case, I had slave ldap servers, so I had a copy of the directory from before the corruption occurred. This is because slapd dying on the master also prevented it from writing the info to the slurpd replog, so the data never replicated out to the slaves.
I performed the following steps:

1) stop slurpd on the master (slapd had already died),
2) stop slapd on one of the slaves,
3) slapcat to an ldif file,
4) rsync the ldif over to the corrupted master,
5) save a copy of the corrupted directory db files,
6) delete the corrupted directory db files,
7) slapadd the ldif file (which creates new directory db files),
8) change ownership to the user slapd runs as (ldap:ldap in my case),
9) delete the replog (replication) files,
10) stop slapd on all slaves,
11) rsync the new files out to the slaves,
12) restart the slapd daemon on the master,
13) restart the slapd daemons on the slaves,
14) restart the slurpd daemon on the master.

Unfortunately, this is only a quick fix. The root problem is that my directory db files could grow beyond the point where performing the above steps can fix it. The correct fix is to convert to a db that doesn't have this limitation, namely Berkeley DB, preferably 4.2.52 with patches (see the mailing list archives). But the above steps will get you out of a tight spot, get the OpenLDAP server running again, and give you time to plan a migration.

tlyons@ivenue.com, quanah@openldap.org
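The master-side part of the procedure above (steps 5 to 9) could be scripted roughly as follows. This is a dry-run sketch under stated assumptions rather than the commands the author actually ran: the paths in `DB_DIR`, `LDIF` and `REPLOG`, the `OWNER` variable and the `run` and `restore_master` helpers are all hypothetical, and stopping and starting the daemons is left out because it depends on the local init system. By default it only prints each command, prefixed with `+`.

```shell
#!/bin/sh
# Dry-run sketch of master-side steps 5-9 from the list above.
# Hypothetical names and paths (not from the original): DB_DIR, LDIF,
# REPLOG, OWNER, DRY_RUN, run, restore_master.
DB_DIR="${DB_DIR:-/var/lib/ldap}"             # directory holding the *.gdbm files
LDIF="${LDIF:-/tmp/directory.ldif}"           # slapcat output copied from a slave
REPLOG="${REPLOG:-/var/lib/ldap/replog}"      # slurpd replication log
OWNER="${OWNER:-ldap:ldap}"                   # user:group that slapd runs as
DRY_RUN="${DRY_RUN:-1}"                       # 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_master() {
    # 5) save a copy of the corrupted directory db files
    run cp -a "$DB_DIR" "$DB_DIR.corrupt"
    # 6) delete the corrupted directory db files
    run find "$DB_DIR" -type f -name '*.gdbm' -exec rm -f {} +
    # 7) rebuild the database from the LDIF taken on the slave
    run slapadd -l "$LDIF"
    # 8) give the new files to the user slapd runs as
    run chown -R "$OWNER" "$DB_DIR"
    # 9) delete the replog so stale changes are not replayed
    run rm -f "$REPLOG"
}

restore_master
```

Review the printed plan first, and set `DRY_RUN=0` only once slapd and slurpd are stopped on the master.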


This document is: http://www.openldap.org/faq/index.cgi?file=756
