
Re: Open LDAP and PostgreSQL or Interbase

To: openldap-software@OpenLDAP.org
Subject: Re: Open LDAP and PostgreSQL or Interbase
From: Mike Douglass <douglm@rpi.edu>
Date: Fri, 02 Jun 2000 10:54:07 -0400
In-reply-to: <200006012133.QAA22119@cougar.vuse.vanderbilt.edu>

At 04:33 PM 6/1/00 -0500, you wrote:

Can anyone give us pointers to documentation that describes how to
build a server that uses some other database manager as the backend?
Alternatively, does anyone know why we shouldn't be trying to do this?


So that's why he was leaving Conan O'Brian

Andy Richter, System Administrator
School of Engineering
Vanderbilt University

I don't think it can be put much better than the following message, which appeared on this list. Perhaps this should be in the FAQ and reposted automatically every 2 weeks.

From: Julio Sánchez Fernández <j_sanchez@stl.es>
Date: Wed, 10 May 2000 10:55:54 +0200
Subject: Re: Viability of OpenLDAP
Message-ID: <3919241A.FCD541B7@stl.es>
CC: openldap-stable@OpenLDAP.org

Dave Carrigan wrote:

> AFAIK netscape directory server uses berkeley db. You don't want an
> industrial strength database (i.e., one with transactions, triggers,
> stored procedures, relations, sql, etc.) for ldap. You want something
> that is relatively lightweight and is screaming fast on reads. Berkeley
> is both.

Let me add to that.  We are all confronted all the time with the choice
of relational database vs. directory.  It is a hard choice and no simple
answer exists.

It is tempting to think that having a DB backend to the directory solves
all problems.  However, it is a pig.  This is because the data models are
*very* different.  So normalizing the directory data requires multiple
tables.

Think for a moment about the 'person' objectclass.  Its definition
requires attribute types 'objectclass', 'sn' and 'cn' and allows
attribute types 'userPassword', 'telephoneNumber', 'seeAlso' and
'description'.  All of these attributes are multivalued, so a
normalization requires putting each attribute type in a separate
table.

Now you have to decide on appropriate keys for those tables.  The
primary key might be a combination of the DN, but this becomes
rather inefficient on most database implementations.

The big problem now is that accessing data from one entry requires
seeking on different disk areas.  On some applications this may
be OK but in many applications performance suffers.
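As a rough illustration of the point above, here is a minimal sketch in Python with the standard sqlite3 module.  The schema (one table per attribute type, keyed by DN) and the helper names add_entry and read_entry are assumptions made for this example, not anything taken from slapd.  It shows that reading back a single entry takes one query per attribute table.

```python
import sqlite3

# Hypothetical schema: one table per attribute type of 'person', keyed by DN.
ATTRS = ["objectclass", "sn", "cn", "telephoneNumber"]

conn = sqlite3.connect(":memory:")
for attr in ATTRS:
    conn.execute(f'CREATE TABLE "{attr}" (dn TEXT NOT NULL, value TEXT NOT NULL)')


def add_entry(dn, attrs):
    """Store each value of each attribute as one row in that attribute's table."""
    for attr, values in attrs.items():
        conn.executemany(
            f'INSERT INTO "{attr}" VALUES (?, ?)', [(dn, v) for v in values]
        )


def read_entry(dn):
    """Rebuild one entry: this needs one SELECT per attribute table."""
    entry = {}
    for attr in ATTRS:
        rows = conn.execute(
            f'SELECT value FROM "{attr}" WHERE dn = ? ORDER BY rowid', (dn,)
        ).fetchall()
        if rows:
            entry[attr] = [r[0] for r in rows]
    return entry
```

With four attribute types, read_entry already issues four separate lookups for a single entry, which is the scattered access pattern described above.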

The only attribute types that can be put in the main table entry
are those that are mandatory and single-value.  You may add also
the optional single-valued attributes and set them to NULL or
something if not present.

But wait, the entry can have *multiple* objectclasses and they are
organized in an inheritance hierarchy.  An entry of objectclass
'organizationalPerson' now has the attributes from 'person' plus
a few others and some formerly optional attribute types are now
mandatory.

What to do?  Should we have different tables for the different
objectclasses?  This way the person would have an entry on the
'person' table, another on 'organizationalPerson', etc.  Or
should we get rid of person and put everything on the second
table?

But what do we do with a filter like '(cn=*)' where cn is an
attribute type that appears in many, many objectclasses?  Should
we search all possible tables for matching entries?  Not very
attractive.
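To make the cost of that filter concrete, here is a small sketch in Python with sqlite3.  The one-table-per-objectclass layout, the simplified single cn column, and the function name presence_filter_cn are assumptions for illustration only.  Evaluating '(cn=*)' turns into a UNION over every table that carries a cn column.

```python
import sqlite3

# Hypothetical layout: one table per objectclass. 'cn' appears in several of them.
TABLES_WITH_CN = ["person", "organizationalRole", "device"]

conn = sqlite3.connect(":memory:")
for table in TABLES_WITH_CN:
    conn.execute(f'CREATE TABLE "{table}" (dn TEXT, cn TEXT)')


def presence_filter_cn():
    """Evaluate (cn=*): one SELECT per table that has a cn column."""
    sql = " UNION ".join(
        f'SELECT dn FROM "{table}" WHERE cn IS NOT NULL' for table in TABLES_WITH_CN
    )
    return sorted(row[0] for row in conn.execute(sql))
```

Every new objectclass that allows cn adds another branch to the query, so the work grows with the schema rather than with the data.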

Once this point is reached, three approaches come to mind.  One is
to do full normalization so that each attribute type, no matter
what, has its own separate table.  The simplistic approach where
the DN is part of the primary key is extremely wasteful, and calls
for an approach where the entry has a unique numeric id that is
used instead for the keys and a main table that maps DNs to ids.
The approach, anyway, is very inefficient when several attribute
types from one or more entries are requested.  Such a database,
though cumbersome, can be managed from SQL applications.
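A minimal sketch of this first approach, again in Python with sqlite3.  The table names entries, attr_cn and attr_sn and both helper functions are hypothetical, chosen only for this example.  A main table maps each DN to a numeric id, and each attribute type has its own table keyed by that id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (id INTEGER PRIMARY KEY, dn TEXT UNIQUE NOT NULL);
CREATE TABLE attr_cn (entry_id INTEGER REFERENCES entries(id), value TEXT);
CREATE TABLE attr_sn (entry_id INTEGER REFERENCES entries(id), value TEXT);
""")


def add_person(dn, cns, sns):
    """Insert the DN once, then key every attribute value by the numeric id."""
    eid = conn.execute("INSERT INTO entries (dn) VALUES (?)", (dn,)).lastrowid
    conn.executemany("INSERT INTO attr_cn VALUES (?, ?)", [(eid, v) for v in cns])
    conn.executemany("INSERT INTO attr_sn VALUES (?, ?)", [(eid, v) for v in sns])
    return eid


def cn_values_by_sn(sn):
    """Plain SQL any application could run: all cn values of entries with this sn."""
    rows = conn.execute(
        """
        SELECT e.dn, c.value
        FROM entries e
        JOIN attr_sn s ON s.entry_id = e.id
        JOIN attr_cn c ON c.entry_id = e.id
        WHERE s.value = ?
        ORDER BY e.dn, c.value
        """,
        (sn,),
    )
    return rows.fetchall()
```

The data stays queryable from ordinary SQL, but even this two-attribute query needs two joins, and each further requested attribute type adds one more.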

The second approach is to put the whole entry as a blob in a
table shared by all entries regardless of the objectclass and
have additional tables that act as indices for the first table.
Index tables are not database indices, but are fully managed
by the LDAP server-side implementation.  This is exactly the
approach used by the ldbm backend in slapd.  However, the
database becomes unusable from SQL.  And, thus, a fully fledged
database system provides little or no advantage.  The full
generality of the database is unneeded.  Much better to use
something light and fast, like Berkeley DB.  And it is cheap, too.
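The second approach can be sketched without any SQL at all.  The Python below uses plain dictionaries as a stand-in for a key/value store such as Berkeley DB.  The names id2entry, dn2id, eq_index and the JSON serialization are assumptions for illustration, not the actual on-disk layout of the ldbm backend.

```python
import json
from collections import defaultdict

id2entry = {}                 # entry id -> whole entry serialized as one blob
dn2id = {}                    # normalized DN -> entry id
eq_index = defaultdict(set)   # (attr, normalized value) -> set of entry ids
INDEXED = {"cn", "sn"}        # attribute types the server keeps indices for


def add_entry(dn, attrs):
    """Store the entry as one blob and maintain the index tables by hand."""
    eid = len(id2entry) + 1
    id2entry[eid] = json.dumps({"dn": dn, "attrs": attrs})
    dn2id[dn.lower()] = eid
    for attr in INDEXED.intersection(attrs):
        for value in attrs[attr]:
            eq_index[(attr, value.lower())].add(eid)
    return eid


def search_eq(attr, value):
    """Equality match via the index; unindexed attributes simply return nothing here."""
    ids = eq_index.get((attr, value.lower()), set())
    return [json.loads(id2entry[i])["dn"] for i in sorted(ids)]
```

Fetching a whole entry is a single key lookup, and all index maintenance lives in the server code, so nothing here benefits from a query language or a relational engine.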

A completely different way to see this is to give up any hopes of
implementing the directory data model.  In this case, LDAP is used
as an access protocol to data that provides only superficially the
directory data model.  For instance, it may be read only or, where
updates are allowed, restrictions are applied, such as making
single-value attribute types that would allow for multiple values.
Or the impossibility to add new objectclasses to an existing entry
or remove one of those present.  The restrictions span the range
from allowed restrictions (that might be elsewhere the result of
access control) to outright violations of the data model.  It can be,
however, a method to provide LDAP access to preexisting data that is
used by other applications.  But with the understanding that we don't
really have a "directory".
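As a sketch of this third approach, the Python below presents rows of a preexisting table as person-like entries.  The employees table, its columns, the base DN and the function names are all hypothetical, and DN escaping is ignored for brevity.  Every attribute is forced to a single value and updates are refused, which are exactly the kinds of restrictions described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table owned by some other application.
conn.execute(
    "CREATE TABLE employees "
    "(login TEXT PRIMARY KEY, full_name TEXT, surname TEXT, phone TEXT)"
)

BASE_DN = "ou=people,dc=example,dc=com"  # assumed suffix for the example


def row_to_entry(row):
    """Map one row to a person-like entry; each attribute gets exactly one value."""
    login, full_name, surname, phone = row
    entry = {
        "dn": f"cn={full_name},{BASE_DN}",
        "objectclass": ["person"],
        "cn": [full_name],
        "sn": [surname],
    }
    if phone is not None:
        entry["telephoneNumber"] = [phone]
    return entry


def search_all():
    """Read-only search over the existing data."""
    rows = conn.execute(
        "SELECT login, full_name, surname, phone FROM employees ORDER BY login"
    )
    return [row_to_entry(r) for r in rows]


def add_value(dn, attr, value):
    """Updates are refused: the view cannot hold a second value for any attribute."""
    raise PermissionError("read-only view over data owned by another application")
```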

Existing commercial LDAP server implementations that use a
relational database are either of the first kind or the third.
I don't know of any implementation that uses a relational database
to do inefficiently what BDB does efficiently.

So the problem is hard.

Julio

Mike Douglass		douglm@rpi.edu
Senior Systems Programmer	
Server Support Services	518 276 6780(voice) 2809 (fax)
Rensselaer Polytechnic Institute 110 8th Street, Troy, NY 12180

References:
- Open LDAP and PostgreSQL or Interbase
  - From: war@vuse.vanderbilt.edu
