Re: Backends for Reliability



Ian A. Tegebo wrote:
My shop is mostly running PostgreSQL with perl scripts that access our
data.  We got more data and users over time and had a hard time keeping
up with load, data consistency, and interoperability with other
software.

In comes OpenLDAP. Great. LDAP can replicate our data and provide data
consistency and interoperability, but crashes with the back-bdb had left
us with data loss; granted, I did not employ all of Sleepycat's recovery
mechanisms but that left our management with some doubts about its
reliability.

Kurt already responded to this point. I'll note that Symas has had customers using back-bdb since its release in 2002 with zero data loss. All it takes is actually reading the documentation.


But I don't think there's a good solution for using an RDBMS (PostgreSQL) as your universal backend. Neither of the items I found in my research is really
appropriate.

In general, RDBMSs and LDAP are at such cross purposes that there really is no good solution from an architectural standpoint. Throw performance into the requirements and it's a non-starter.


			Main Goal
We want to be able to centralize our data in a cluster and then
distribute and access it with slapd.
			---------

With MySQL or PostgreSQL we could do the first part, but then not easily be able to access it from slapd. Of course, going the other way is easy. If we centralized all of our data in slapd, we'd be even more freaked out about data loss as we would create a single point of failure at the master slapd's backend.

In comes Berkeley DB's High Availability product.  This looks fantastic
in terms of our Main Goal:

http://www.sleepycat.com/docs/ref/rep/intro.html

Unfortunately:

http://www.openldap.org/lists/openldap-software/200402/msg00666.html

And this is an argument I've been putting off; I contend that slapd
replication is a good thing, but that there is a compelling reason to
include Berkeley DB replication.  If I rely solely on slapd for
replication I can run into trouble.

(Please correct my understanding of replication mechanisms.)

Note that BDB's replication mechanism is also single-master, so it offers no advantage over OpenLDAP's existing mechanisms. And BDB's replication is all-inclusive; as such it does not support fractional or partial replication, and it precludes any differences in indexing between master and slaves.
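
As a rough illustration of the partial replication and per-server indexing that syncrepl permits and BDB-level replication does not, a consumer's slapd.conf might look something like the fragment below. This is only a sketch: the host name, suffix, directory path, bind DN, password, and filter are placeholders, and the set of syncrepl parameters varies between releases, so check slapd.conf(5) for your version.

```
# Consumer slapd.conf fragment (sketch; every name below is a placeholder)
database    bdb
suffix      "dc=example,dc=com"
directory   /var/lib/ldap-consumer

# Indexing on the consumer is chosen independently of the provider.
index       objectClass eq
index       uid         eq

# Partial replication: pull only one subtree and only matching entries.
# The attrs parameter (not shown) can additionally restrict which
# attributes are replicated; see slapd.conf(5) before relying on it.
syncrepl    rid=001
            provider=ldap://provider.example.com
            type=refreshOnly
            interval=00:00:10:00
            searchbase="ou=People,dc=example,dc=com"
            scope=sub
            filter="(objectClass=inetOrgPerson)"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=com"
            credentials=secret
```

With BDB's own replication, by contrast, every replica would carry the complete database files, indexes included.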


Right now, only masters/providers can modify their backends from client
requests. So if the master/provider goes down, the slaves/consumers may
have the data, but they cannot accept updates nor forward requests for
writes. I cannot think of a way that slaves/consumers could fail over to
another master/provider to allow updates to happen; and even if they
did, you would have created the problem of having to sync/rebuild the
original provider when it comes back up.

It's quite simple to enable the multimaster code in slapd for automatic failover. With syncrepl a server that stopped will automatically resync itself when it restarts.
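
For the resync part, a minimal sketch of a syncrepl stanza that keeps a persistent connection to the provider and keeps retrying after an outage could look like this. The host, suffix, bind DN, and password are placeholders, and the retry parameter may not exist in older releases; consult slapd.conf(5) for the version you run. Enabling multimaster is a separate matter and is not shown here.

```
# Sketch only; all names and values are placeholders.
# type=refreshAndPersist keeps the search open after the initial refresh.
# retry="60 10 300 +" means: retry every 60 seconds 10 times,
# then every 300 seconds indefinitely.
syncrepl    rid=002
            provider=ldap://provider.example.com
            type=refreshAndPersist
            retry="60 10 300 +"
            searchbase="dc=example,dc=com"
            bindmethod=simple
            binddn="cn=replicator,dc=example,dc=com"
            credentials=secret
```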


And then there's back-perl. I thought that for a moment I wouldn't have
to deal with the whole objectClass->table mapping problem by finding some
perl module to do the job for me and then using perl's DBI module to
make the final connection to PostgreSQL. Sorry dude. Again, let me know if
you've got something. I briefly tried to find some sort of repository
of known LDAP schema->SQL translations I could just load; no dice.

Ultimately, the only way to use an RDBMS for LDAP is not to. That is, if you forgo most of the relational capabilities and just use it as an elementary data store, you can make it work, but you've gained no advantage at all and lost a lot of performance by trying.


--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun        http://highlandsun.com/hyc
 OpenLDAP Core Team            http://www.openldap.org/project/