Backends for Reliability
My shop mostly runs PostgreSQL, with Perl scripts that access our
data. We got more data and users over time and had a hard time keeping
up with load, data consistency, and interoperability with other
applications.
In comes OpenLDAP. Great. LDAP can replicate our data and provide data
consistency and interoperability, but crashes with back-bdb left
us with data loss; granted, I did not employ all of Sleepycat's recovery
mechanisms, but that left our management with some doubts about its
reliability.
We started toying around with the idea of using back-sql with
PostgreSQL. The list, along with the man pages, has pointed out that
back-sql was not designed as a universal backend, but for providing
access to data already in an RDBMS.
If I understand correctly, after playing with the test setup from the
HOWTOs, every time we want to add new kinds of entries, e.g. storing
user data and then wanting to store SSL certificates, we'd need to
create tables, metadata, and then access functions. Please correct
me if I'm wrong.
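To make that concrete, here is roughly what adding one new entry type looks like. The ldap_oc_mappings and ldap_attr_mappings tables are the real back-sql metadata tables; the certs table, column names, and stored-procedure names are hypothetical, just to show the shape of the work:

```sql
-- Hypothetical table holding certificates, plus the back-sql metadata
-- needed to expose it. Table and procedure names are illustrative.
CREATE TABLE certs (
    id      serial PRIMARY KEY,
    cn      varchar(255) NOT NULL,
    cert    text NOT NULL
);

-- Tell back-sql how to map an objectClass to the table...
INSERT INTO ldap_oc_mappings
    (id, name, keytbl, keycol, create_proc, delete_proc, expect_return)
VALUES
    (2, 'strongAuthenticationUser', 'certs', 'id',
     'SELECT create_cert()', 'SELECT delete_cert(?)', 0);

-- ...and how to map each attribute to a column.
INSERT INTO ldap_attr_mappings
    (id, oc_map_id, name, sel_expr, from_tbls, join_where,
     add_proc, delete_proc, param_order, expect_return)
VALUES
    (20, 2, 'userCertificate', 'certs.cert', 'certs', NULL,
     'SELECT add_cert(?,?)', 'SELECT del_cert(?,?)', 3, 0);
```

And that's before writing create_cert()/delete_cert() themselves, which is exactly the per-entry-type overhead I'm worried about.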
Considering how we want to keep adding to the number and types of data
stored in our directory, and do not want to hire a full-time DBA, this
does not seem at all reasonable. I thought about looking for some kind
of Schema->SQL translator that would simplify this process. My efforts
did not yield much: SQLFairy, ruby-ldapserver, and ???
But I don't think there's a good solution for using an RDBMS (PostgreSQL) as
your universal backend. Neither of the items found in my research is
really a complete solution.
We want to be able to centralize our data in a cluster and then
distribute and access it with slapd.
With MySQL or PostgreSQL we could do the first part, but then not
easily be able to access it from slapd. Of course, going the other
way is easy. If we centralized all of our data in slapd, we'd be
even more freaked out about data loss as we would create a single
point of failure at the master slapd's backend.
In comes Berkeley DB's High Availability product. This looks fantastic
in terms of our main goal. And this is an argument I've been putting
off: I contend that slapd replication is a good thing, but that there
is a compelling reason to also include Berkeley DB replication. If I
rely solely on slapd for replication, I can run into trouble.
(Please correct my understanding of replication mechanisms.)
Right now, only masters/providers can modify their backends in response
to client requests. So if the master/provider goes down, the
slaves/consumers may have the data, but they can neither accept updates
nor forward write requests. I cannot think of a way that
slaves/consumers could fail over to another master/provider to allow
updates to continue; and even if they could, you've created the problem
of having to sync/rebuild the old provider when it comes back up.
I started to imagine a setup where slaves/consumers would have
multiple updateref URLs referring to slapds running on top of the
Berkeley DB cluster; they could fail over to subsequent ones after a
timeout. Alternatively, one could use round-robin DNS: sweet.
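For the updateref idea, the consumer-side config would look something like the fragment below. updatedn and updateref are real slapd.conf directives, and my reading of slapd.conf(5) is that updateref may be given multiple times; the hostnames and suffix are of course made up:

```
# slapd.conf fragment on a slave/consumer (sketch; hostnames are
# illustrative). Writes sent here get referred to the masters listed.
database        bdb
suffix          "dc=example,dc=com"
updatedn        "cn=replica,dc=example,dc=com"
updateref       ldap://master1.example.com
updateref       ldap://master2.example.com
```

The open question is whether clients chasing the referral will actually try the second URL after the first times out, or whether that failover logic ends up living in the client libraries.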
This is turning into quite a report, so I'll just add a few other
avenues I've explored.
PostgreSQL has table inheritance as one of its DDL features; this might
be used to simplify the LDAP schema translation in a way that could be
easily scripted. If someone has already come up with a more
programmatic way to do this, please chime in. Most of the back-sql
setups I've seen are idiosyncratic, or geared only towards user data.
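What I have in mind with inheritance is something like the following sketch: a parent table per parent objectClass, with child objectClasses as child tables (table and column names are illustrative, not a worked-out mapping):

```sql
-- Sketch of using PostgreSQL table inheritance to mirror LDAP schema
-- inheritance (person -> inetOrgPerson). Names are illustrative.
CREATE TABLE person (
    id  serial PRIMARY KEY,
    cn  varchar(255) NOT NULL,
    sn  varchar(255) NOT NULL
);

CREATE TABLE inetorgperson (
    mail         varchar(255),
    displayname  varchar(255)
) INHERITS (person);

-- Selecting from the parent also returns rows from the child tables,
-- which is roughly how a search for objectClass=person should also
-- match inetOrgPerson entries.
SELECT cn, sn FROM person;
```

A script walking the LDAP schema could emit this DDL mechanically, which is the part I was hoping someone had already written.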
And then there's back-perl. For a moment I thought I wouldn't have
to deal with the whole objectClass->table mapping problem by finding some
Perl module to do the job for me and then using Perl's DBI module to
make the final connection to PostgreSQL. Sorry, dude. Again, let me know if
you've got something. I briefly tried to find some sort of repository
of known LDAP schema->SQL translations I could just load; no dice.
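For what it's worth, here is the kind of back-perl module I was picturing. The calling convention (method names, search arguments, result code plus LDIF-string entries as the return value) is based on my reading of the SampleLDAP.pm example shipped with slapd-perl, so treat it as an assumption; the database name, credentials, and query are placeholders:

```perl
# Sketch of a back-perl module answering searches from PostgreSQL
# via DBI. Interface details assumed from the SampleLDAP.pm example.
package PgBackend;

use strict;
use warnings;
use DBI;

sub new {
    my ($class) = @_;
    my $self = {
        dbh => DBI->connect('dbi:Pg:dbname=ldap', 'ldap', 'secret',
                            { RaiseError => 1 }),
    };
    return bless $self, $class;
}

sub search {
    my ($self, $base, $scope, $deref,
        $sizeLim, $timeLim, $filter, $attrsOnly, @attrs) = @_;

    # Toy mapping: only handle (cn=...) equality filters.
    my @entries;
    if ($filter =~ /\(cn=([^)]+)\)/) {
        my $rows = $self->{dbh}->selectall_arrayref(
            'SELECT cn, sn, mail FROM person WHERE cn = ?',
            { Slice => {} }, $1);
        for my $r (@$rows) {
            push @entries, join("\n",
                "dn: cn=$r->{cn},$base",
                "objectClass: inetOrgPerson",
                "cn: $r->{cn}", "sn: $r->{sn}", "mail: $r->{mail}");
        }
    }
    return ( 0, @entries );   # 0 == LDAP_SUCCESS, then LDIF entries
}

1;
```

Even with this working, I'd still be hand-writing the filter-to-SQL translation per objectClass, which is the same mapping problem back-sql has, just relocated into Perl.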
Finally, I know that life just isn't that easy and at the end of the day
there's just going to be all kinds of data in all kinds of places.