
OpenLDAP system architecture?



Folks,

I'm going through the documentation at
<http://www.openldap.org/doc/admin24/>, the OpenLDAP FAQ-o-Matic at
<http://www.openldap.org/faq/data/cache/1.html>, and the archives of the
various OpenLDAP mailing lists, but I have not yet found anything that
discusses how one might architect a large-scale OpenLDAP system with
multiple masters, multiple slaves, etc. for best performance and low
latency.

In our case, we're running OpenLDAP 2.3.something (a few versions
behind the latest official stable release), and we've recently hit our
four millionth object, at a large university with roughly 48,000
students, 2,700 faculty, and 19,000 employees.  We're now running into
performance issues that will keep us from rolling out some other large
projects until we can get them resolved.


I do not yet understand a great deal about how our existing OpenLDAP
systems are designed, but I am curious what kinds of recommendations
you folks would have for a large-scale system like this.

In the far, dark, distant past, I know that OpenLDAP did not handle
mixed read and update loads on the same server very well, so the
recommendation at the time was to make all updates on the master
server and then replicate them out to the slaves, where all the read
operations would occur.  You could even go so far as to set up slaves
on pretty much every major client machine, for maximum distribution
and replication of the data and maximum scalability of the overall
LDAP system.
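
Just so I'm describing the same thing everyone else is, here's roughly
the sort of read-only syncrepl consumer I have in mind for such a
slave (the hostnames, suffix, and credentials below are invented
placeholders, and the master would carry the matching "overlay
syncprov"):

    # slapd.conf on a read-only slave, pulling from the master
    database        bdb
    suffix          "dc=example,dc=edu"
    rootdn          "cn=manager,dc=example,dc=edu"
    directory       /var/lib/ldap

    syncrepl        rid=001
                    provider=ldap://master.example.edu
                    type=refreshAndPersist
                    searchbase="dc=example,dc=edu"
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=edu"
                    credentials=secret
                    retry="30 10 300 +"

    # refer write attempts back to the master
    updateref       ldap://master.example.edu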

I know that modern versions of OpenLDAP handle a mix of updates and
reads much better, so the old-style architecture is not as necessary
as it once was.  But for a large-scale system like ours, wouldn't it
still be wise to use the old-style architecture for maximum
performance and scalability?

If you did use a multi-master cluster pair that handled all the
updates and all the LDAP queries, what kind of performance should one
reasonably expect from the latest 2.4.whatever release on high-end
hardware, and what kind of hardware would you consider "high-end" for
that environment?  Is CPU more important, or RAM, or disk
space/latency?
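
For concreteness, my (possibly naive) understanding is that such a 2.4
multi-master pair would be two mirror-mode nodes along these lines,
each pointing its syncrepl consumer at the other node (again, the
hostnames, suffix, and credentials are made up):

    # slapd.conf fragment for node 1 of a two-node mirror-mode pair;
    # node 2 would be identical except for serverID and provider
    serverID        1

    database        hdb
    suffix          "dc=example,dc=edu"
    rootdn          "cn=manager,dc=example,dc=edu"
    directory       /var/lib/ldap

    overlay         syncprov
    syncprov-checkpoint 100 10

    syncrepl        rid=001
                    provider=ldap://ldap2.example.edu
                    type=refreshAndPersist
                    searchbase="dc=example,dc=edu"
                    bindmethod=simple
                    binddn="cn=replicator,dc=example,dc=edu"
                    credentials=secret
                    retry="5 5 300 +"
    mirrormode      TRUE

Please correct me if that's not the configuration people actually run
in production.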

Alternatively, if you went to a three-level master(s)->proxies->slaves
architecture [0], what kind of performance would you expect, and how
many machines would you expect it to scale to?  Are there any other
major issues to be concerned about with that kind of architecture,
such as the latency of updates getting pushed out to the leaf-node
slaves?
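
(I'm guessing the middle "proxy" tier there would be something like a
back-ldap instance holding no local data and just forwarding to the
masters, roughly like this with invented hostnames:

    # slapd.conf fragment for a middle-tier proxy
    database        ldap
    suffix          "dc=example,dc=edu"
    uri             "ldap://master1.example.edu ldap://master2.example.edu"

but please correct me if the proxied setup is configured differently.)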

How about the ultimate maximum distribution scenario, where you put an
LDAP slave on virtually every major LDAP client machine?
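
(That is, every major client box would run its own consumer slapd, and
the local tools would simply point at it via something like the
following in /etc/openldap/ldap.conf, with an invented base DN, so
that all reads stay on the local machine:

    URI     ldap://localhost
    BASE    dc=example,dc=edu

I assume the per-host consumer would otherwise look like the slave
sketch earlier in this message.)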


Any and all advice would be appreciated, and in particular I'd be
grateful for pointers to documentation, FAQs, or mailing list archives
where I can read more.

Thanks!

[0] Is this test045, as I believe is mentioned at
<http://www.openldap.org/lists/openldap-software/200707/msg00320.html>?

-- 
Brad Knowles <b.knowles@its.utexas.edu>
Sr. System Administrator, UT Austin ITS-Unix
COM 24 | 5-9342