
Re: Openldap scalability issues

--On Monday, February 21, 2005 8:59 AM -0600 "Armbrust, Daniel C." <Armbrust.Daniel@mayo.edu> wrote:

I'm looking for some configuration advice to make openldap scale better
for databases that hold millions of records.

In the past we have been loading most of our large databases from ldif
with slapadd.  We have been successful in loading databases with 10 to 15
million entries.  The largest ones usually take a day or two - but we can
deal with that.  However, now I am trying to load a large database
programmatically, from a java application.

If I start with a blank, clean database - here is the behavior I see.

The first ~200,000 entries or so (times about 5 attributes per entry)
take about 2 minutes per 10,000 entries.

Then it starts getting slower.  By the time I get to entry 300,000 - it
is taking ~45 minutes per 10,000 entries.

Then it starts jumping around - sometimes taking a 1/2 hour per 10,000,
sometimes 3 hours per 10,000.

Finally, usually somewhere around 600,000 concepts, it just kicks the
client out with some odd error message - it changes from run to run.  The
most recent example was:

javax.naming.NamingException: [LDAP: error code 80 - entry store failed]

After this happened - I could no longer connect to the ldap server with
any client - the server actually crashed when I tried to connect to it.
But then when I restart the server, it seems to run fine, at least for
browsing.  I currently don't have the ability to resume my inserts - so I
haven't been able to tell if the insert speed would be good again after a
restart, or if it would stay as slow as it was at the end.

The database size was only 3.6 GB when it failed - I have loaded 16 GB
databases with slapadd before.  Why does this work in slapadd - but not
when it is done with slapd?

Daniel,

I've done extensive testing over the last month loading multi-million entry databases into OpenLDAP. The rate at which you can load a database is always limited by the amount of cache available via DB_CONFIG. What this essentially means is that you're exceeding the cache you supplied to OpenLDAP somewhere between 200k and 300k entries. After that, everything starts thrashing.
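
For illustration (the numbers below are placeholders, not a recommendation for your data set), a DB_CONFIG in the back-bdb database directory might look something like this; slapd and slapadd read it when they open the BDB environment:

    # DB_CONFIG -- values are illustrative only; size the cache so the
    # id2entry/dn2id databases plus all index files fit in memory
    set_cachesize    2 0 1       # 2 GB BDB cache in a single segment
    set_lg_regionmax 262144      # log region size
    set_lg_bsize     2097152     # in-memory transaction log buffer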

In OpenLDAP 2.3, there is a new quick (-q) option to slapadd that allows you to load large DBs much more quickly, but it too is bound by the DB cache. Essentially, you must *always* provide an adequate amount of cache to the backend database if you want to be able to load it quickly. How much cache is needed depends in large part on how much indexing you do: the more indexing, the more cache. For me, with 20 million entries and 3 attributes indexed eq, I estimated it would take 42 GB of memory to keep BDB from having to swap the cache. Somewhat prohibitive. ;)
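
As a rough sketch of the pieces involved (file paths and indexed attributes here are only placeholders):

    # slapd.conf (back-bdb) -- every eq index adds to the cache needed
    index   objectClass     eq
    index   cn,sn,uid       eq

    # offline bulk load; -q skips some consistency checking, so only use
    # it for an initial load you can redo from the LDIF if it fails
    slapadd -q -f /etc/openldap/slapd.conf -l data.ldif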

You will never get the efficiency of slapadd from your java application: because slapd isn't running, slapadd can perform manipulations that an online client never can. My guess is that in your case slapd is using up all available memory and then failing once it is exhausted.

Note that the latest CDS release, based on OpenLDAP 2.2, already has the quick option available.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>