Re: (ITS#3564) Programatic Insert Scaleability Problem
>With signifigant indexing enabled, openldap seems to be incapable of scaleing up
>to even moderate sized databases - when you are loading a running server (rather
>than an offline bulk insert)
Bulk loading of a running server is not a case that slapd is specifically
optimized for; slapadd is the intended tool for bulk loads.
>If I start with a blank, clean database - here is the behavior I see.
>The first ~200,000 entries or so (times about 5 attributes per entry) take about
>2 minutes per 10,000 entries.
>Then it starts getting slower. By the time I get to entry 300,000 - it is
>taking ~45 minutes per 10,000 entries.
>Then it starts jumping around - sometimes taking a 1/2 hour per 10,000 sometimes
>3 hours per 10,000.
>Finally, usually somewhere around 600,000 concepts, it just kicks the client out
>with some odd error message - it changes from run to run. The most recent was:
>javax.naming.NamingException: [LDAP: error code 80 - entry store failed]
>After this happened - I could no longer connect to the ldap server with any
>client - the server actually crashed when I tried to connect to it. But then
>when I restart the server, it seems to run fine, at least for browsing. I
>currently don't have the ability to resume my inserts - so I haven't been able
>to tell if the insert speed would be good again after a restart, or if it would
>stay as slow as it was at the end.
Of course the server should never crash... We need to see a backtrace
from the crash to understand what has failed. You should probably also
run with debug level 0; there may be diagnostic messages from the
BerkeleyDB library as well that you're not seeing at the moment.
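One way to capture that backtrace is to run slapd in the foreground under
gdb. A rough sketch of the session (the binary and config paths here are
examples; adjust them for your installation):

```
$ gdb /usr/local/libexec/slapd
(gdb) run -d 0 -f /usr/local/etc/openldap/slapd.conf
  ... reproduce the failing load from your client ...
(gdb) bt        # print the stack backtrace after the crash
```

Running with -d 0 keeps slapd in the foreground so gdb retains control of
the process, and any BerkeleyDB diagnostics will appear on the terminal.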
>The database size was only 3.6 GB when it failed - I have loaded 16 GB databases
>with slapadd before. Why does this work in slapadd - but not when it is done
>against a running server?
At a guess, there may be a memory leak somewhere. But there's too little
information about the crash here to tell.
>set_cachesize 1 0 1
If that's all you have in your DB_CONFIG, then you're definitely
limiting yourself. Remember that you MUST store the transaction logs on
a separate physical disk from your database files to get reasonable
write performance. This is stated several times in the Sleepycat
documentation.
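For reference, set_cachesize takes three arguments (gbytes, bytes, ncache),
so "set_cachesize 1 0 1" requests a 1 GB cache in a single segment. A
DB_CONFIG that enlarges the cache and moves the logs off the database disk
might look like the following; the sizes and the log directory path are
illustrative, not recommendations:

```
# DB_CONFIG - example values only, tune for your hardware
set_cachesize 2 0 1         # 2 GB BDB cache (gbytes bytes ncache)
set_lg_dir /logs/bdb        # transaction logs on a separate physical disk
set_lg_bsize 2097152        # 2 MB in-memory log buffer
```

Note that changing set_cachesize only takes effect after the environment
is recreated (e.g., by running db_recover).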
>It appears that the performance tanked when the database grew to about twice the
>size of the cache
Most likely because your logs are not configured properly.
>- and the whole thing crashed when it got to 3 times the size
>of the cache.
This is a problem that needs investigation.
>What is the point of a database, if it is limited to only holding twice the
>amount of what you can put into RAM?
We've built much larger databases fairly often. Once you exceed the size
of the RAM cache, speeds are obviously limited by your disk throughput.
Using more, faster disks is the only solution when you don't have any
RAM left to throw at the situation. Keeping the transaction logs
separate from the database disks will make a big difference in these cases.
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support