(ITS#3564) Programmatic Insert Scalability Problem
Full_Name: Dan Armbrust
Version: 2.2.23
OS: Fedora Core 3
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.176.151.126)
With significant indexing enabled, OpenLDAP seems incapable of scaling to even
moderately sized databases when entries are added to a running server (rather
than via an offline bulk insert).
If I start with a blank, clean database, here is the behavior I see.
The first ~200,000 entries or so (times about 5 attributes per entry) take about
2 minutes per 10,000 entries.
Then it starts getting slower. By the time I get to entry 300,000, it is
taking ~45 minutes per 10,000 entries.
Then it starts jumping around, sometimes taking half an hour per 10,000,
sometimes 3 hours per 10,000.
Finally, usually somewhere around 600,000 concepts, it kicks the client out
with an odd error message that changes from run to run. The most recent
example was:
javax.naming.NamingException: [LDAP: error code 80 - entry store failed]
After this happened, I could no longer connect to the LDAP server with any
client; the server actually crashed when I tried to connect to it. After a
restart, however, it seems to run fine, at least for browsing. I currently
don't have the ability to resume my inserts, so I haven't been able to tell
whether the insert speed would be good again after a restart, or whether it
would stay as slow as it was at the end.
The database was only 3.6 GB when it failed; I have loaded 16 GB databases
with slapadd before. Why does this work with slapadd, but not when the same
data is inserted through a running slapd?
Config info:
Openldap 2.2.23
BerkeleyDB 4.2.52(.2)
bdb backend
DB_CONFIG file:
set_flags DB_TXN_NOSYNC
set_flags DB_TXN_NOT_DURABLE
set_cachesize 1 0 1
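For reference, set_cachesize takes gigabytes, then bytes, then a segment
count, so the line above configures a 1 GB cache in a single segment. A
larger cache sized closer to the working set would look like the following
(the 4 GB figure is purely illustrative, not a tested recommendation):

```
# 4 GB BDB cache split into 4 segments; keeping each segment
# under 2 GB avoids per-segment size limits on some platforms
set_cachesize 4 0 4
```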
Potentially relevant bits from slapd.conf:
schemacheck on
idletimeout 14400
threads 150
sizelimit 6000
database bdb
suffix "service=test,dc=LexGrid,dc=org"
directory /localwork/ldap/database/dbdanTest
checkpoint 512 30
index objectClass eq
index conceptCode eq
index language pres,eq
index dc eq
index sourceConcept,targetConcept,association,presentationId eq
index text,entityDescription pres,eq,sub,subany
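The last index line is the expensive one: maintaining sub/subany indexes on
free-text attributes means every write fans out into many index keys. A naive
sketch of the fan-out (this enumerates raw substrings for illustration only;
slapd actually hashes fixed-length substring keys, and the 2-4 length bounds
here are assumptions, not slapd's configured values):

```python
def substring_keys(value, minlen=2, maxlen=4):
    """Naively enumerate the substring index keys one value contributes."""
    keys = set()
    for n in range(minlen, maxlen + 1):
        for i in range(len(value) - n + 1):
            keys.add(value[i:i + n])
    return keys

# A single 12-character word already fans out into dozens of keys, so a
# free-text attribute multiplies the index-write load of every entry.
print(len(substring_keys("hypertension")))
```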
You can see the schema here if you desire:
http://informatics.mayo.edu/index.php?page=102
It appears that performance tanked when the database grew to about twice the
size of the cache, and the whole thing crashed when it reached three times the
size of the cache.
What is the point of a database if it is limited to holding only twice the
amount you can fit in RAM?
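To make that ratio concrete, a quick back-of-the-envelope sketch using the
figures reported above (`cache_bytes` is just a helper name for this sketch,
not anything from BDB):

```python
def cache_bytes(gbytes, nbytes, ncache=1):
    """Total cache implied by a DB_CONFIG set_cachesize line."""
    return gbytes * 2**30 + nbytes

cache = cache_bytes(1, 0, 1)      # "set_cachesize 1 0 1" -> 1 GiB
db_size = 3.6 * 2**30             # database size at the point of failure
print(round(db_size / cache, 1))  # prints 3.6
```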