
(ITS#3564) Programmatic Insert Scalability Problem



Full_Name: Dan Armbrust
Version: 2.2.23
OS: Fedora Core 3
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (129.176.151.126)


With significant indexing enabled, OpenLDAP seems to be incapable of scaling up
to even moderately sized databases when you are loading a running server
(rather than doing an offline bulk insert).

If I start with a blank, clean database - here is the behavior I see.

The first ~200,000 entries or so (times about 5 attributes per entry) take about
2 minutes per 10,000 entries.

Then it starts getting slower.  By the time I get to entry 300,000, it is
taking ~45 minutes per 10,000 entries.

Then it starts jumping around - sometimes taking half an hour per 10,000,
sometimes 3 hours per 10,000.
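For perspective, the timings above work out to roughly these insert rates (a quick back-of-the-envelope sketch using only the figures reported here):

```python
# Rough throughput derived from the timings reported above.

def rate_per_sec(entries: int, minutes: float) -> float:
    """Entries inserted per second for a batch taking `minutes`."""
    return entries / (minutes * 60)

initial = rate_per_sec(10_000, 2)    # first ~200,000 entries
late = rate_per_sec(10_000, 45)      # around entry 300,000
worst = rate_per_sec(10_000, 180)    # worst of the erratic phase

print(f"initial: {initial:.1f}/s")         # ~83.3 entries/s
print(f"at 300,000: {late:.1f}/s")         # ~3.7 entries/s
print(f"worst: {worst:.1f}/s")             # ~0.9 entries/s
print(f"slowdown: {initial / late:.1f}x")  # ~22.5x
```

So by entry 300,000 the insert rate has already dropped by more than a factor of 20.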

Finally, usually somewhere around 600,000 concepts, it just kicks the client out
with an odd error message, which changes from run to run.  The most recent
example was:

javax.naming.NamingException: [LDAP: error code 80 - entry store failed]

After this happened, I could no longer connect to the LDAP server with any
client - the server actually crashed when I tried to connect to it.  But when
I restart the server, it seems to run fine, at least for browsing.  I
currently don't have the ability to resume my inserts, so I haven't been able
to tell whether the insert speed would be good again after a restart, or
whether it would stay as slow as it was at the end.

The database size was only 3.6 GB when it failed - I have loaded 16 GB databases
with slapadd before.  Why does this work with slapadd, but not when it is done
through slapd?

Config info:
Openldap 2.2.23
BerkeleyDB 4.2.52(.2)

bdb backend
DB_CONFIG file:
set_flags       DB_TXN_NOSYNC
set_flags       DB_TXN_NOT_DURABLE
set_cachesize   1       0       1
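For reference, set_cachesize takes three arguments (gbytes, bytes, ncache), so the line above requests a 1 GB BDB cache in a single region.  If more RAM were available, a larger cache would look like the following - the 4 GB figure split across two regions is purely illustrative, not a recommendation:

set_cachesize   4       0       2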


Possibly useful bits from the conf file:

schemacheck     on
idletimeout     14400
threads         150
sizelimit       6000

database        bdb
suffix          "service=test,dc=LexGrid,dc=org"
directory       /localwork/ldap/database/dbdanTest
checkpoint      512     30

index           objectClass eq
index           conceptCode eq
index           language pres,eq
index           dc eq
index           sourceConcept,targetConcept,association,presentationId eq
index           text,entityDescription pres,eq,sub,subany


You can see the schema here if you desire:
http://informatics.mayo.edu/index.php?page=102


It appears that performance tanked when the database grew to about twice the
size of the cache, and the whole thing crashed when it got to about three times
the size of the cache.
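A quick sanity check on those ratios, using the 1 GB cache from DB_CONFIG above (a sketch; the 2 GB slowdown point is inferred from the "twice the size of the cache" observation, and the author's "three times" rounds the reported 3.6 GB):

```python
cache_gb = 1.0        # set_cachesize 1 0 1 -> 1 GB BDB cache
slowdown_db_gb = 2.0  # approximate size when throughput collapsed
crash_db_gb = 3.6     # reported size at the time of the failure

print(f"DB/cache ratio at slowdown: {slowdown_db_gb / cache_gb:.1f}x")  # 2.0x
print(f"DB/cache ratio at crash: {crash_db_gb / cache_gb:.1f}x")        # 3.6x
```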

What is the point of a database if it is limited to holding only twice the
amount of what you can put into RAM?