
Re: why is adding records slow?



Howdy,
	
	For reference's sake, my indexing setup was:

index	objectClass,uid,uidNumber,gidNumber	eq
index	cn,mail,surname,givenname		eq,subinitial

	The ldif records I was adding contained cn, sn and objectClass
attributes (one record quoted below):

#entry 0
dn: cn=Person zero0, dc=ricky, dc=com
cn: Person zero0
objectclass: person
sn: zero0
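
A test file of such records could be generated with a short script. This is only a sketch: the file name `tenK.ldif` comes from the ldapadd command line quoted further down, but the `n<i>` naming scheme and the helper names `make_entry` and `make_ldif` are assumptions made for illustration, since the post shows only one sample record.

```python
def make_entry(i, suffix="dc=ricky, dc=com"):
    """Return one flat person entry in LDIF form, modeled on the record above."""
    name = f"n{i}"  # hypothetical naming scheme, not the poster's
    return (
        f"#entry {i}\n"
        f"dn: cn=Person {name}, {suffix}\n"
        f"cn: Person {name}\n"
        "objectclass: person\n"
        f"sn: {name}\n"
    )


def make_ldif(count):
    """Join `count` entries, separated by the blank line LDIF requires."""
    return "\n".join(make_entry(i) for i in range(count))


if __name__ == "__main__":
    # Write 10,000 entries to the file name used in the quoted command line.
    with open("tenK.ldif", "w") as f:
        f.write(make_ldif(10_000))
```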

	I'm an LDAP newbie and had not yet considered the effect of indexing on
my write performance. Thanks to you folks who pointed it out.

	So I commented out the indexing lines in slapd.conf, restarted slapd,
and ran a new 10K record add. It went about 3X faster: adding 10,000
records like the one above (very simple, very flat) took 2516 seconds as
opposed to 7692 seconds. That is an add rate of 3.97 records per second.
(In this thread I am not concerned with what a lack of indexing will do
to my read performance; I'm focused on write performance.)

	One person suggested that I try slapadd as opposed to ldapadd. slapadd
suggests/requires that slapd be off-line during the operation. A run
adding 10K records this way (with indexing commented out in my
slapd.conf; does that matter in this case?) took 40 seconds. This should
be highly diagnostic. The man page for slapadd says "slapadd may not
provide naming or schema check". Another response on this thread, from
Gabor, also points out that ldapadd-to-slapd communication involves
ASN.1 processing and network processing.
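
To put the three timings side by side, the arithmetic can be sketched as below. The run labels are descriptive names chosen here; the record count and the elapsed seconds are the figures reported in this message.

```python
RECORDS = 10_000

# Elapsed seconds for each 10K-record run, as reported in the message.
runs = {
    "ldapadd, indexing on": 7692,
    "ldapadd, indexing off": 2516,
    "slapadd, slapd off-line": 40,
}


def rate(seconds, records=RECORDS):
    """Records added per second for a run that took `seconds`."""
    return records / seconds


baseline = runs["ldapadd, indexing on"]
for label, seconds in runs.items():
    speedup = baseline / seconds
    print(f"{label}: {rate(seconds):.2f} records/second, "
          f"{speedup:.1f}x the indexed ldapadd run")
```

This works out to roughly 1.30, 3.97 and 250 records per second respectively.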

	Now, I am really reaching on this next bit. It's not like I've had a
look at the source or anything, I'm just tossing out a guess for those
who know more than me (that's about everyone) to shoot down....

	'What if' the "naming and schema checking" involves several network
round trips between the client and server, with several database reads
each time? If so, could the process be sped up with some type of
naming/schema caching on the client side? Or maybe the client could scan
the entire ldif file, construct a list of all the naming/schema checks
it will need, request the information for all of them in bulk, and then
come back for a second pass through the ldif for the actual record adds.
It seems on the surface that the client side may be the best target for
improving many-write speed. Any ideas on making an `ldapBulkAdd` client
run more efficiently?
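
The first pass of that idea, scanning the ldif for everything that would need checking, might look like the sketch below. It is purely illustrative: `collect_schema_names` is a hypothetical helper, it handles only simple `attr: value` lines (no base64 values, and folded lines are skipped), and whether a server would ever answer such checks in bulk remains the guess made above.

```python
def collect_schema_names(ldif_text):
    """First pass over LDIF text: gather distinct object classes and attribute names."""
    object_classes = set()
    attributes = set()
    for line in ldif_text.splitlines():
        # Skip blank separators, comments, and folded continuation lines.
        if not line.strip() or line.startswith("#") or line.startswith(" "):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name = name.strip().lower()
        if name == "dn":
            continue  # the dn names the entry; it is not an attribute here
        attributes.add(name)
        if name == "objectclass":
            object_classes.add(value.strip().lower())
    return object_classes, attributes


sample = """\
#entry 0
dn: cn=Person zero0, dc=ricky, dc=com
cn: Person zero0
objectclass: person
sn: zero0
"""

classes, attrs = collect_schema_names(sample)
print(sorted(classes))
print(sorted(attrs))
```

For a flat file like the test data, the whole 10K-record input collapses to one object class and three attribute names.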

	Just tossing out ideas (and they are probably baseless anyway).
	
-- 
  Ricky Charlet   : Redcreek Communications   : usa (510) 795-6903




Ricky Charlet wrote:
> 
> Ricky Charlet wrote:
> >
> > Howdy,
> >
> >         If anyone had tried using Berkeley DB as the underlying DB instead of
> > gdbm, did your writes get any faster?
> >
> >         Apart from that, where can I find a list of tips and tricks to make my
> > writes get faster?
> >
> > Thanks in advance...
> >
> > --
> >   Ricky Charlet   : Redcreek Communications   : usa (510) 795-6903
> 
> Howdy,
> 
>         OK, I know that LDAP's design goal was to optimize reading/browsing
> records, not adding/modifying records. But can someone tell me the
> fundamental reasons that adding records takes such a long time?
> 
>         Even though I knew that record adding was not optimized, I am really
> very surprised at how slow it is. I ran a simple test of adding 10,000
> person records to a database. After 30 minutes it was about 1/4
> finished. According to my measurements it was adding records at about
> 1.3/second. That is plainly unworkable at my intended scale.
> 
>         This test is really a great deal simpler than what I really want to
> accomplish, though similar in scale. Here are the records I was adding:
> 
> #entry 1
> dn: cn=Person A1, dc=ricky, dc=com
> cn: Person A1
> objectclass: person
> sn: A1
> 
> #entry 2
> dn: cn=Person A2, dc=ricky, dc=com
> cn: Person A2
> objectclass: person
> sn: A2
> 
> and so on up to "cn: Person 10000"
> 
> The command line I used was `ldapadd -x -D "cn=Manager,dc=ricky,dc=com"
> -f tenK.ldif -w *******`
> 
> --
>   Ricky Charlet   : Redcreek Communications   : usa (510) 795-6903
