[Date Prev][Date Next]
RE: slapadd(8), database order
> -----Original Message-----
> From: Kurt D. Zeilenga [mailto:Kurt@OpenLDAP.org]
> There are two obvious ways to "reverse the DN"...
> one is to just write it backwards (uid=kdz,dc=example,dc=com)
> becomes (moc=cd,elpmaxe=cd,zdk=diu) the other is to reverse
> the SEQUENCE of RDNs (dc=com,dc=example,uid=kdz). Both
> should provide good better "locality" using default btree
> ordering routines, but generation costs are different.
> Likely the former is cheaper to generate.
Right. I thought about just doing the former but actually tested with the
latter. Maybe I went wrong there, as a simple string reverse requires one
less traversal through the string than resequencing the RDNs.
> One could use
> DN->id, but write a ordering function which worked backwards.
I hadn't considered that, but it probably would have been easier.
> Anways, give a btree with such ordering, one can optimize
> search by 1) starting at the baseObject and 2) stopping
> on first candidate which is not under the baseObject.
Yes, that's what I had in mind. A similar approach would work for OneLevel as
well. But then I realized that this is how back-hdb already works, and it
might make sense to just finish back-hdb instead.
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support
> At 03:41 PM 2002-09-09, Howard Chu wrote:
> >> -----Original Message-----
> >> From: Kurt D. Zeilenga [mailto:Kurt@OpenLDAP.org]
> >> Sent: Monday, September 09, 2002 11:16 AM
> >> To: Howard Chu
> >> Cc: openldap-devel@OpenLDAP.org
> >> Subject: Re: slapadd(8), database order
> >> Maybe the thing to do is:
> >> a) have back-bdb check for existance of parents
> >> b) provide an LDIF ordering tool
> >> or, provide a ID defragmentation tool :-).
> >sigh... While all of that was tumbling thru my head I did an
> experiment with
> >the dn2id database. The results were poorer than I expected but I
> thought I'd
> >pass along the highlights.
> >I noticed that dn2id is maintained using prefixes on each DN.
> This separates
> >entries that are related to each other, i.e. we really aren't taking
> >advantage of any locality of reference in the B-tree. My first
> thought was to
> >just turn the prefix into a suffix. This way, the base, subtree,
> and onelevel
> >index slots for an entry will always be created next to each other in the
> >database. But still, entries that are adjacent in the DIT will be
> far apart
> >because the LDAP DN order doesn't logically sort entries
> together. So I wrote
> >some code to flip the rdn order around for the dn2id keys. I also
> changed the
> >Prefix characters to a set that sorts in ascending order. (base=!, sub=$,
> >one=* - this is the order we create the keys in the code, and the
> >are ascending in both ASCII and EBCDIC.)
> >Statistics from slapadd showed that this approach was slower than
> the current
> >approach. I was quite surprised. db_stat showed that the number of page
> >requests satisfied by the cache was higher; for a 450,000 entry
> database the
> >percentage of hits with a 2MB cache went from 88% to 100%, so I sort of
> >accomplished what I intended, but also the actual number of page requests
> >increased, so the amount of work being done overall was higher.
> >The "time" results for the current code:
> >691.720u 118.380s 18:09.39 74.3% 0+0k 0+0io 1628pf+0w
> >"time" for the experimental code:
> >727.810u 90.760s 19:07.57 71.3% 0+0k 0+0io 1630pf+0w
> >The better cache use saved 27.6 seconds of kernel/IO time, but cost 36
> >seconds of user time. I guess this approach causes more page splits, but
> >don't really know.
> >It occurs to me that this dn2id key layout may still be useful;
> it makes an
> >explicit DN_SUBTREE index unnecessary.
> > -- Howard Chu
> > Chief Architect, Symas Corp. Director, Highland Sun
> > http://www.symas.com http://highlandsun.com/hyc
> > Symas: Premier OpenSource Development and Support