[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: BDB entry cache submission

You wrote:
> > -----Original Message-----
> > From: owner-openldap-devel@OpenLDAP.org
> > [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Marijn Meijles
> > I looked at the cvs and this will be deadlock galore for reasons
> > I've outlined some time ago. Have fun hunting ;)
> I'm quite certain that this code will undergo a great deal of revision.
> All I said was that it works under a particular set of known circumstances...
Just trying to help.

> Meanwhile, now that we're all back from Christmas break and recovered from
> all the New Years resolutions, how goes your super-duper new backend design?
All right. Although I'm not quite finished with it. During christmas I like
totally had it with ldap ;) We reimplemented the cache and lru (highly
concurrent avl and a bubble lru) and it's not our main job to hack ldap,
unfortunately. (Next issue we'll address will probably be replication)
However, this is a list of the hard stuff I'm trying to put into it:
- constant time
	-tree context ops (parent, children stuff)
	-startup time

My current 'direction' involves labeling the components of a dn each with
an unique id. By picking the right way to construct the id AND employing
the sorting order of a btree, you can do all sorts of cool things but
I have not yet decided which cool things to pick ;)

> I gave some thought to your complaint about back-hdb cheating by reading the
> entire tree into RAM. I was almost motivated enough to rewrite it into a more
> traditional tree structure (nodes point to children instead of nodes point to
> parent). This would allow navigating the DIT without being forced to read it
> all in at once. It could allow servers with relatively small amounts of memory
> to handle inordinately large databases. But ultimately, there's not much to be
> gained here. If your database is too big relative to your working RAM, the
> overhead of the internal Btree pages will still prevent your small server from
> effectively servicing requests. Rather than continually incur the overhead of
> traversing both the dn2id Btree pages and the id2entry pages, it's still more
> efficient to traverse the entire dn2parent database a single time and then
> manage that independently, so that the BDB library really just has to deal with
> the id2entry traversals.
One of my greatest objections against this scheme is a very practical one.
If you run an important database you want to minimize downtime. In my
previous job I was also the administrator of openldap databases and in case
of deep shit (hanging raid boxes, murphy stuff ;) ) you want the database
to be available the minute you fire it up, believe me. Look at filesystem
developments for example, stuff like softupdates and journaling all minimize
downtime. Although you could dump your 'tree-cache' to disk to enhance startup time,
it'll be invalid at the time you need it the most (after a crash).

Your argument above seems a bit strange, I think I miss your point. As I see
it, I have to do one more lookup per dn (dn2id), which you have in memory.
However, this is greatly alleviated by the entry cache. 
You seem to be saying: 'if your memory is too small, you have to get memory
anyway, so you might as well buy some extra' ;)

For a minute, lat's talk principles.  By 'caching' info about ALL nodes,
you waste a lot more space than when you employ a fixed size on-demand
cache. To me, it would be better to design a system that uses memory to
speed up frequently used stuff and fill it on-demand as this uses your
memory more efficient and will ultimately be faster.

But please enlighten me with some counter arguments!

This text is ROT-52 encrypted