mmap'd DB
- To: OpenLDAP-devel@openldap.org
- Subject: mmap'd DB
- From: Howard Chu <hyc@symas.com>
- Date: Wed, 22 Jun 2011 18:26:51 -0700
The main goal of mdb is to eliminate extra copies of data in various caches. A
secondary goal is simplifying config (by eliminating extra caches, there's
nothing left to tune/configure). One of the key requirements for this to work
is to make the on-disk Entry format identical to the in-memory format. Since
the in-memory format is based on pointers, we seem to need a way to nail down
the memory map to a fixed address that is constant from run to run. (Just
recapping...)
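
To make the fixed-address requirement concrete, here is a minimal sketch of one way to pin a mapping, assuming a POSIX mmap() and a 64-bit address space. MAP_BASE and map_at() are hypothetical names, and the base address is an arbitrary pick, not something mdb defines. The address is passed as a hint and then checked, rather than forced with MAP_FIXED, so an existing mapping is never silently replaced.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical base address (assumes a 64-bit address space). A real
 * choice depends on the platform's address-space layout. */
#define MAP_BASE ((void *)(uintptr_t)0x200000000000ULL)

/* Map len bytes at exactly `want`. With fd < 0 the mapping is anonymous,
 * otherwise it is a shared mapping of fd from offset 0. Returns NULL if
 * the kernel cannot place the mapping at the requested address. */
static void *map_at(void *want, size_t len, int fd)
{
    assert(len > 0);
    int flags = (fd < 0) ? (MAP_PRIVATE | MAP_ANONYMOUS) : MAP_SHARED;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (got == MAP_FAILED)
        return NULL;
    if (got != want) {
        /* Pointers stored in the map would be wrong here, so give up. */
        munmap(got, len);
        return NULL;
    }
    return got;
}
```

If map_at() fails, every pointer previously stored in the file is meaningless for this run, so the caller has to treat that as a hard error.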
The tricky part is that Entries contain AttributeDescriptions and these
obviously depend on whatever schema is loaded on a particular run. If you take
no special steps, then adding or deleting schema elements from run to run will
invalidate their addresses, and make the on-disk pointers useless.
I had a notion to replace AttributeDescription pointers with indices into an
array. The array would be persisted into a file, and would only ever grow;
items would never be deleted from it. Then Entries could reference the array
indices instead of the pointers. A scheme like this could work, but the sheer
volume of code that directly (de-)references AD pointers is pretty huge. Not
sure if it's the right path to take.
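
For illustration, the index idea might look roughly like the sketch below. struct ad_table, ad_index(), struct attr_ref and AD_MAX are hypothetical names, not existing slapd code, and persisting the table to a file is left out. Real code would also compare attribute names case-insensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AD_MAX 1024   /* hypothetical capacity */

/* Append-only table of attribute description names. A slot, once
 * assigned, is never reused or removed, so an index stored in an
 * on-disk Entry stays meaningful from run to run. */
struct ad_table {
    size_t      count;
    const char *name[AD_MAX];
};

/* Return the index for name, appending it if it is new.
 * Returns -1 when the table is full. */
static int ad_index(struct ad_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->name[i], name) == 0)
            return (int)i;
    if (t->count == AD_MAX)
        return -1;
    t->name[t->count] = name;
    return (int)t->count++;
}

/* An Entry attribute would then carry an index instead of a pointer. */
struct attr_ref {
    int ad_idx;
};
```

Every place that currently dereferences an AD pointer would need an extra lookup through the table, which is where the code churn comes from.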
Another option was to set up a dedicated ad_alloc() function that carves up its
own mmap'd file. Then AD pointers will always be constant from run to run. The
trouble here is how to configure the path to this file; all of the hard-coded
schema is generated before we ever get to reading the config.
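
A rough sketch of what such an allocator could look like, assuming the file is already mapped at a fixed address and starts out zero-filled. The header layout and ad_arena_init() are hypothetical, and ad_alloc() takes the mapping base as an argument here only to keep the example self-contained. It never frees, so addresses handed out earlier stay valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header kept at the start of the mapped file so the allocation
 * cursor survives restarts. The layout is hypothetical. */
struct ad_arena {
    size_t used;   /* bytes handed out so far, header included */
    size_t size;   /* total bytes in the mapping */
};

/* Prepare a mapping for use. A zero-filled (new) file is initialized;
 * a file that already has a cursor is left untouched. */
static void ad_arena_init(void *base, size_t size)
{
    struct ad_arena *a = base;
    assert(size >= sizeof(*a));
    if (a->used == 0) {
        a->used = sizeof(*a);
        a->size = size;
    }
}

/* Carve len bytes out of the arena, pointer-aligned.
 * Returns NULL when the arena is full. */
static void *ad_alloc(void *base, size_t len)
{
    struct ad_arena *a = base;
    size_t align = sizeof(void *);
    size_t off = (a->used + align - 1) & ~(align - 1);
    if (off > a->size || len > a->size - off)
        return NULL;
    a->used = off + len;
    return (char *)base + off;
}
```

Because the cursor lives in the file itself, reopening the same file continues allocating after the existing ADs instead of overwriting them.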
A possible approach is to first allocate the AD pointers in an anonymous
mmap'd region, then once the config file is read we can open the configured
file and pull from there. For this to work we have to reserve a fixed-size
chunk of memory for the hard-coded schema, so that if we add/delete other
hard-coded schema elements down the road, the map remains useful. All in all
it seems extremely fragile and a major pain for compatibility between revisions.
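
The reserved-chunk part can be sketched as two cursors over one region, as below. AD_RESERVED, struct ad_region and the two allocator names are hypothetical, alignment is ignored for brevity, and the switch from the anonymous region to the configured file is not shown.

```c
#include <assert.h>
#include <stddef.h>

#define AD_RESERVED 4096   /* hypothetical budget for hard-coded schema */

struct ad_region {
    char  *base;        /* start of the mapping */
    size_t size;        /* total bytes, at least AD_RESERVED */
    size_t core_used;   /* cursor within [0, AD_RESERVED) */
    size_t conf_used;   /* cursor within [AD_RESERVED, size) */
};

/* Allocate for hard-coded schema, inside the reserved chunk only. */
static void *ad_alloc_core(struct ad_region *r, size_t len)
{
    assert(r->size >= AD_RESERVED);
    if (len > AD_RESERVED - r->core_used)
        return NULL;   /* reserve exhausted: the layout would break */
    void *p = r->base + r->core_used;
    r->core_used += len;
    return p;
}

/* Allocate for schema loaded from config, after the reserved chunk. */
static void *ad_alloc_conf(struct ad_region *r, size_t len)
{
    assert(r->size >= AD_RESERVED);
    if (len > r->size - AD_RESERVED - r->conf_used)
        return NULL;
    void *p = r->base + AD_RESERVED + r->conf_used;
    r->conf_used += len;
    return p;
}
```

Configured schema always starts at base + AD_RESERVED, so adding hard-coded elements in a later revision does not move it. The fragility shows up as soon as the hard-coded set outgrows the reserve, or an element inside it is removed or reordered.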
Another possibility is to collect all the schema elements at build time, and
generate a .c file containing a seed AD map which is hard-coded (and again, is
only allowed to grow, no elements modified in place or deleted) so that the
core AD map is always the same from run to run. The more I think of it, the
more I think this will work. It's essentially the same solution we used to get
rid of the ucdata-path item...
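
As a sketch only, the generated file might contain something like the table below. The array name, the lookup helper and the four sample attributes are illustrative assumptions, not the output of any existing build step. The one rule is that rows are only ever appended.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output of a build-time generator. New names are only
 * appended; existing rows are never edited, reordered, or removed,
 * so each slot number is stable across builds and runs. */
static const char *const ad_seed[] = {
    "objectClass",    /* 0 */
    "cn",             /* 1 */
    "sn",             /* 2 */
    "userPassword",   /* 3 */
};
#define AD_SEED_COUNT (sizeof(ad_seed) / sizeof(ad_seed[0]))

/* Look up the stable slot of a seeded name, or -1 if it is absent. */
static int ad_seed_lookup(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < AD_SEED_COUNT; i++)
        if (strcmp(ad_seed[i], name) == 0)
            return (int)i;
    return -1;
}
```

Since the table is compiled in, it is available before any config is read, which sidesteps the path-configuration problem from the ad_alloc() approach.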
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/