
mmap'd DB



The main goal of mdb is to eliminate extra copies of data in various caches. A secondary goal is simplifying config (by eliminating extra caches, there's nothing left to tune/configure). One of the key requirements for this to work is to make the on-disk Entry format identical to the in-memory format. Since the in-memory format is based on pointers, we seem to need a way to nail down the memory map to a fixed address that is constant from run to run. (Just re-capping...)

The tricky part is that Entries contain AttributeDescriptions and these obviously depend on whatever schema is loaded on a particular run. If you take no special steps, then adding or deleting schema elements from run to run will invalidate their addresses, and make the on-disk pointers useless.

I had a notion to replace AttributeDescription pointers with indices into an array. The array would be persisted into a file, and would only ever grow; items would never be deleted from it. Then Entries could reference the array indices instead of the pointers. A scheme like this could work, but the sheer volume of code that directly (de-)references AD pointers is pretty huge. Not sure if it's the right path to take.
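To make the index idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the names ad_table, ad_index, ad_table_save and ad_table_load, the fixed capacity, and the flat file layout are all hypothetical, and the lookup uses a plain case-sensitive linear scan rather than whatever slapd would really want. The only property it tries to show is that an index, once handed out, never changes or disappears.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define AD_MAX     256  /* hypothetical capacity */
#define AD_NAMELEN 64   /* hypothetical max name length, including NUL */

/* Hypothetical grow-only table: slot i holds the name of AD number i. */
typedef struct ad_table {
    unsigned count;
    char names[AD_MAX][AD_NAMELEN];
} ad_table;

/* Return the stable index for name, appending it if it is new.
 * Returns -1 if the table is full or the name is too long.
 * There is deliberately no way to delete or reorder slots. */
int ad_index(ad_table *t, const char *name)
{
    unsigned i;

    assert(t != NULL && name != NULL);
    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return (int)i;
    if (t->count == AD_MAX || strlen(name) >= AD_NAMELEN)
        return -1;
    strcpy(t->names[t->count], name);
    return (int)t->count++;
}

/* Write the whole table to path; returns 0 on success, -1 on error. */
int ad_table_save(const ad_table *t, const char *path)
{
    FILE *f = fopen(path, "wb");
    int ok;

    if (f == NULL)
        return -1;
    ok = fwrite(t, sizeof(*t), 1, f) == 1;
    return (fclose(f) == 0 && ok) ? 0 : -1;
}

/* Read a table previously written by ad_table_save(); 0 on success. */
int ad_table_load(ad_table *t, const char *path)
{
    FILE *f = fopen(path, "rb");
    int ok;

    if (f == NULL)
        return -1;
    ok = fread(t, sizeof(*t), 1, f) == 1 && t->count <= AD_MAX;
    fclose(f);
    return ok ? 0 : -1;
}
```

An Entry would then store the small integer returned by ad_index() where it stores an AD pointer today, which is exactly the change that touches so much existing code.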

Another option was to set up a dedicated ad_alloc() function that carves up its own mmap'd file. Then AD pointers will always be constant from run to run. The trouble here is how to configure the path to this file; all of the hard-coded schema is generated before we ever get to reading the config.
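A rough sketch of what such an allocator might look like, as a simple bump allocator over a shared file mapping. Only the name ad_alloc() comes from the text above; the ad_arena header, ad_arena_open(), and the idea of passing the desired base address as a parameter are assumptions made for the example. The address is passed to mmap() as a hint without MAP_FIXED, and the sketch simply fails if the kernel places the map elsewhere. With a NULL base the kernel picks the address, so the pointers would not be stable across runs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical header stored at the start of the mapped file. */
typedef struct ad_arena {
    size_t size;   /* total bytes in the file */
    size_t used;   /* bytes handed out so far, header included */
} ad_arena;

/* Map path at base (or anywhere if base is NULL). Returns NULL on failure,
 * including when the map cannot be placed at the requested address. */
ad_arena *ad_arena_open(const char *path, size_t size, void *base)
{
    ad_arena *a;
    void *p;
    int fd;

    assert(size >= sizeof(ad_arena));
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }
    p = mmap(base, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;
    if (base != NULL && p != base) {
        munmap(p, size);            /* wrong address: stored pointers would be stale */
        return NULL;
    }
    a = p;
    if (a->used == 0) {             /* fresh file: ftruncate() zero-filled it */
        a->size = size;
        a->used = sizeof(*a);
    } else if (a->size != size) {   /* existing file created with another size */
        munmap(p, size);
        return NULL;
    }
    return a;
}

/* Bump-allocate len bytes, rounded up to a multiple of 16.
 * Nothing is ever freed, so earlier allocations never move. */
void *ad_alloc(ad_arena *a, size_t len)
{
    void *p;

    len = (len + 15) & ~(size_t)15;
    if (len > a->size - a->used)
        return NULL;
    p = (char *)a + a->used;
    a->used += len;
    return p;
}
```

Because the allocation cursor lives in the file itself, a later run that maps the same file at the same base continues where the previous run stopped, and the objects allocated earlier are found at their old addresses.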

A possible approach is to first allocate the AD pointers in an anonymous mmap'd region, then once the config file is read we can open the configured file and pull from there. For this to work we have to reserve a fixed size chunk of memory for the hard-coded schema, so that if we add/delete other hard-coded schema elements down the road, the map remains useful. All in all it seems extremely fragile and a major pain for compatibility between revisions.

Another possibility is to collect all the schema elements at build time, and generate a .c file containing a seed AD map which is hard-coded (and again, is only allowed to grow, no elements modified in place or deleted) so that the core AD map is always the same from run to run. The more I think of it, the more I think this will work. It's essentially the same solution we used to get rid of the ucdata-path item...
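For illustration, the generated file might look something like the sketch below. The array name ad_seed_map, the helper ad_seed_slot(), and the particular attributes and their order are all made up for the example; a real generator would emit whatever the build actually collects, and might store full AD structures rather than bare names. The point is only that the slot numbers are fixed by position and that new elements go at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output of a build-time generator. The list is append-only:
 * a new hard-coded schema element goes at the end, and existing slots are
 * never edited, reordered, or removed. */
static const char *const ad_seed_map[] = {
    "objectClass",   /* slot 0 */
    "cn",            /* slot 1 */
    "sn",            /* slot 2 */
    "userPassword",  /* slot 3 */
    /* new hard-coded elements are appended below this line */
};

#define AD_SEED_COUNT (sizeof(ad_seed_map) / sizeof(ad_seed_map[0]))

/* Return the fixed slot of a hard-coded AD name, or -1 if it is not seeded.
 * Case-sensitive comparison is used here only to keep the sketch short. */
int ad_seed_slot(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < AD_SEED_COUNT; i++)
        if (strcmp(ad_seed_map[i], name) == 0)
            return (int)i;
    return -1;
}
```

Since the table is compiled in, it is available before any config file is read, which avoids the path-configuration problem of the mmap'd-file variant.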
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/