RE: Sleepycat and hash functions
> -----Original Message-----
> From: owner-openldap-software@OpenLDAP.org
> [mailto:owner-openldap-software@OpenLDAP.org]On Behalf Of Kurt D. Zeilenga
> At 11:56 AM 12/14/2002, Hallvard B Furuseth wrote:
> >http://www.openldap.org/faq/index.cgi?file=756 says one can use BDB's
> >db_dump utility for on-line backups while slapd is running.
> >The manpage continues:
> > Dumping and reloading Btree databases that use user-defined prefix or
> > comparison functions will result in new databases that use the default
> > prefix and comparison functions. In this case, it is quite likely
> > that the database will be damaged beyond repair permitting neither
> > record storage or retrieval.
> >Since db_dump is recommended, I take it back-bdb does not use Btree
> It does... but not with a user-defined comparison function.
> User-defined comparison function is only defined for index
> databases (which use DB_HASH). So, if they are slower, one
> can always recreate them using slapindex(8).
Actually, there are a number of issues here that will impact a Little-Endian
machine. BerkeleyDB's default comparison functions are all byte-oriented,
like strcmp and memcmp. When comparing integer data stored in Little-Endian
order, the data items will not sort into proper numerical order. The reason
we used a non-default comparison function was to preserve the proper sort
order without changing the stored byte-order. Note that on a Big-Endian
machine there are no problems whatsoever.
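To illustrate the sorting problem, here is a minimal sketch in plain C. It does not use BerkeleyDB at all; the helper names (put_le32, put_be32, bytewise_cmp_le, bytewise_cmp_be) are made up for this example, and 32-bit IDs are assumed just to keep it short. It writes each ID into a byte buffer in an explicit byte order and then compares the buffers with memcmp, which is how a byte-oriented default comparison behaves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit ID least-significant byte first (Little-Endian layout). */
static void put_le32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Write a 32-bit ID most-significant byte first (Big-Endian layout). */
static void put_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xff);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
}

/* Byte-wise comparison of two IDs as stored in Little-Endian order. */
int bytewise_cmp_le(uint32_t x, uint32_t y)
{
    unsigned char a[4], b[4];
    put_le32(a, x);
    put_le32(b, y);
    return memcmp(a, b, sizeof a);
}

/* Byte-wise comparison of two IDs as stored in Big-Endian order. */
int bytewise_cmp_be(uint32_t x, uint32_t y)
{
    unsigned char a[4], b[4];
    put_be32(a, x);
    put_be32(b, y);
    return memcmp(a, b, sizeof a);
}

/* Hypothetical self-check: ID 1 versus ID 256. */
void demo_byte_order_sorting(void)
{
    /* LE bytes: 01 00 00 00 vs 00 01 00 00, so 1 sorts AFTER 256. */
    assert(bytewise_cmp_le(1, 256) > 0);
    /* BE bytes: 00 00 00 01 vs 00 00 01 00, so 1 sorts before 256. */
    assert(bytewise_cmp_be(1, 256) < 0);
}
```

With the Little-Endian layout, ID 256 sorts ahead of ID 1 under a byte-wise comparison, while the Big-Endian layout gives the expected numerical order.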
The non-default comparison function affects both the id2entry database (which
is a Btree) and the index databases. Even though the index databases are
keyed with Hashes, their data are numeric (lists of entry IDs) using the
Sorted Duplicates feature. I believe, on a Little-Endian machine, using
db_dump on an index database will fail because the data items "are out of
The id2entry database is keyed on entry ID, and it does indeed use a
non-default comparison function. db_dump on a Little-Endian machine should
fail there too.
A similar problem used to exist in back-ldbm, but it only affected the
id2entry database, and it was fixed by byteswapping the entry IDs before
Given the large volume of byteswapping that would be needed to correct this
sorting issue for back-bdb's index databases, I chose to instead use an
alternate compare function. I think we should update the documentation and
note that db_dump/db_load must not be used on Intel and other Little-Endian machines.
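For reference, the idea behind an alternate compare function can be sketched as follows. This is not the actual back-bdb code: the entry_id typedef and the function names are stand-ins invented for this example. The comparison reads both IDs in the host's native byte order and compares them as numbers, so nothing needs to be byteswapped on disk. In BerkeleyDB a function like this would be wrapped in the callback form taking DBT arguments and registered with DB->set_bt_compare (Btree keys) or DB->set_dup_compare (sorted duplicates); the exact callback signature depends on the BerkeleyDB release, so it is left out here and qsort is used to exercise the comparison instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the entry ID type. */
typedef unsigned long entry_id;

/*
 * Compare two stored IDs numerically. memcpy is used so the buffers
 * do not need to be aligned for entry_id.
 */
int id_numeric_cmp(const void *a, const void *b)
{
    entry_id x, y;
    memcpy(&x, a, sizeof x);
    memcpy(&y, b, sizeof y);
    return (x > y) - (x < y);   /* -1, 0 or 1, with no overflow */
}

/* Sort an array of IDs into numerical order using the comparison above. */
void sort_ids(entry_id *ids, size_t n)
{
    qsort(ids, n, sizeof ids[0], id_numeric_cmp);
}
```

The catch is the one discussed above: a database built with such a function is only ordered correctly as long as every tool that opens it installs the same function, which db_dump and db_load do not.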
I'm pretty sure all of this was discussed on the -devel list 'way back when
we were implementing...
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support