RE: back-bdb bdb_idl_delete_key
I thought that back-ldbm should have the same issues but now I see that
back-ldbm's index.c simply ignores the return code from key_change.
back-bdb's bdb_idl_insert_key silently turns rc = DB_KEYEXIST into rc = 0. I
don't think the sorted key-array idea will actually work, because it doesn't
deal with duplicates due to other collisions (subtypes/supertypes, mainly).
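For reference, a minimal sketch of the sort-and-strip idea from the original message. The struct and function names are hypothetical (slapd's real keys are struct berval arrays), and nothing here is taken from the back-bdb source. It only removes duplicates inside one key array; a key that reaches the same index from a different array (say, via a supertype) would presumably still collide in the database, which is the objection above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one index key (slapd uses struct berval). */
struct sketch_key {
    size_t len;
    const unsigned char *val;
};

/* Order keys by length first, then by content, so equal keys end up adjacent. */
int sketch_key_cmp(const void *a, const void *b)
{
    const struct sketch_key *x = a;
    const struct sketch_key *y = b;

    if (x->len != y->len)
        return x->len < y->len ? -1 : 1;
    if (x->len == 0)
        return 0;
    return memcmp(x->val, y->val, x->len);
}

/* Sort keys[0..n) in place and drop adjacent duplicates.
 * Returns the number of unique keys left at the front of the array. */
size_t sketch_dedup_keys(struct sketch_key *keys, size_t n)
{
    size_t i, out;

    assert(keys != NULL || n == 0);
    if (n < 2)
        return n;

    qsort(keys, n, sizeof(keys[0]), sketch_key_cmp);

    out = 1;
    for (i = 1; i < n; i++) {
        /* Keep keys[i] only if it differs from the last key kept. */
        if (sketch_key_cmp(&keys[out - 1], &keys[i]) != 0)
            keys[out++] = keys[i];
    }
    return out;
}
```

Run over the four keys "issi", "ssis", "issi", "ppi" (raw substrings here, not real hashes), this leaves three unique keys.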
I guess we can just ignore DB_KEYEXIST for add and DB_NOTFOUND for delete. I
would only do this for key_change though. dn2id also uses the insert/delete
key routines, and the dn2id index should never have duplicates.
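A rough sketch of what that could look like, assuming a toy in-memory store in place of the real database. Everything prefixed sketch_ or SKETCH_ is hypothetical; in particular SKETCH_KEYEXIST and SKETCH_NOTFOUND only stand in for Berkeley DB's DB_KEYEXIST and DB_NOTFOUND, and this is not the actual key_change code. The point is that the tolerance lives in the key_change-style loop, while the insert/delete routines still report duplicates to other callers such as dn2id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for DB_KEYEXIST / DB_NOTFOUND; real values come from <db.h>. */
enum { SKETCH_KEYEXIST = -1, SKETCH_NOTFOUND = -2, SKETCH_FULL = -3 };
enum sketch_op { SKETCH_ADD, SKETCH_DELETE };

#define SKETCH_MAX_KEYS 16

/* Toy index that, like the real one, rejects duplicate keys. */
struct sketch_store {
    const char *keys[SKETCH_MAX_KEYS];
    size_t count;
};

/* Returns the position of key, or s->count if it is absent. */
static size_t sketch_find(const struct sketch_store *s, const char *key)
{
    size_t i;
    for (i = 0; i < s->count; i++)
        if (strcmp(s->keys[i], key) == 0)
            break;
    return i;
}

/* Strict insert: still reports duplicates to the caller. */
int sketch_insert_key(struct sketch_store *s, const char *key)
{
    if (sketch_find(s, key) < s->count)
        return SKETCH_KEYEXIST;
    if (s->count == SKETCH_MAX_KEYS)
        return SKETCH_FULL;
    s->keys[s->count++] = key;
    return 0;
}

/* Strict delete: still reports missing keys to the caller. */
int sketch_delete_key(struct sketch_store *s, const char *key)
{
    size_t i = sketch_find(s, key);
    if (i == s->count)
        return SKETCH_NOTFOUND;
    s->keys[i] = s->keys[--s->count];   /* fill the hole with the last key */
    return 0;
}

/* Tolerant loop: a duplicate on add or a miss on delete does not abort
 * processing of the remaining keys; any other error still stops it. */
int sketch_key_change(struct sketch_store *s, enum sketch_op op,
                      const char *const *keys, size_t nkeys)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < nkeys; i++) {
        int rc = (op == SKETCH_ADD) ? sketch_insert_key(s, keys[i])
                                    : sketch_delete_key(s, keys[i]);
        if (op == SKETCH_ADD && rc == SKETCH_KEYEXIST)
            rc = 0;
        else if (op == SKETCH_DELETE && rc == SKETCH_NOTFOUND)
            rc = 0;
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

Doing the filtering in the loop rather than inside the insert routine keeps the strict behaviour available for callers that should never see a duplicate.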
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
Symas: Premier OpenSource Development and Support
> -----Original Message-----
> From: owner-openldap-devel@OpenLDAP.org
> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Howard Chu
> Sent: Thursday, August 29, 2002 8:10 PM
> To: openldap-devel@OpenLDAP.org
> Subject: back-bdb bdb_idl_delete_key
> The hashing algorithm used to produce index keys for attribute indices
> frequently produces the same hash key many times for the same attribute.
> (Think of a substring index on a name with repeated character sequences
> like Mississippi or abracadabra.) This causes a number of problems for us:
>   - when adding an array of index keys, adding a key that has already been
>     added produces a "key exists, duplicates not allowed" error and the
>     index add of that attribute stops.
>   - when deleting an array of index keys, deleting a key that was already
>     deleted produces a "notfound" error and again, the index delete stops.
> This means the array of index keys is only partly processed, and so certain
> attributes are only partially "matchable" when searching for them. A quick
> fix would be to ignore any KEYEXIST errors for index adds and NOTFOUND
> errors for deletes. Perhaps a better fix would be to have the indexers sort
> the key array and strip duplicates. Any suggestions?
> In the above example, "Mississippi" might fail to match "*ppi*" because
> indexing aborted at the second "issi" hash...
> (Ugh. I'm pretty sure we discussed this on this list many months ago, and
> dismissed it because it didn't seem harmful at the time. I didn't consider
> the case of the duplicates causing indexing to bail out before the entire
> set of keys was processed...)
> -- Howard Chu
> Chief Architect, Symas Corp. Director, Highland Sun
> http://www.symas.com http://highlandsun.com/hyc
> Symas: Premier OpenSource Development and Support