
RE: back-bdb bdb_idl_delete_key



I thought that back-ldbm should have the same issues but now I see that
back-ldbm's index.c simply ignores the return code from key_change.

back-bdb's bdb_idl_insert_key silently turns rc = DB_KEYEXIST into rc = 0. I
don't think the sorted key-array idea will actually work, because it only
strips duplicates within a single key array. It doesn't deal with duplicates
due to other collisions (subtypes/supertypes, mainly).
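
To make the limitation concrete, here is a rough sketch of the sort-and-strip
idea. It is not back-bdb code: demo_key, demo_sort_unique and
demo_count_shared are made-up names, and plain 32-bit integers stand in for
the real hashed keys. I am also assuming that the subtype/supertype case shows
up as two separately generated key arrays going into the same index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an index key; real keys are hashed byte strings. */
typedef uint32_t demo_key;

static int demo_key_cmp(const void *a, const void *b)
{
    demo_key x = *(const demo_key *)a;
    demo_key y = *(const demo_key *)b;
    return (x > y) - (x < y);
}

/* Sort keys in place and drop adjacent duplicates; returns the new count. */
static size_t demo_sort_unique(demo_key *keys, size_t n)
{
    size_t i, out;

    assert(keys != NULL || n == 0);
    if (n < 2)
        return n;
    qsort(keys, n, sizeof(keys[0]), demo_key_cmp);
    for (i = 1, out = 1; i < n; i++) {
        if (keys[i] != keys[out - 1])
            keys[out++] = keys[i];
    }
    return out;
}

/* Count keys that appear in both arrays; both must be sorted and unique. */
static size_t demo_count_shared(const demo_key *a, size_t na,
                                const demo_key *b, size_t nb)
{
    size_t i = 0, j = 0, shared = 0;

    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {
            shared++;
            i++;
            j++;
        }
    }
    return shared;
}
```

Each array is clean after demo_sort_unique, but demo_count_shared can still
be nonzero for two arrays, and every shared key would hit the duplicate error
again when the second array is written.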

I guess we can just ignore DB_KEYEXIST for add and DB_NOTFOUND for delete. I
would only do this for key_change though. dn2id also uses the insert/delete
key routines, and the dn2id index should never have duplicates, so those
errors should still be reported there.
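
Roughly what I have in mind, as a self-contained sketch rather than a patch.
Everything here is hypothetical: the DEMO_* codes stand in for Berkeley DB's
DB_KEYEXIST and DB_NOTFOUND, struct demo_index is a toy in-memory index, and
demo_key_change only mimics the shape of the real key_change. The point is
that the low-level insert/delete routines stay strict, and only the caller
that walks a key array forgives the expected error for its operation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Berkeley DB's DB_KEYEXIST and DB_NOTFOUND. */
enum { DEMO_OK = 0, DEMO_KEYEXIST = -1, DEMO_NOTFOUND = -2, DEMO_BADKEY = -3 };
enum demo_op { DEMO_ADD, DEMO_DELETE };

#define DEMO_NSLOTS 16

/* Toy index: one flag per possible key value. */
struct demo_index {
    unsigned char present[DEMO_NSLOTS];
};

/* Strict insert: duplicates are reported, as a dn2id-style caller wants. */
static int demo_insert_key(struct demo_index *ix, unsigned key)
{
    if (key >= DEMO_NSLOTS)
        return DEMO_BADKEY;
    if (ix->present[key])
        return DEMO_KEYEXIST;
    ix->present[key] = 1;
    return DEMO_OK;
}

/* Strict delete: a missing key is reported. */
static int demo_delete_key(struct demo_index *ix, unsigned key)
{
    if (key >= DEMO_NSLOTS)
        return DEMO_BADKEY;
    if (!ix->present[key])
        return DEMO_NOTFOUND;
    ix->present[key] = 0;
    return DEMO_OK;
}

/* Apply op to every key in the array. Only the error that a repeated key
 * would cause for that op is forgiven; anything else stops the loop. */
static int demo_key_change(struct demo_index *ix, enum demo_op op,
                           const unsigned *keys, size_t nkeys)
{
    size_t i;

    assert(ix != NULL && (keys != NULL || nkeys == 0));
    for (i = 0; i < nkeys; i++) {
        int rc;

        if (op == DEMO_ADD) {
            rc = demo_insert_key(ix, keys[i]);
            if (rc == DEMO_KEYEXIST)
                rc = DEMO_OK;
        } else {
            rc = demo_delete_key(ix, keys[i]);
            if (rc == DEMO_NOTFOUND)
                rc = DEMO_OK;
        }
        if (rc != DEMO_OK)
            return rc;
    }
    return DEMO_OK;
}
```

With a key array like {1, 2, 1, 3}, demo_key_change processes all four
entries for both add and delete, while a direct demo_insert_key call on an
existing key still returns DEMO_KEYEXIST.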

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

> -----Original Message-----
> From: owner-openldap-devel@OpenLDAP.org
> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Howard Chu
> Sent: Thursday, August 29, 2002 8:10 PM
> To: openldap-devel@OpenLDAP.org
> Subject: back-bdb bdb_idl_delete_key
>
>
> The hashing algorithm used to produce index keys for attribute indices
> frequently produces the same hash key many times for the same attribute.
> (Think of a substring index on a name with repeated character sequences
> like Mississippi or abracadabra.) This causes a number of problems for us:
>   when adding an array of index keys, adding a key that has already been
>   added produces a "key exists, duplicates not allowed" error and the
>   index add of that attribute stops.
>   when deleting an array of index keys, deleting a key that was already
>   deleted produces a "notfound" error and again, the index delete stops.
>
> This means the array of index keys is only partly processed, and so certain
> attributes are only partially "matchable" when searching for them. A quick
> fix would be to ignore any KEYEXIST errors for index adds and NOTFOUND
> errors for deletes. Perhaps a better fix would be to have the indexers sort
> the key array and strip duplicates. Any suggestions?
>
> In the above example, "Mississippi" might fail to match "*ppi*" because
> indexing aborted at the second "issi" hash...
>
> (Ugh. I'm pretty sure we discussed this on this list many months ago, and
> dismissed it because it didn't seem harmful at the time. I didn't consider
> the case of the duplicates causing indexing to bail out before the entire
> set of keys was processed...)