Re: Indexing thoughts

To: Quanah Gibson-Mount <quanah@stanford.edu>
Subject: Re: Indexing thoughts
From: Howard Chu <hyc@symas.com>
Date: Fri, 06 Oct 2006 14:20:10 -0700
Cc: openldap-devel@openldap.org
In-reply-to: <B3B98E8633E08A9656CAB435@deus-ex.stanford.edu>
References: <B3B98E8633E08A9656CAB435@deus-ex.stanford.edu>
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060911 Netscape/7.2 (ax) Firefox/1.5 SeaMonkey/1.5a

Quanah Gibson-Mount wrote:

Stanford has been looking into implementing the value-sort attributes overlay and has run into a problem with it when the attribute to be weighted is also indexed. The problem stems from this:
The unweighted form of the data could look like:
suaffiliation: stanford:staff
suaffiliation: stanford:faculty
etc
The weighted form of the data looks like:
suaffiliation: {2}stanford:staff
suaffiliation: {1}stanford:faculty
This means that if the attribute is indexed (in my case, with an eq index), filters of the form (suaffiliation=stanford:staff) can no longer be used. That is a problem, because many applications use exactly that kind of filter, and internal ACLs use it as well.
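
To make the mismatch concrete, here is a minimal sketch in Python. It assumes the weight prefix always has the form "{n}" shown above, and it models the eq index as a plain set of stored strings, which is a toy stand-in and not how slapd actually builds its indices. The names strip_weight and build_eq_index are hypothetical.

```python
import re

# Assumption: a weight prefix is a leading "{n}" with n a non-negative integer.
WEIGHT_PREFIX = re.compile(r"^\{\d+\}")

def strip_weight(value):
    """Return the value without its leading {weight} prefix, if any."""
    return WEIGHT_PREFIX.sub("", value, count=1)

def build_eq_index(values):
    """Toy equality index: just the set of strings that were stored."""
    return set(values)

stored = ["{2}stanford:staff", "{1}stanford:faculty"]

# Index built over the weighted form, as stored.
weighted_index = build_eq_index(stored)
# Index built over the unweighted form of the same values.
unweighted_index = build_eq_index(strip_weight(v) for v in stored)

print("stanford:staff" in weighted_index)    # False
print("stanford:staff" in unweighted_index)  # True
```

The assertion value from (suaffiliation=stanford:staff) only matches when the index was built from the unweighted form, which is why indexing both forms comes up as a possible fix.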

I thought a potential solution (not possible at this time, per Howard) would be to support something like multiple indices (sub-indices?) that index the data in both its weighted and unweighted forms when the val-sort overlay is present. It could also be useful for other overlays, though I'm not sure how. In any case, the ability to have indexing behave differently based on different factors seems potentially useful.

In addition, things get more complicated when telephoneNumber (and attributes using its syntax) are involved, mostly because "{" and "}" are not valid characters in that syntax, so the attribute cannot be sorted via weights. I guess the solution would be the ability to adjust the SYNTAX based on overlays, but that sounds ugly.

Anyhow, these are just some thoughts; I don't know how practical implementing such a thing would really be.

It strikes me that what you're after is similar in spirit to the component-based matching feature; it's possible the existing work could be adapted to these needs as well.

Similar issues arise with the X-ORDERED 'VALUES' extension being used in back-config. While there's no indexing to consider in that context, the fact that "{}" are not legal in various syntaxes makes the spec a bit troublesome. (The implementation just strips the prefix before validation occurs, to avoid the issue, but that doesn't seem to be a very portable or friendly thing to do.)
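
As a rough illustration of the strip-then-validate workaround, the following Python sketch uses a simplified PrintableString check as a stand-in for a syntax that rejects braces (telephoneNumber is based on PrintableString). This is not the back-config code; the function names and the sample value are made up for the example.

```python
import re

# Simplified PrintableString alphabet; "{" and "}" are not part of it.
PRINTABLE = re.compile(r"[A-Za-z0-9'()+,\-./:=? ]+")
# Assumption: an ordering prefix is a leading "{n}".
ORDER_PREFIX = re.compile(r"^\{\d+\}")

def is_printable_string(value):
    """True if every character of value is in the PrintableString alphabet."""
    return PRINTABLE.fullmatch(value) is not None

def validate_ordered(value):
    """Strip a leading {n} prefix, then validate what remains."""
    bare = ORDER_PREFIX.sub("", value, count=1)
    return is_printable_string(bare)

print(is_printable_string("{0}+1 650 555 0100"))  # False
print(validate_ordered("{0}+1 650 555 0100"))     # True
```

The prefixed value only passes because the validator never sees the prefix, which is exactly the part that is neither portable nor friendly.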

Kurt suggested the creation of a new syntax that wraps the actual syntax as a possible solution, allowing the prefix to be an optional component followed by the actual value, each governed by its own syntax.
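
A wrapping syntax of this kind might be pictured as follows. This Python sketch is only one possible reading of the suggestion: parse_wrapped and is_nonempty_ascii are hypothetical names, and the "{n}" prefix form is assumed from the examples in this thread.

```python
import re

# Assumption: the optional prefix component is a leading "{n}".
PREFIX = re.compile(r"^\{(\d+)\}")

def is_nonempty_ascii(value):
    """Placeholder inner-syntax validator, used only for this example."""
    return value != "" and value.isascii()

def parse_wrapped(value, validate_inner):
    """Split value into (order, inner).

    order is None when no prefix is present. The inner value is checked
    by its own validator, independently of the prefix.
    """
    m = PREFIX.match(value)
    order = int(m.group(1)) if m else None
    inner = value[m.end():] if m else value
    if not validate_inner(inner):
        raise ValueError("inner value fails its own syntax: %r" % inner)
    return order, inner

print(parse_wrapped("{1}stanford:faculty", is_nonempty_ascii))  # (1, 'stanford:faculty')
print(parse_wrapped("stanford:staff", is_nonempty_ascii))       # (None, 'stanford:staff')
```

Because the inner validator is a parameter, the same wrapper could sit in front of any existing syntax without that syntax needing to know about braces.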

The one concern I have with this approach is that it doesn't allow us to simply add the desired behavior onto existing attributeTypes, as we currently do.

There's another related issue that worries me here; we're planning to overhaul the workings of multi-valued attributes in general. Keeping them fully sorted will allow us to get rid of all the linear searching we currently do to find individual values. But this type of sorting is going to be completely at odds with the static ordering that X-ORDERED 'VALUES' preserves. I guess we can manage that indirectly though...
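
One way to picture handling this indirectly is to keep the values sorted for lookup and carry a separate list of positions that records the static order. The Python sketch below is only an assumption about what such a scheme could look like, not a description of any planned slapd design; the class name OrderedValues is hypothetical, and it relies on the values of an attribute being distinct.

```python
from bisect import bisect_left

class OrderedValues:
    """Values kept sorted for lookup, plus a side list preserving static order."""

    def __init__(self, values_in_static_order):
        # Sorted copy, so membership tests can use binary search.
        self.sorted_vals = sorted(values_in_static_order)
        # For each position in the static order, the index into sorted_vals.
        # Assumes all values are distinct.
        self.static_order = [
            bisect_left(self.sorted_vals, v) for v in values_in_static_order
        ]

    def contains(self, value):
        """Binary search instead of a linear scan."""
        i = bisect_left(self.sorted_vals, value)
        return i < len(self.sorted_vals) and self.sorted_vals[i] == value

    def in_static_order(self):
        """Recover the original ordering through the index list."""
        return [self.sorted_vals[i] for i in self.static_order]

vals = OrderedValues(["zeta", "alpha", "mid"])
print(vals.sorted_vals)        # ['alpha', 'mid', 'zeta']
print(vals.contains("mid"))    # True
print(vals.in_static_order())  # ['zeta', 'alpha', 'mid']
```

The cost of the indirection is one extra index list per attribute, which has to be updated whenever a value is added or removed.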

--
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/

References:
- Indexing thoughts
  - From: Quanah Gibson-Mount <quanah@stanford.edu>
