
Re: SLAP_INDEX_SUBSTR_ANY_LEN & co



Hallvard B Furuseth wrote:

Howard Chu writes:


Ah, before you go there, we have a patch I've been meaning to commit
that makes all of these lengths configurable in slapd.conf.



Great!

But if the answer is simple and the patch will wait some days, could
you explain anyway? I'll probably try to play with these constants
tomorrow.


OK.

You already know what IF_MINLEN and ANY_LEN are for; they control both index generation (which occurs when attributes are stored) and index lookup (which occurs when search filters are evaluated). The other two values only affect index lookup:

IF_MAXLEN is the maximum number of characters used for initial/final indexing. So, given these constants (IF_MAXLEN is 4), the filters "cn=abcd*" and "cn=abcdefgh*" both generate the same list of candidates, because the characters beyond IF_MAXLEN are ignored when generating the filter index keys.
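
As a rough sketch of that truncation (not the slapd code: the helper name initial_key_text is made up for illustration, the IF_MINLEN/IF_MAXLEN macros just mirror the slap.h defaults, and the real keys are hashes rather than raw text), the part of an initial filter value that feeds the key could be computed like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IF_MINLEN 2  /* mirrors SLAP_INDEX_SUBSTR_IF_MINLEN */
#define IF_MAXLEN 4  /* mirrors SLAP_INDEX_SUBSTR_IF_MAXLEN */

/*
 * Hypothetical helper: copy the part of an initial-substring filter value
 * that would contribute to the index key into out (NUL-terminated).
 * Returns the number of characters used, or 0 if the value is shorter
 * than IF_MINLEN and is assumed not to use the index at all.
 */
size_t initial_key_text(const char *value, char *out, size_t outsize)
{
    size_t len;

    assert(value != NULL && out != NULL);
    assert(outsize > IF_MAXLEN);        /* room for IF_MAXLEN chars + NUL */

    len = strlen(value);
    if (len < IF_MINLEN) {
        out[0] = '\0';
        return 0;
    }
    if (len > IF_MAXLEN)
        len = IF_MAXLEN;                /* characters past IF_MAXLEN are ignored */
    memcpy(out, value, len);
    out[len] = '\0';
    return len;
}
```

Calling it with "abcd" and with "abcdefgh" leaves the same four characters in out, which is why both filters end up with the same candidate list.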

ANY_STEP has to do with the sliding window that is used to generate the substring index keys for a value. For example, when indexing the attribute "cn=abcdefgh" with an ANY_LEN of 4 and a STEP size of 2, a hash key is generated for each of these parts:
abcd
cdef
efgh


The search candidate list is the intersection of (all entries matching the first hash key, all entries matching the next key, and all entries matching the last key). Changing the STEP size therefore controls how many hash buckets will be examined when doing a search.
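
A minimal sketch of the window stepping and the intersection, assuming keys are looked up per window and each lookup yields an ascending list of entry IDs (any_windows and intersect_ids are hypothetical names for illustration, not slapd functions, and the hashing step is left out):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ANY_LEN  4  /* mirrors SLAP_INDEX_SUBSTR_ANY_LEN */
#define ANY_STEP 2  /* mirrors SLAP_INDEX_SUBSTR_ANY_STEP */

/*
 * Hypothetical helper: write the ANY_LEN-character windows of value,
 * taken every ANY_STEP characters, into windows[] (at most max of them).
 * Returns the number of windows written.
 */
size_t any_windows(const char *value, char windows[][ANY_LEN + 1], size_t max)
{
    size_t len, i, n = 0;

    assert(value != NULL && windows != NULL);

    len = strlen(value);
    for (i = 0; i + ANY_LEN <= len && n < max; i += ANY_STEP) {
        memcpy(windows[n], value + i, ANY_LEN);
        windows[n][ANY_LEN] = '\0';
        n++;
    }
    return n;
}

/*
 * Intersect two ascending lists of entry IDs. out must have room for
 * at least min(na, nb) IDs. Returns the number of IDs stored.
 */
size_t intersect_ids(const unsigned *a, size_t na,
                     const unsigned *b, size_t nb, unsigned *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;
        } else if (a[i] > b[j]) {
            j++;
        } else {
            out[n++] = a[i];            /* ID present in both lists */
            i++;
            j++;
        }
    }
    return n;
}
```

For "abcdefgh" any_windows produces abcd, cdef and efgh. The candidate list would then be built by intersecting the ID list for each window in turn, so a larger step means fewer lookups and fewer intersections.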


I should point out that our patch also fixes the initial/final behavior: if a filter is provided that exceeds the MAXLEN, we no longer ignore the excess characters. Instead we combine them with an ANY substring index lookup, so that
cn=abcdefgh*
is internally equivalent to
cn=abcd*defgh*


Naturally this doesn't work if subany indexing was not used...



How do these slap.h constants work?  I take it IF_MINLEN means that
initial/final substrings of 2 chars or more are indexed, and ANY_LEN
means substrings of 4 chars or more are indexed.  But what do IF_MAXLEN
and ANY_STEP do, and how do these constants interact?

What I want is to index substrings of 3 or more chars, so I'll modify at
least ANY_LEN - but I don't know what else.  I suspect I asked before,
but buried the answer somewhere:-(

/* constants for initial/final substrings indices */
#ifndef SLAP_INDEX_SUBSTR_IF_MINLEN
# define SLAP_INDEX_SUBSTR_IF_MINLEN	2
#endif
#ifndef SLAP_INDEX_SUBSTR_IF_MAXLEN
# define SLAP_INDEX_SUBSTR_IF_MAXLEN	4
#endif

/* constants for any substrings indices */
#ifndef SLAP_INDEX_SUBSTR_ANY_LEN
# define SLAP_INDEX_SUBSTR_ANY_LEN 4
#endif
#ifndef SLAP_INDEX_SUBSTR_ANY_STEP
# define SLAP_INDEX_SUBSTR_ANY_STEP 2
#endif



-- Hallvard






--
 -- Howard Chu
 Chief Architect, Symas Corp.       Director, Highland Sun
 http://www.symas.com               http://highlandsun.com/hyc
 Symas: Premier OpenSource Development and Support