Re: substring index oddity





--On Wednesday, August 24, 2005 1:46 PM -0500 John Madden <jmadden@ivytech.edu> wrote:

It is quite clear in the docs that the default minimum substring indexing
starts at 3 characters.  So the "*2" and the "*22" substring searches
will not be using the index at all unless you've tweaked this.

No, I've made no mods. So "*22" shouldn't be on an index, yet it's quite fast. That does explain why "*2" is slow though.

BTW, if you have your loglevel up to around 256, do you see this message?

bdb_substring_candidates: (uid) index_param failed (18)

Nope, no such messages.

So I'm guessing that "*XXX*" is one character short index-wise.  That may
or may not be by design.

It seems that having the glob at the end of the string may be related to the slowness, although I've run so many tests that I don't remember clearly.

I'm guessing it is this:

    index_substr_any_step <integer>
         Specify the steps used in subany  index  lookups.  This
         value  sets  the  offset  for  the segments of a filter
         string that are processed for a  subany  index  lookup.
         The default is 2. For example, with the default values,
         a  search  using  this  filter  "cn=*abcdefgh*"   would
         generate index lookups for "abcd", "cdef", and "efgh".


because these seem like they wouldn't apply:

    index_substr_if_minlen <integer>
         Specify the minimum length for subinitial and  subfinal
         indices.  An  attribute  value  must have at least this
         many  characters  in  order  to  be  processed  by  the
         indexing functions. The default is 2.

    index_substr_if_maxlen <integer>
         Specify the maximum length for subinitial and  subfinal
         indices.  Only  this  many  characters  of an attribute
         value will be processed by the indexing functions;  any
         excess characters are ignored. The default is 4.


So something like "*lee*" would just generate "lee" and "e" if I'm reading it right, and then the "e" search would fail...
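
To make the stepping in the index_substr_any_step excerpt concrete, here is a minimal Python sketch of how the filter string might be cut into lookup segments. It is not slapd's code. The function name subany_segments is hypothetical, and the segment length of 4 is an assumption inferred from the "abcd", "cdef", "efgh" example in the man page text (it is not stated in the excerpts quoted here).

```python
def subany_segments(value, any_len=4, any_step=2):
    """Sketch of subany segmenting as described in the man page excerpt.

    any_len (segment length) is an ASSUMED value of 4, inferred from the
    documented example; any_step=2 is the documented default.
    """
    # Assumption: a string shorter than one full segment yields no lookups.
    if len(value) < any_len:
        return []
    # Take a full-length window every any_step characters.
    last_start = len(value) - any_len
    return [value[i:i + any_len] for i in range(0, last_start + 1, any_step)]


print(subany_segments("abcdefgh"))  # ['abcd', 'cdef', 'efgh']
print(subany_segments("lee"))       # []
```

Under these assumptions "*abcdefgh*" reproduces the three lookups from the documentation, while a three-character string such as "lee" is shorter than one segment and produces none, which would be consistent with "*XXX*" not being served by the index.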


--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html