Re: substring index oddity
--On Wednesday, August 24, 2005 12:11 PM -0700 Quanah Gibson-Mount
--On Wednesday, August 24, 2005 1:46 PM -0500 John Madden
It is quite clear in the docs that the default minimum substring
indexing starts at 3 characters. So the "*2" and the "*22" substring
searches will not be using the index at all unless you've tweaked this.
No, I've made no mods. So "*22" shouldn't be on an index, yet it's quite
fast. That does explain why "*2" is slow though.
BTW, if you have your loglevel up to around 256, do you see this message?
bdb_substring_candidates: (uid) index_param failed (18)
Nope, no such messages.
So I'm guessing that "*XXX*" is one character short, index-wise. That
may or may not be by design.
It seems that having the glob on the end of the string is perhaps related
to things being slow, although I've done so many tests I don't remember
I'm guessing it is this:
index_substr_any_step <integer>
Specify the steps used in subany index lookups. This
value sets the offset for the segments of a filter
string that are processed for a subany index lookup.
The default is 2. For example, with the default values,
a search using this filter "cn=*abcdefgh*" would
generate index lookups for "abcd", "cdef", and "efgh".
So something like "*lee*" would just generate "lee" and "e" if I'm
reading it right, and then the "e" search would fail...
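
To make the man page text above concrete, here is a minimal Python sketch of how the filter string might be cut into segments for subany lookups. It only models the documented description (fixed-length segments taken at a fixed step); it is not slapd's actual code, and the function name subany_segments and its parameter names are made up for illustration.

```python
def subany_segments(value, any_len=4, any_step=2):
    """Model of the documented subany segmentation (not slapd's real code).

    any_len  stands in for index_substr_any_len  (documented default 4)
    any_step stands in for index_substr_any_step (documented default 2)
    """
    # Assumption: a filter string shorter than any_len yields no
    # index lookups at all, so the search cannot use the subany index.
    if len(value) < any_len:
        return []
    # Take a full-length window every any_step characters.
    last_start = len(value) - any_len
    return [value[i:i + any_len] for i in range(0, last_start + 1, any_step)]


if __name__ == "__main__":
    print(subany_segments("abcdefgh"))        # the man page example
    print(subany_segments("lee"))             # 3 characters, default length 4
    print(subany_segments("lee", any_len=3))  # same filter, length set to 3
```

Under this model, "*abcdefgh*" gives the three segments from the man page, while "*lee*" gives none at the default length of 4 and a single "lee" segment when the length is 3.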
Actually, looking it over, I'm guessing it is this:
index_substr_any_len <integer>
Specify the length used for subany indices. An
attribute value must have at least this many characters
in order to be processed. Attribute values longer than
this length will be processed in segments of this
length. The default is 4. The subany index will also be
used in subinitial and subfinal index lookups when the
filter string is longer than the index_substr_if_maxlen value.
I bet if you changed that to "3" from "4" it would work right...
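
If that guess is right, the change would be a single directive in slapd.conf. This is a sketch only, untested here, and it assumes your slapd version accepts index_substr_any_len. The existing substring indexes would most likely need to be regenerated (for example with slapindex) before the new segment length takes effect.

```
# slapd.conf fragment (sketch): shorten subany segments from the
# documented default of 4 to 3 so that "*XXX*" filters can use the index
index_substr_any_len 3
```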
Principal Software Developer
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html