
Re: substring index oddity





--On Wednesday, August 24, 2005 11:08 AM -0500 John Madden <jmadden@ivytech.edu> wrote:

"uid=*0371*" dn
# numResponses: 125
# numEntries: 124
real    0m0.052s

Further research on the "allidsthreshold" concept mentioned in the old list thread led me to SLAPD_LDBM_MIN_MAXIDS, which, at 8192-4, is likely too low for a million objects that were created sequentially. Unfortunately, I'm running Debian for a reason -- going back to compiling from source (as I do now) is a last resort.

(Since I'm using bdb, is the #define even relevant?)

The, uh, "good news" is that the numEntries I'm seeing for the "bad"
query is far below 8188, just 1111.  So perhaps this isn't an allids
problem?  For reference, the searches with numEntries:

uid=*222* : 29 seconds
# numEntries: 3700
uid=*222 : 0.063 seconds
# numEntries: 1000
uid=*2*22 : 0.14 seconds
# numEntries: 3439

And then, just for fun I did:

uid=*2 : 29 seconds
# numEntries: 100000
uid=*22 : 0.41 seconds
# numEntries: 10000

...So 10,000 entries can be returned off an index search, well over the
8188.  Is there another allids-like limit someplace?

The docs are quite clear that, by default, substring indexing starts at a minimum of 3 characters. So the "*2" and "*22" substring searches will not use the index at all unless you've tweaked this.


BTW, if you have your loglevel up to around 256, do you see this message?

bdb_substring_candidates: (uid) index_param failed (18)

That would indicate that BDB is not using the substring index for your query. However, I don't think this is really your issue.
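If you want to check a captured log for that message rather than eyeball it, something along these lines would do. This is only a sketch: the regular expression is built from the one message quoted above, the function name is made up for illustration, and how you get the log lines (file, syslog, stderr) depends on your setup.

```python
import re

# Built from the single message quoted above; the attribute name and the
# numeric code in parentheses are captured.
FAILED = re.compile(
    r"bdb_substring_candidates: \(([^)]+)\) index_param failed \((\d+)\)"
)


def unindexed_substring_attrs(log_lines):
    """Map each attribute named in a matching line to the set of codes seen.

    Hypothetical helper, not part of OpenLDAP.
    """
    hits = {}
    for line in log_lines:
        match = FAILED.search(line)
        if match:
            hits.setdefault(match.group(1), set()).add(int(match.group(2)))
    return hits


if __name__ == "__main__":
    # Illustrative input: the message from this thread plus an unrelated line.
    sample = [
        "bdb_substring_candidates: (uid) index_param failed (18)",
        "an unrelated log line",
    ]
    print(unindexed_substring_attrs(sample))
```

An empty result only means the message was not logged in the lines you passed in; it does not by itself prove the index was used.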

On my 2.3.6 system "sn=*unt*" takes up 32% of my CPU and is extremely slow.

0.10u 0.06s 1:52.30 0.1%

with only 418 entries returned, whereas "sn=*unt" takes 0.23 seconds and returns 142 entries.

Searching for "sn=*ount*" is also lightning fast.

So I'm guessing that a three-character "*XXX*" pattern is one character too short to use the substring index. That may or may not be by design.
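To make that guess concrete, here is a toy model in Python. It is not how slapd decides anything. The two thresholds (`edge_min` for a leading or trailing piece, `any_min` for a piece with wildcards on both sides) are hypothetical names, and their values are assumptions picked only because they fit the timings reported in this thread.

```python
def split_substring(pattern):
    """Split 'ab*cd*ef' into (initial, [middle pieces], final)."""
    if "*" not in pattern:
        raise ValueError("not a substring pattern: %r" % pattern)
    parts = pattern.split("*")
    initial, final = parts[0], parts[-1]
    # Empty middle pieces come from adjacent wildcards and carry no text.
    middle = [p for p in parts[1:-1] if p]
    return initial, middle, final


def can_use_index(pattern, edge_min=2, any_min=4):
    """Guess whether any piece of the pattern is long enough to be indexed.

    edge_min and any_min are assumed values fitted to this thread's
    observations, not numbers taken from the documentation.
    """
    initial, middle, final = split_substring(pattern)
    return (
        len(initial) >= edge_min
        or len(final) >= edge_min
        or any(len(piece) >= any_min for piece in middle)
    )


if __name__ == "__main__":
    for pattern in ["*0371*", "*222*", "*222", "*2*22", "*2", "*22",
                    "*unt*", "*unt", "*ount*"]:
        print(pattern, can_use_index(pattern))
```

Under these assumed thresholds, the patterns the model flags as unable to use the index ("*222*", "*2", "*unt*") are exactly the slow searches reported above, and the rest are the fast ones. That agreement is suggestive, not proof of what the server actually does.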

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html

"These censorship operations against schools and libraries are stronger
than ever in the present religio-political climate. They often focus on
fantasy and sf books, which foster that deadly enemy to bigotry and blind
faith, the imagination." -- Ursula K. Le Guin