
Odd performance problem with substring index



I have an odd problem with OpenLDAP 2.0.7 that I'd like some input on.


I'm setting up a directory with approximately a million entries in it.  My
problem is that the performance of a substring index seems somewhat
arbitrary, depending on where in the search string I put globs (*) --
_sometimes_ OpenLDAP seems to do a linear search, even though the attribute
searched on is indexed.


To be more concrete, the relevant part of my directory structure looks like
this:

dc=com
  dc=domain1,dc=com
  dc=domain2,dc=com
    uid=foo,dc=domain2,dc=com
     - uid: foo
     - mail: foo@domain2.com
     - ...

Now, I have eq and sub indexes on the mail attribute, so I expect the
following searches to return the correct result pretty quickly:

1. (mail=foo@domain2.com)
2. (mail=*foo@domain2.com*)
3. (mail=*foo@domain2.com)
4. (mail=foo@domain2.com*)
5. (mail=foo@doma*in2.com)
6. (mail=foo@*domain2.com)
7. (mail=f*oo@domain2.com)

My observation is that only searches 1 and 2 respond quickly (<0.1 sec),
while the others take several minutes, about the same time as a search
on a non-indexed attribute.

However, if I add another glob to searches 5-7, I can get better
performance.  The following perform quickly:

(mail=*foo@domain2*.com)
(mail=*foo@doma*in2.com)
(mail=f*oo@domain2.com*)

...while the following are very slow:

(mail=*foo*@domain2.com)
(mail=foo@domain2*.com*)


Is this a known issue?  If so, is there a general workaround I can rely on?


Apart from this issue, I'm happy with the performance of OpenLDAP with a
million entries.  Even the write performance, with a couple of indexes and
with the database files on a RAID5 volume, is pretty good.

-- 
   Øyvind Møll
   oyvindmo@initio.no
   http://www.initio.no/