Re: Search speed regarding BDB_IDL_LOGN and order in search filter

To: Howard Chu <hyc@symas.com>
Subject: Re: Search speed regarding BDB_IDL_LOGN and order in search filter
From: Meike Stone <meike.stone@googlemail.com>
Date: Thu, 31 Jan 2013 19:04:53 +0100
Cc: openldap-technical@openldap.org
In-reply-to: <510AA4A6.4010304@symas.com>
References: <CAFNHiA9jdwLi0qNTuUCvtLteca0_cXpycFAFYiciR_tDua30KA@mail.gmail.com> <510AA4A6.4010304@symas.com>

Hello Howard,

Thanks for the fast answer!

>> - An index slot loses precision if the search result for an
>> (indexed) attribute is larger than 2^16. The search time then
>> increases a lot.
>> - I can change this via BDB_IDL_LOGN.
>> - But if I have a directory that holds 200,000 employees with
>> '(objectClass=Employees)', the result is larger than 2^16 and it is
>> slow.
>> - Let's say the employees are distributed over 4 continents and the
>> DIT is structured geographically, e.g.:
>>
>>       o=myOrg, c=us (100,000 employees)
>>       o=myOrg, c=gb ( 30,000 employees)
>>       o=myOrg, c=de ( 25,000 employees)
>>       o=myOrg, c=br ( 45,000 employees)
>>
>> Can I avoid this problem with the index slot size if I change the
>> search base to "o=myOrg, c=gb", because there are only 30,000
>> employees there?
>
>
> Try it and see.
>
> If you're already playing with low-level definitions in the source code, you
> have no need for us to answer these questions. Or, if you need us to answer
> these questions, you have no business playing with low-level definitions in
> the source code.
>

I'm the Linux admin of these systems and I'm responsible for all the
services such as LDAP, MySQL, ...
In our company we have a few programmers who take care of the data for
our LDAP server. Suddenly slapd became slow, and I found the solution
here:
http://www.openldap.org/lists/openldap-technical/201101/msg00102.html
So I increased BDB_IDL_LOGN.
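
To make the 2^16 limit concrete, here is a small arithmetic sketch in
Python. It only assumes what is stated in this thread: a slot loses
precision once a term matches more than 2^BDB_IDL_LOGN entries. The
function names are made up for illustration and are not part of
OpenLDAP.

```python
def slot_capacity(logn=16):
    """Entry IDs a slot can hold exactly, assumed to be 2**logn."""
    return 1 << logn


def terms_losing_precision(term_counts, logn=16):
    """Return the filter terms whose match count exceeds the slot capacity."""
    limit = slot_capacity(logn)
    return [term for term, count in term_counts.items() if count > limit]


# Match counts taken from the examples in this thread.
counts = {
    "(objectClass=Employees)": 200_000,
    "(age=30)": 10_000,
    "(sex=m)": 3_000,
}

for logn in (16, 17, 18):
    print(logn, slot_capacity(logn), terms_losing_precision(counts, logn))
```

With these numbers, only the objectClass term is above the limit at 16
and 17, and none is at 18. Whether a narrower search base changes this
is a separate question that the sketch does not answer.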

But now, months later, answers are slow again.
So I wrote a small Perl script that searches the log file (with
loglevel 256) for the "longtimers".
I see that a lot of searches take a long time (up to 500 s), and I want
to understand why.
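
The idea behind such a script can be sketched as follows (in Python
here, not the Perl script mentioned above). It assumes syslog-style
lines with a "Mon DD HH:MM:SS" timestamp and stats-level entries of the
form "conn=N op=N SRCH base=..." and "conn=N op=N SEARCH RESULT ...".
The year is not in the log, so it is passed in as an assumption, and
the 10 second threshold is arbitrary.

```python
import re
from datetime import datetime

# Timestamp, then anything, then "conn=N op=N <rest of the message>".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s.*?"
    r"conn=(?P<conn>\d+)\s+op=(?P<op>\d+)\s+(?P<rest>.*)$"
)


def parse_ts(text, year):
    """Parse a syslog timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def slow_searches(lines, threshold_s=10.0, year=2013):
    """Pair SRCH and SEARCH RESULT lines by (conn, op), slowest first."""
    started = {}
    slow = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        key = (match["conn"], match["op"])
        rest = match["rest"]
        stamp = parse_ts(match["ts"], year)
        if rest.startswith("SRCH base="):
            started[key] = (stamp, rest)
        elif rest.startswith("SEARCH RESULT") and key in started:
            begin, request = started.pop(key)
            elapsed = (stamp - begin).total_seconds()
            if elapsed >= threshold_s:
                slow.append((elapsed, request))
    return sorted(slow, key=lambda item: item[0], reverse=True)
```

Timestamps in such a log only have one-second resolution, and a log
that spans a change of year would need extra handling.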

I'm trying to understand how all the internals work, but it is hard
for a non-C programmer like me. A book about all of this would be very
much appreciated.
I understand how the LDAP operations work, but not how they are
implemented or what I must take care of so that they run fast.
Our programmer only says: "This is my query (an LDAP search), it is
well formed, so why must I wait so long for an answer? Didn't we build
an index for all the needed attributes?"

>
>> This takes me to the second question:
>>
>> "How is a search filter evaluated?"
>>
>> Let's say I combine three filter terms via "and", like
>> '(&(objectClass=Employees)(age=30)(sex=m))', and all the attributes
>> are indexed. Each term returns:
>> (objectClass=Employees) => 200,000 entries
>> (age=30)  => 10,000 entries
>> (sex=m)    => 3,000 entries
>>
>> Does the order matter regarding speed? Is it better to form the
>> filter like this?
>>   '(&(sex=m)(age=30)(objectClass=Employees))'
>
>
> Try it and see.
>
> If you have advance knowledge of the characteristics of your data, perhaps
> you can optimize the filter order. In most cases, your applications will not
> have such knowledge, or it will be irrelevant. For example, if you have a
> filter term that matches zero entries, it is beneficial to evaluate that
> first in an AND clause. But it would make no difference in an OR clause.
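
To check that I understand the point about the zero-match term, here is
a toy model in Python. It is not how slapd is implemented. It simply
treats each term's result as a set of entry IDs, intersects them for
AND and unions them for OR, and counts how many terms had to be looked
at. All names are made up for illustration.

```python
def and_candidates(term_results):
    """Intersect ID sets in the given order, stopping once empty.

    Returns (candidate IDs, number of terms evaluated).
    """
    candidates = None
    evaluated = 0
    for ids in term_results:
        evaluated += 1
        candidates = set(ids) if candidates is None else candidates & set(ids)
        if not candidates:
            break  # nothing can match any more, skip the remaining terms
    return (candidates or set()), evaluated


def or_candidates(term_results):
    """Union ID sets; every term has to be evaluated, whatever the order."""
    candidates = set()
    evaluated = 0
    for ids in term_results:
        evaluated += 1
        candidates |= set(ids)
    return candidates, evaluated


no_match = set()
term_a = {1, 2, 3}
term_b = {2, 3, 4}

print(and_candidates([no_match, term_a, term_b]))  # empty term first
print(and_candidates([term_a, term_b, no_match]))  # empty term last
print(or_candidates([no_match, term_a, term_b]))
```

In this model the AND result is the same in any order, but putting the
empty term first means only one term is evaluated instead of three. For
OR, all three terms are evaluated in either order.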

OK, thanks a lot.

Kind regards, Meike

