Re: reqStart<= slow



Michael Ströder wrote:
Howard Chu wrote:
Michael Ströder wrote:
Hi!

I'm trying to retrieve change events from accesslog DB (all with today's
RE24). I tried searching with this filter:

(&(reqDN=cn=Test-Mail-Gruppe 1,dc=example,dc=com)(reqStart<=20120413180000Z))

This turned out to be quite slow, though. reqDN is indexed and there are only
two possible entries. A filter using reqStart>=, even when negated with (!()),
is pretty fast.

Is reqStart indexed?

Yes, eq-indexed of course.

I really wonder why that is.

Obviously, for any database that has been around for even a short while, there
will be far fewer records with a date newer than [today's date] than records
with an older date.

But there were only three entries matching
(reqDN=cn=Test-Mail-Gruppe 1,dc=example,dc=com)
anyway and reqDN is also eq-indexed and finding them with this filter is
pretty fast.

Some more examples where reqDN-index is obviously not used but reqStart-index
should be used in both cases:

Quite fast, although I would have expected a significant slowdown because of
the negation filter:
(&(reqDN:dnSubtreeMatch:=ou=Groups,dc=example,dc=com)(reqStart>=20120313072338Z)(!(reqStart>=20120413075657Z)))

Range lookups are expensive, even when fully indexed, but negations bypass all index lookups, and are simply replaced with (All IDs). Since this is an AND filter, that result is essentially a no-op and costs nothing.
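
To illustrate the point about negation, here is a toy Python model of candidate-set evaluation. It is only a sketch and not slapd's actual code: the names `ALL_IDS`, `candidates_not` and `candidates_and`, and the entry counts, are all made up for the example.

```python
# Toy model of candidate-set evaluation; NOT slapd's real implementation.
# ALL_IDS and the ID ranges below are hypothetical.
ALL_IDS = frozenset(range(1, 1001))  # pretend the database holds 1000 entries


def candidates_not(_inner_filter):
    """A negated term does no index lookup: every entry ID is a candidate."""
    return ALL_IDS


def candidates_and(*term_candidates):
    """An AND filter intersects the candidate sets of its terms."""
    result = ALL_IDS
    for ids in term_candidates:
        result = result & ids
    return result


# Hypothetical candidates produced by the indexed reqStart>= term.
ge_candidates = frozenset(range(900, 1001))

combined = candidates_and(ge_candidates, candidates_not("(reqStart>=...)"))
print(combined == ge_candidates)  # intersecting with All IDs changes nothing
```

In this model the negated term adds no index reads at all; the candidate list is decided entirely by the other terms of the AND.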

Almost identical but very slow compared to the example above:
(&(reqDN:dnSubtreeMatch:=ou=Groups,dc=example,dc=com)(reqStart>=20120313072338Z)(reqStart<=20120413075657Z))

I can't explain this based on index configuration. Maybe there's something
handled differently with <= compared to >=?

They are handled identically; it is simply the difference in the number of records that need to be read.

On an indexed <= lookup, the backend reads the Equality index of the given attribute, starting at the beginning and adding every entryID to the candidate list until it reaches the given timestamp. On an indexed >= lookup, it reads from the specified timestamp to the end of the index.
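
The two scans can be sketched in Python as follows. This is a simplified model under stated assumptions, not the backend's real data structures: the index is a plain sorted list, and the timestamps, entry IDs and function names are invented for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical equality index on reqStart: (timestamp, entryID) pairs kept in
# sorted order. Same-length GeneralizedTime strings sort correctly as text.
index = [(f"201204{day:02d}120000Z", day) for day in range(1, 14)]  # Apr 1..13
keys = [ts for ts, _ in index]


def lookup_le(limit):
    """<= lookup: read from the start of the index up to the limit."""
    end = bisect_right(keys, limit)
    return [entry_id for _, entry_id in index[:end]]


def lookup_ge(limit):
    """>= lookup: read from the limit to the end of the index."""
    start = bisect_left(keys, limit)
    return [entry_id for _, entry_id in index[start:]]


limit = "20120413000000Z"  # "today" in this toy data set
le_ids = lookup_le(limit)
ge_ids = lookup_ge(limit)
print(len(le_ids), len(ge_ids))  # prints: 12 1
```

With a limit near the newest record, the <= scan touches almost the whole index while the >= scan touches only its tail, which is the whole difference in cost.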

Again, obviously when you use [today's date] the >= lookup will be much faster than the <= lookup.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/