
RE: Proxycache Documentation - again , . . . .



>> -----Original Message-----
>> From: owner-openldap-devel@OpenLDAP.org
>> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Pierangelo
>> Masarati
>
>> > An important design consideration was to reduce the cost of query
>> > containment. Proxy cache is designed to improve performance for a
>> > subset of queries which belong to specified templates. Thus
>> > containment is checked between queries of the same template. Apart
>> > from reducing the number of filter and attribute set comparisons, it
>> > greatly simplifies filter containment (at least for conjunctive
>> > queries). This does result in slightly reduced cache answerability.
>>
>> Apurva,
>>
>> I clearly see your point, but Reinhard's consideration
>> is really intuitive.  I'm working on query templating
>> for a totally unrelated issue, but I'll try to get
>> inspiration from your code.  I think the possibility of
>> recognizing sub-contained queries should be investigated.
>>
>> I'll have a look at it.
>
> There are definite tradeoffs to consider here. The actual usefulness of
> such a change is limited, given the constraints on the cacheability
> determination. If you keep the current cacheability configuration, then
> only a small percentage of your queries will keep the cache up to date.
> The information will age out despite the other queries that are
> answerable from it.
>
> If you go to a more free format cacheability decision, (i.e., cache any
> queries that match the filter template and any subset of these
> attributes) then determining answerability becomes much more expensive.
>
> Consider - if you use an attrset with 20 attributes, then somebody must
> make a query requesting all 20 of those attributes before any query data
> enters the cache, and somebody must repeat this query on a regular basis
> to keep the cache fresh (since we don't use syncrepl yet to refresh it).
> All the other queries that only want one or two attributes can only be
> satisfied after the 20-attr query has been done.
>
> If you change the nature of the cache such that querying for any subset
> of the 20 attributes will update the cache, then you have the complex
> job of tracking which individual cached queries can answer a given
> request. You can reduce the number of comparisons by using sorted attr
> sets and other such tricks, but you have to inspect every cached query
> in the database. (You could probably index it if you limit attrsets to
> 32 or less, and use a bitmap to represent the attrs present in each
> query.)

I wouldn't go that far.  If a request is contained in a cached query (its
filter selects a subset of what the cached filter selects) and the
requested attributes are in the cache, I would use the cache to answer
that request, without refreshing it and without further speculation.  A
subset query would benefit from the cache at a limited cost (a filter
template comparison AND a comparison of the filter values) without
affecting subsequent cache performance.  It would be a "parasite" query
with respect to caching, but with limited overhead; it's a pity not to use
local info when it's there.  Of course, the impact of this change on
performance would need to be evaluated.
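
To make the idea concrete, here is a rough sketch in Python.  It is not
the proxycache code: the attribute set, the template string, the
CachedQuery record and the function names are all made up for
illustration, and value containment is reduced to case-sensitive
equality and trailing-wildcard prefixes.  It borrows the bitmap
representation of attribute sets suggested above, and it only reads the
cache; a hit neither refreshes nor extends the cached query.

```python
from dataclasses import dataclass

# Hypothetical cacheable attribute set; list position gives the bit index.
ATTRSET = ["cn", "sn", "mail", "telephoneNumber"]
BIT = {name: 1 << i for i, name in enumerate(ATTRSET)}


def attr_mask(attrs):
    """Encode attribute names as a bitmap; None if any is outside ATTRSET."""
    mask = 0
    for a in attrs:
        if a not in BIT:
            return None
        mask |= BIT[a]
    return mask


@dataclass
class CachedQuery:
    template: str   # filter with values stripped, e.g. "(&(sn=)(mail=))"
    values: tuple   # one assertion value per template slot
    attrs: int      # bitmap of the attributes stored for this query


def value_contained(requested, cached):
    """Simplified containment for one slot: 'smi*' is contained in 'sm*'."""
    if cached.endswith("*"):
        return requested.rstrip("*").startswith(cached[:-1])
    return requested == cached


def answerable(template, values, attrs, cache):
    """Return a cached query that can answer the request, or None."""
    want = attr_mask(attrs)
    if want is None:
        return None
    for q in cache:
        if q.template != template:
            continue                 # compare only within one template
        if want & ~q.attrs:
            continue                 # a requested attribute is not cached
        if all(value_contained(r, c) for r, c in zip(values, q.values)):
            return q                 # "parasite" hit: read-only use
    return None
```

With a cached query for "(&(sn=)(mail=))" with values ("sm*", "*")
holding cn, sn and mail, a request for ("smith", "a*") asking only for
cn would be answered locally, while a request for telephoneNumber, or
one with the broader value "s*", would go to the backend as it does now.
The linear scan over cached queries is the overhead to be measured.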

p.

-- 
Pierangelo Masarati
mailto:pierangelo.masarati@sys-net.it