Re: EntryInfo cache size....



> I understand your point. But I think I need to reiterate - the EntryInfo
> cache is a tree structure, you cannot delete it in arbitrary order the way
> you have done for the Entry cache. Regardless of the order that you request
> to search it, the branch leading from the root to the desired Entry must be
> present. You cannot delete interior nodes of the tree, you can only delete
> leaf nodes. So if your access pattern visits interior nodes first, there is
> no advantage to un-cache them immediately after use, before hitting the next
> node; the nodes would be recreated again as soon as you look for the
> children.
>
> I wonder what kind of memory constraints you envision. For a 1 million entry
> directory the EntryInfo would consume probably 80-100MB. I think it's fair to
> expect that a machine serving a million entries would have at least several
> hundred MB of RAM available.
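
To check my understanding of the leaf-only constraint, here is a toy model
of it. This is only a sketch: Node, TreeCache, lookup and evict are
hypothetical names, not the back-hdb code, and "created" merely counts how
often a node has to be rebuilt.

```python
class Node:
    """One cached node; a hypothetical stand-in for an EntryInfo."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}


class TreeCache:
    def __init__(self):
        self.root = Node("")
        self.created = 0  # number of nodes built (or rebuilt) so far

    def lookup(self, path):
        """Return the node at path, creating every missing node on the branch."""
        node = self.root
        for name in path:
            child = node.children.get(name)
            if child is None:
                child = Node(name, node)
                node.children[name] = child
                self.created += 1
            node = child
        return node

    def evict(self, node):
        """Remove a node from the cache; refuse the root and interior nodes."""
        if node.parent is None or node.children:
            return False
        del node.parent.children[node.name]
        node.parent = None
        return True


cache = TreeCache()
interior = cache.lookup(["com", "example"])
leaf = cache.lookup(["com", "example", "alice"])

refused = cache.evict(interior)    # still has a child, so it must stay
cache.evict(leaf)                  # leaves can go
cache.evict(interior)              # now a leaf itself, so it can go too
cache.lookup(["com", "example", "alice"])  # rebuilds "example" and "alice"
```

Evicting the interior node early buys nothing here: the next lookup of its
child has to recreate it first.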

Because the EntryInfo cache has less impact than the entry cache or the
dbcache, it seems fine for now to support only the entry cache bypass
(there is still some more work to do, though ;) ).

The more important issue seems to be the dbcache. The (virtual) size of the
dbcache is determined when the ENV is opened and cannot be changed
afterwards. Also, it seems impossible to bypass the dbcache, unless there is
an API for this that I'm not aware of (is there?).
The problem, to reiterate, is that a large syncrepl session can pollute the
dbcache and make its working set unnecessarily large.
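
To make the pollution concrete, here is a toy model of it. It assumes a
plain LRU replacement policy, which is a simplification and not a
description of what the dbcache actually does; LRUCache, hot_hits and all
the sizes are made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity page cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)      # mark as most recently used
            self.hits += 1
        else:
            self.misses += 1
            self.pages[page_id] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)   # drop the least recently used


def hot_hits(cache, hot_pages):
    """Read every hot page once and return how many reads were cache hits."""
    before = cache.hits
    for page_id in hot_pages:
        cache.read(page_id)
    return cache.hits - before


cache = LRUCache(capacity=100)
hot = range(50)                      # pages the normal workload keeps touching
hot_hits(cache, hot)                 # warm-up pass: every read is a miss
warm = hot_hits(cache, hot)          # second pass: every read is a hit

for page_id in range(1000, 1200):    # one large scan over 200 cold pages
    cache.read(page_id)

after_scan = hot_hits(cache, hot)    # the hot pages have all been pushed out
```

In this model a single scan larger than the cache leaves none of the hot
pages behind, so the normal workload has to fault them all in again.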

A solution I'm currently considering is to add another database, namely an
id2UUID database, and to add a code path that performs the internal search
on it instead of on the id2entry database.
The id2UUID database would be much smaller (less than 1/10 the size of
id2entry).
Also, the working set coming from syncrepl's accesses to the index databases
or the dn2id database would generally be small.
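
A back-of-the-envelope sketch of why the smaller database should help: the
number of pages a full scan has to pull through the dbcache. The record
sizes and the page size below are assumptions picked for illustration, not
measurements, and leaf_pages is a hypothetical helper that ignores B-tree
overhead and fill factor.

```python
import math


def leaf_pages(records, record_bytes, page_bytes=4096):
    """Pages needed to hold the records, ignoring B-tree overhead."""
    per_page = page_bytes // record_bytes
    return math.ceil(records / per_page)


ENTRIES = 1_000_000          # the 1 million entry directory discussed above
ENTRY_BYTES = 1000           # assumed average size of an id2entry record
UUID_RECORD_BYTES = 4 + 16   # assumed 4-byte ID key plus 16-byte UUID

id2entry_pages = leaf_pages(ENTRIES, ENTRY_BYTES)        # 250000 pages
id2uuid_pages = leaf_pages(ENTRIES, UUID_RECORD_BYTES)   # 4902 pages
```

The exact ratio depends entirely on the real entry sizes; the point is only
that a scan over id2UUID would touch far fewer pages than one over id2entry.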

Comments?

> The previous EntryInfo code was LRU driven but because it is only allowed to
> delete leaf nodes, it really didn't help much. The EntryInfo replacement
> pattern differs from the Entry pattern, and you can't keep them in lockstep
> without breaking the hierarchy, which would defeat the purpose of the
> back-hdb design.
>
> An alternative would be to restructure the way Searches work: instead of
> walking a flat IDL of candidates, process candidates in tree order. I've
> thought about this quite a bit; it may make some other issues easier to
> handle but it makes indexing much harder to use.
>