
IDL caching (ITS#2182)



Full_Name: Jong Hyuk Choi
Version: All
OS: RH 7.3
URL: ftp://ftp.openldap.org/incoming/jongchoi-IDL-021115.diff
Submission from: (NULL) (198.81.209.17)


I am suggesting an IDL cache that stores recently used IDLs
in the bdb_idl_fetch_key() routine.

The rationale for the IDL cache comes from system profiling results
that I collected on Linux using the IBM tprof system profiler.
In the profile, the HAM (hash access method) of Berkeley DB accounted for
a significant portion of DB access, while BAM (btree access method) calls were
eliminated by the directory entry cache.

The first approach I took was to implement a "candidate cache"
in the bdb_search() routine, which stores the recently used candidates of a
search. The candidate cache exists per base object in the DIT and is keyed by
search filter.
It eliminates most HAM calls and substantially improves performance (both
latency and throughput). However, it has some disadvantages:
1) It is not easy to invalidate candidate cache entries efficiently
   when there are updates (add, delete, modify, modrdn) to directory entries.
2) Although there is usually a small number of important base objects in the
   DIT and we can limit the candidate cache size per base object, the total
   amount of memory required for 100% caching is unbounded.
Nevertheless, it successfully improves performance for search-only access
scenarios.
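
To make the invalidation problem concrete, here is a minimal sketch of a
per-base-object candidate cache keyed by filter string. It is not taken from
any patch; the names (cand_cache, cand_lookup, cand_store, cand_flush) and the
fixed sizes are hypothetical. Because a cached candidate list does not record
which entries could match its filter, the only safe reaction to an update in
this sketch is to flush the whole cache for that base object.

```c
#include <assert.h>
#include <string.h>

#define CAND_SLOTS   8   /* hypothetical per-base-object limit */
#define CAND_MAX_IDS 16  /* hypothetical: longer candidate lists are not cached */
#define FILTER_MAX   64  /* hypothetical: longer filters are not cached */

typedef unsigned long ID;

struct cand_entry {
    int    valid;
    char   filter[FILTER_MAX];   /* search filter string used as the key */
    size_t n;
    ID     ids[CAND_MAX_IDS];    /* cached candidate entry IDs */
};

struct cand_cache {              /* one instance per base object */
    struct cand_entry slot[CAND_SLOTS];
    unsigned next;               /* round-robin replacement cursor */
};

/* Return the cached entry for this filter, or NULL on a miss. */
const struct cand_entry *cand_lookup(const struct cand_cache *c,
                                     const char *filter)
{
    int i;
    assert(c != NULL && filter != NULL);
    for (i = 0; i < CAND_SLOTS; i++)
        if (c->slot[i].valid && strcmp(c->slot[i].filter, filter) == 0)
            return &c->slot[i];
    return NULL;
}

/* Store the candidates computed for a filter, replacing round-robin. */
void cand_store(struct cand_cache *c, const char *filter,
                const ID *ids, size_t n)
{
    struct cand_entry *e;
    if (n > CAND_MAX_IDS || strlen(filter) >= FILTER_MAX)
        return;
    e = (struct cand_entry *)cand_lookup(c, filter);
    if (e == NULL)
        e = &c->slot[c->next++ % CAND_SLOTS];
    e->valid = 1;
    strcpy(e->filter, filter);
    memcpy(e->ids, ids, n * sizeof(ID));
    e->n = n;
}

/* Any add/delete/modify/modrdn: drop everything, since we cannot
 * cheaply tell which cached filters the changed entry affects. */
void cand_flush(struct cand_cache *c)
{
    int i;
    for (i = 0; i < CAND_SLOTS; i++)
        c->slot[i].valid = 0;
}
```

With this shape, a steady stream of updates keeps emptying the cache, which
is one way to picture why search-during-update workloads benefit little.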

The second design is to implement an "IDL cache" in the index DB access
routines. The IDL cache (logically) exists per index DB and is indexed by
keystr and db.
The advantages of the IDL cache are:
1) It is very efficient to invalidate an IDL cache entry when there is a change
   in the corresponding entry in the DB. Invalidations are performed in the
   bdb_idl_insert_key() and bdb_idl_delete_key() routines.
2) The amount of memory for 100% caching is bounded.
Performance-wise, the IDL cache improves both search performance and
search-during-update performance, because eliminating HAM fetches reduces
the interference between HAM fetches and updates in a mixed workload of
searches and updates.
With the candidate cache, search-during-update performance was not
significantly improved because of the inefficient invalidation.
However, the IDL cache alone did not improve search performance as much as
the candidate cache. To fill this gap, I will shortly present a simple
slab allocator for IDL stack allocation as a separate ITS.
The combined search performance improvement of IDL caching and the IDL stack
slab is on par with that of candidate caching, and it also improves
search-during-update performance.
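
The idea of the IDL cache can be sketched roughly as follows. This is a
simplified illustration, not the code in the patch: the function names
(idl_cache_get, idl_cache_put, idl_cache_invalidate), the integer db handle,
the direct-mapped layout, and the fixed sizes are all assumptions. It shows
the two properties claimed above: an update to (db, keystr) invalidates
exactly one entry, and the fixed table bounds memory use.

```c
#include <assert.h>
#include <string.h>

#define IDL_CACHE_SLOTS 64  /* hypothetical: fixed slot count bounds memory */
#define IDL_MAX_IDS     16  /* hypothetical: longer IDLs are not cached */
#define IDL_KEY_MAX     32  /* hypothetical: longer keys are not cached */

typedef unsigned long ID;

struct idl_cache_entry {
    int    valid;
    int    db;                 /* stand-in for the index DB handle */
    char   key[IDL_KEY_MAX];   /* stand-in for keystr */
    size_t n;
    ID     ids[IDL_MAX_IDS];
};

static struct idl_cache_entry idl_cache[IDL_CACHE_SLOTS];

/* Map (db, key) to a slot; colliding keys simply evict each other. */
static struct idl_cache_entry *idl_slot(int db, const char *key)
{
    unsigned h = (unsigned)db * 31u;
    while (*key)
        h = h * 33u + (unsigned char)*key++;
    return &idl_cache[h % IDL_CACHE_SLOTS];
}

static int idl_match(const struct idl_cache_entry *e, int db, const char *key)
{
    return e->valid && e->db == db && strcmp(e->key, key) == 0;
}

/* Fetch path: returns 1 on a hit and copies the cached IDL into out,
 * which must have room for IDL_MAX_IDS elements. */
int idl_cache_get(int db, const char *key, ID *out, size_t *n)
{
    const struct idl_cache_entry *e;
    assert(key != NULL && out != NULL && n != NULL);
    e = idl_slot(db, key);
    if (!idl_match(e, db, key))
        return 0;
    memcpy(out, e->ids, e->n * sizeof(ID));
    *n = e->n;
    return 1;
}

/* Called after a miss, once the IDL has been read from the index DB. */
void idl_cache_put(int db, const char *key, const ID *ids, size_t n)
{
    struct idl_cache_entry *e;
    if (n > IDL_MAX_IDS || strlen(key) >= IDL_KEY_MAX)
        return;
    e = idl_slot(db, key);
    e->valid = 1;
    e->db = db;
    strcpy(e->key, key);
    memcpy(e->ids, ids, n * sizeof(ID));
    e->n = n;
}

/* Update path: an insert or delete on (db, key) drops only that entry. */
void idl_cache_invalidate(int db, const char *key)
{
    struct idl_cache_entry *e = idl_slot(db, key);
    if (idl_match(e, db, key))
        e->valid = 0;
}
```

In this sketch, a fetch routine would try idl_cache_get() first and fall back
to the index DB on a miss, while the insert and delete routines would call
idl_cache_invalidate() for the key they modify.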

The patch for the IDL cache is in the incoming directory of ftp.openldap.org.
Your reviews, comments, and suggestions are appreciated.

- Jong

---------------------------------------
Jong Hyuk Choi, Ph. D
Enterprise Linux Group
IBM Thomas J. Watson Research Center
jongchoi@us.ibm.com  jongchoi@OpenLDAP.org