[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: (ITS#3611) Index clustering patch for fast slapadd
jongchoi@us.ibm.com wrote:
>Full_Name: Jong-Hyuk Choi
>Version: HEAD
>OS: SLES9
>URL: ftp://ftp.openldap.org/incoming/fastslapadd1.diff
>Submission from: (NULL) (66.90.4.6)
>
>
>This diff is the first cut of the index clustering patch for faster directory
>population which I discussed with Howard a few weeks ago.
>
>The index clustering is to accelerate slapadd based on the observation that IDs
>are added in an ascending order during slapadd. Instead of writing to the index
>database upon addition of every single ID, slapadd buffers multiple IDs to make
>an ID cluster and writes the cluster as a single unit.
>
>An IDL for an index hash will look like the following:
>
>[Non Range IDL]
>Header consisting of NOID and # of IDs in the IDL
>ID cluster 1 (IDC1_1, IDC1_2, IDC1_3 ..... IDC1_n)
>ID cluster 2 (IDC2_1, IDC2_2, IDC2_3 ..... IDC2_n)
>ID cluster 3
>...
>ID cluster N
>
>[Range IDL]
>0 lo hi
>
>The header and ID clusters are keyed with the same hash value and they are
>sorted in an ascending order (NOID is treated as the smallest in the dup
>comparison function). As a result, the number of writes to the index databases
>can be reduced by the factor of the cluster size. For example, if the cluster
>size is 1024, the number of writes to the index database can be reduced by up to
>1024 and this significantly improves the performance of slapadd.
>
>For non-tool mode addition (ldapadd), if the currently added ID is the greatest
>of all IDs in the IDL, it is added as a single ID stored as the end data for the
>index key. If the currently added ID is not the greatest, then it is added into
>the ID cluster it belongs (where lo <= ID <= hi holds). If the size of the
>cluster becomes larger than the max cluster size, it is split into half. To
>delete an ID from an IDL, the corresponding ID cluster need be located. The
>cluster will be read, modified, and written in order to remove the ID from the
>cluster.
>
>The current code passed all test scripts except test026. I didn't look at the
>test026 case further since HEAD didn't pass it either. I'm currently evaluating
>its performance while optimizing loose ends in terms of performance. I
>appreciate any comments to the patch.
>
Why are you explicitly storing the number of IDs in the IDL, when the
BDB library already maintains that counter itself?
The use of a custom bdb_dup_compare function breaks compatibility with
the BDB db_dump/load utilities. Since I just went to the effort of
eliminating the old uses of these custom functions in order to regain
compatibility, I'd prefer that we don't reintroduce this problem.
Also, your compare function assigns the incoming data directly to
int-sized variables. The BDB documentation explicitly states that this
should not be done as there are no alignment guarantees on the incoming
data. It's a wasted step anyway since I wrote the BDB_DISK2ID macros to
work correctly using unaligned input data, you should just be using
BDB_DISK2ID on the incoming data and not copying to intermediate
variables first.
Overall it seems to me that the net result of this patch is equivalent
to making the current IDL cache a writeback cache with a predefined
flush/checkpoint counter. Why not just do that? It would be simpler to
enable the IDL cache in tool mode, and avoid the large amount of code
duplication present in the current patch.
The majority of the patch is to back-bdb, but there's also a back-ldbm
patch and a glue patch. If the back-bdb patch makes changes that are so
visible outside of back-bdb, then there are probably other backends that
will be affected, which worries me.
--
-- Howard Chu
Chief Architect, Symas Corp. Director, Highland Sun
http://www.symas.com http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support