[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#3611) Index clustering patch for fast slapadd

jongchoi@us.ibm.com wrote:

>Full_Name: Jong-Hyuk Choi
>Version: HEAD
>URL: ftp://ftp.openldap.org/incoming/fastslapadd1.diff
>Submission from: (NULL) (
>This diff is the first cut of the index clustering patch for faster directory
>population which I discussed with Howard a few weeks ago. 
>The index clustering is to accelerate slapadd based on the observation that IDs
>are added in an ascending order during slapadd. Instead of writing to the index
>database upon addition of every single ID, slapadd buffers multiple IDs to make
>an ID cluster and writes the cluster as a single unit.
>An IDL for an index hash will look like the following:
>[Non Range IDL]
>Header consisting of NOID and # of IDs in the IDL
>ID cluster 1 (IDC1_1, IDC1_2, IDC1_3 ..... IDC1_n)
>ID cluster 2 (IDC2_1, IDC2_2, IDC2_3 ..... IDC2_n)
>ID cluster 3
>ID cluster N
>[Range IDL]
>0 lo hi
>The header and ID clusters are keyed with the same hash value and they are
>sorted in an ascending order (NOID is treated as the smallest in the dup
>comparison function). As a result, the number of writes to the index databases
>can be reduced by the factor of the cluster size. For example, if the cluster
>size is 1024, the number of writes to the index database can be reduced by up to
>1024 and this significantly improves the performance of slapadd.
>For non-tool mode addition (ldapadd), if the currently added ID is the greatest
>of all IDs in the IDL, it is added as a single ID stored as the end data for the
>index key. If the currently added ID is not the greatest, then it is added into
>the ID cluster it belongs (where lo <= ID <= hi holds). If the size of the
>cluster becomes larger than the max cluster size, it is split into half. To
>delete an ID from an IDL, the corresponding ID cluster need be located. The
>cluster will be read, modified, and written in order to remove the ID from the
>The current code passed all test scripts except test026. I didn't look at the
>test026 case further since HEAD didn't pass it either. I'm currently evaluating
>its performance while optimizing loose ends in terms of performance. I
>appreciate any comments to the patch.
Why are you explicitly storing the number of IDs in the IDL, when the 
BDB library already maintains that counter itself?

The use of a custom bdb_dup_compare function breaks compatibility with 
the BDB db_dump/load utilities. Since I just went to the effort of 
eliminating the old uses of these custom functions in order to regain 
compatibility, I'd prefer that we don't reintroduce this problem.

Also, your compare function assigns the incoming data directly to 
int-sized variables. The BDB documentation explicitly states that this 
should not be done as there are no alignment guarantees on the incoming 
data. It's a wasted step anyway since I wrote the BDB_DISK2ID macros to 
work correctly using unaligned input data, you should just be using 
BDB_DISK2ID on the incoming data and not copying to intermediate 
variables first.

Overall it seems to me that the net result of this patch is equivalent 
to making the current IDL cache a writeback cache with a predefined 
flush/checkpoint counter. Why not just do that? It would be simpler to 
enable the IDL cache in tool mode, and avoid the large amount of code 
duplication present in the current patch.

The majority of the patch is to back-bdb, but there's also a back-ldbm 
patch and a glue patch. If the back-bdb patch makes changes that are so 
visible outside of back-bdb, then there are probably other backends that 
will be affected, which worries me.

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support