Re: Enhancement to slapindex (ITS#457)



Just a quick note on this issue.  Mark's idea of indexing
multiple attributes in one pass is sound.  Though we will likely
not apply the patch immediately due to development considerations
(the indexing code is being reworked), we will revisit this
later (before the 2.0 release).

	Kurt

At 11:25 PM 2/21/00 GMT, Kurt@OpenLDAP.org wrote:
>At 10:41 PM 2/21/00 GMT, kunkee@neosoft.com wrote:
>>> Full_Name: Mark Adamson
>>> Version: devel
>>> OS: Solaris
>>> URL: http://nil.andrew.cmu.edu/openldap/multislapindex
>>> Submission from: (NULL) (128.2.122.223)
>>> 
>>> 
>>> I was using the "slapindex" program to create new index files for my LDBM
>>> backended database. It took about 1.5 hrs to run, to reindex a single
>>> attribute. I had several attributes that I wanted to reindex, and it
>>> seemed absurd to do this one at a time;
>
>You can call slapindex multiple times.  Wes posted a simple script
>for 1.2.9 which does this.  It will work with slapindex with only
>minor changes.
>
>>> the slapindex program should be able to reindex multiple attributes in a
>>> single run.
>
>This is of insignificant gain.  To get significant gain, you need
>to index attributes in parallel.
>
>>> I made changes to the latest (devel) build of slapindex to allow this, it
>>> was fairly trivial. I ran it, giving it 6 attributes to index at once. It
>>> took under 2 hrs to run, instead of the 9 hrs it would have taken to run
>>> them serially.
>
>As Randy pointed out, this is likely based upon some false assumptions.
>Your code still creates indexes serially.  I doubt your code
>runs significantly faster than six separate processes run serially.
>
>If you want significant improvement, you need to run these
>processes concurrently in an environment which allows for
>some portion of the processes to be executed in parallel (such as
>when one process is waiting for I/O).
>
>>This is a great idea.
>
>Well, parallel index processing is an okay idea.  But
>management and error handling are a bit of a pain.  I would
>rather avoid adding the complexity of parallel index processing
>until we sort out other, more critical 2.0 indexing issues.
>I would rather not add multiple (serial) index processing as
>this would interfere with later development of multiple parallel
>index generation.
>
>Kurt
>
>
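
For illustration only, the suggestion quoted above (run one indexing
process per attribute concurrently, so that one can use the CPU while
another waits on I/O) could be sketched roughly as below.  This is a
sketch under assumptions, not anything from the patch or from the
OpenLDAP tree: the helper names run_concurrently and failed are
hypothetical, the attribute names are examples, and the placeholder
commands stand in for whatever slapindex invocation your build
accepts.  Whether concurrent runs against the same database are safe
is not established in this thread.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start every command at once, then wait for all of them.

    `commands` is a list of argv lists.  Returns the exit codes in the
    same order as `commands`.
    """
    # Popen returns immediately, so every process is started before we wait.
    procs = [subprocess.Popen(argv) for argv in commands]
    # wait() blocks until that process exits and returns its exit code.
    return [proc.wait() for proc in procs]


def failed(commands, exit_codes):
    """Return the commands whose exit code was non-zero."""
    return [argv for argv, code in zip(commands, exit_codes) if code != 0]


if __name__ == "__main__":
    # Hypothetical attribute list; the command is a placeholder, NOT a real
    # slapindex command line.  Substitute the invocation for your build.
    attrs = ["cn", "sn", "uid"]
    commands = [[sys.executable, "-c", "print('indexing %s')" % a]
                for a in attrs]
    codes = run_concurrently(commands)
    for argv in failed(commands, codes):
        print("failed:", " ".join(argv), file=sys.stderr)
```

Collecting the exit codes per command is the minimum of the error
handling mentioned above: a failed run for one attribute is reported
without hiding the results of the others.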