Issue 457 - Enhancement to slapindex
Summary: Enhancement to slapindex
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2000-02-18 21:55 UTC by Mark Adamson
Modified: 2014-08-01 21:05 UTC

Description Mark Adamson 2000-02-18 21:55:07 UTC
Full_Name: Mark Adamson
Version: devel
OS: Solaris
URL: http://nil.andrew.cmu.edu/openldap/multislapindex
Submission from: (NULL) (128.2.122.223)


I was using the "slapindex" program to create new index files for my LDBM
backended database. It took about 1.5 hrs to run, to reindex a single
attribute. I had several attributes that I wanted to reindex, and it seemed
absurd to do this one at a time; the slapindex program should be able to
reindex multiple attributes in a single run.

I made changes to the latest (devel) build of slapindex to allow this; it was
fairly trivial. I ran it, giving it 6 attributes to index at once. It took
under 2 hrs to run, instead of the 9 hrs it would have taken to run them
serially.

As per the patch submission guidelines, a "diff -luN" of the changes is
available at

	http://nil.andrew.cmu.edu/openldap/multislapindex


Share & Enjoy.


-Mark Adamson
 Carnegie Mellon University
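
The single-run idea in the description above can be illustrated with a small sketch. This is not the submitted patch and not slapindex's code: the in-memory `ENTRIES` list, the attribute names, and both function names are hypothetical stand-ins, and lowercasing is only a placeholder for real value normalization. It shows the difference between scanning the database once per attribute and scanning it once for all requested attributes.

```python
from collections import defaultdict

# Hypothetical stand-in for the database: (entry_id, attributes) pairs.
ENTRIES = [
    (1, {"cn": ["Alice Example"], "mail": ["alice@example.org"], "ou": ["People"]}),
    (2, {"cn": ["Bob Example"], "ou": ["People"]}),
    (3, {"cn": ["Carol Example"], "mail": ["carol@example.org"]}),
]


def index_one_attribute(entries, attr):
    """One full scan of the entries, building the index for a single attribute."""
    index = defaultdict(list)
    for entry_id, attrs in entries:
        for value in attrs.get(attr, []):
            index[value.lower()].append(entry_id)
    return dict(index)


def index_many_attributes(entries, attr_names):
    """One full scan of the entries, building an index for every requested attribute."""
    indexes = {attr: defaultdict(list) for attr in attr_names}
    for entry_id, attrs in entries:
        # Each entry is read once; all requested indexes are updated from it.
        for attr in attr_names:
            for value in attrs.get(attr, []):
                indexes[attr][value.lower()].append(entry_id)
    return {attr: dict(index) for attr, index in indexes.items()}
```

Both functions produce the same index for a given attribute; the second reads each entry once instead of once per attribute. The 9 hr figure in the description is the estimate 6 x 1.5 hrs, which assumes every attribute costs as much to index as the first one did.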

Comment 1 Kurt Zeilenga 2000-02-19 09:45:39 UTC
moved from Incoming to Software Enhancements
Comment 2 kunkee@openldap.org 2000-02-21 22:46:16 UTC
This is a great idea, but I think you may be incorrect in your assumption
(unless you actually timed it) that indexing all 6 attributes separately
would take 6x the time of the first indexing.  Substring (including dn)
indexing takes far longer than exact or existence indexing.

Randy Kunkee
NeoSoft Inc.
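
A rough way to see why substring indexing costs more than exact or existence indexing is to count the index keys each type generates per value. The sketch below is an illustration only: the three function names are hypothetical, and the fixed-length n-gram scheme is an assumption chosen for simplicity, not a description of how slapd actually builds substring keys.

```python
def presence_keys(value):
    """Existence index: a single marker key, whatever the value is."""
    return {"*"}


def equality_keys(value):
    """Exact index: one key per value (lowercased here as placeholder normalization)."""
    return {value.lower()}


def substring_keys(value, n=3):
    """Substring index, assuming one key per distinct n-character window."""
    v = value.lower()
    if len(v) < n:
        return {v}
    return {v[i:i + n] for i in range(len(v) - n + 1)}
```

Under these assumptions the value "adamson" yields one presence key, one equality key, and five substring keys, and the substring count grows with the length of the value. An attribute indexed for substrings can therefore take noticeably longer than one indexed only for equality or presence.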
Comment 3 Kurt Zeilenga 2000-02-21 23:25:50 UTC
At 10:41 PM 2/21/00 GMT, kunkee@neosoft.com wrote:
>> I was using the "slapindex" program to create new index files for my LDBM
>> backended database. It took about 1.5 hrs to run, to reindex a single
>> attribute. I had several attributes that I wanted to reindex, and it seemed
>> absurd to do this one at a time;

You can call slapindex multiple times.  Wes posted a simple script
for 1.2.9 which does this.  It will work with slapindex with only
minor changes.

>> the slapindex program should be able to reindex multiple attributes in a
>> single run.

This is of insignificant gain.  To get significant gain, you need
to index attributes in parallel.

>> I made changes to the latest (devel) build of slapindex to allow this, it
>> was fairly trivial. I ran it, giving it 6 attributes to index at once. It
>> took under 2 hrs to run, instead of the 9 hrs it would have taken to run
>> them serially.

As Randy pointed out, this is likely based upon some false assumptions.
Your code still creates indexes serially; I doubt your code
runs significantly faster than six separate processes run serially.

If you want significant improvement, you need to run these
processes concurrently in an environment which allows
some portion of the processes to be executed in parallel (such as
when one process is waiting for I/O).

>This is a great idea.

Well, parallel index processing is an okay idea.  But
management and error handling are a bit of a pain.  I would
rather avoid adding the complexity of parallel index processing
until we sort out other, more critical 2.0 indexing issues.
I would rather not add multiple (serial) index processing, as
this would interfere with later development of multiple parallel
index generation.

Kurt
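
The concurrent approach described in the comment above, along with the management and error handling it calls for, might look roughly like the following. This is a generic sketch: `run_concurrently` and `failed` are hypothetical helper names, and the commands passed in are left entirely to the caller, since the exact slapindex invocation for a given release is not stated in this thread. Whether several index runs can safely write to the same database at once is also an open assumption.

```python
import subprocess
import sys


def run_concurrently(commands):
    """Start every command at once, then wait for all of them to finish.

    Each command is an argument list as accepted by subprocess.Popen.
    Returns (command, exit_code) pairs in the order the commands were given.
    """
    procs = [(cmd, subprocess.Popen(cmd)) for cmd in commands]
    return [(cmd, proc.wait()) for cmd, proc in procs]


def failed(results):
    """Return the commands whose exit code was non-zero."""
    return [cmd for cmd, code in results if code != 0]
```

A caller would build one command per attribute, pass the list to `run_concurrently`, and check `failed` on the result before trusting the new indexes. Because all processes are started before any is waited on, one job can use the CPU while another is blocked on I/O.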
Comment 4 Kurt Zeilenga 2000-02-22 16:52:25 UTC
Just a quick note on this issue.  Mark's idea of indexing
multiple attributes in one pass is sound.  Though we will likely
not apply the patch immediately due to development considerations
(the indexing code is being reworked), we will revisit this
later (before the 2.0 release).

	Kurt

Comment 5 Kurt Zeilenga 2000-03-10 15:44:02 UTC
changed notes
changed state Open to Suspended
Comment 6 Kurt Zeilenga 2000-05-25 16:30:42 UTC
moved from Software Enhancements to Development
Comment 7 Kurt Zeilenga 2000-07-18 17:52:46 UTC
The new slapindex will regenerate all indices for all attributes
of all entries within the database.  Your contribution, though
appreciated, is no longer relevant.  (Though new patches providing
selective indexing might be.)

Again, thanks.
  Kurt
Comment 8 Kurt Zeilenga 2000-07-18 17:53:04 UTC
changed state Suspended to Closed
Comment 10 OpenLDAP project 2014-08-01 21:05:27 UTC
Pending index changes