
Re: back-bdb IDL limitations



Jonghyuk Choi wrote:


During the performance evaluation and testing of LDAP component matching in OpenLDAP, we developed an LDIF split program written in C. It was a lot faster than doing the split by scripting.

It would probably be good just to add an option to slapadd to start at the Nth entry; there's no reason to require another tool.


For the second point, isn't it possible to determine the DN of the last committed entry in a consistent database by retrieving the last ID and the corresponding entry from the id2entry database?
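For illustration only, this is roughly how the highest ID could be read back with Berkeley DB's cursor API, assuming the id2entry keys sort by entry ID; the fragment is not compiled here and the variable names are mine, not back-bdb's:

```c
/* Illustrative fragment, assuming an already-open DB *id2entry
 * handle whose keys are entry IDs in sorted order. */
DBC *cursor;
DBT key = {0}, data = {0};

id2entry->cursor(id2entry, NULL, &cursor, 0);
if (cursor->c_get(cursor, &key, &data, DB_LAST) == 0) {
    /* key now holds the last (highest) entry ID; data holds the
     * corresponding entry, from which the DN can be decoded. */
}
cursor->c_close(cursor);
```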

Sure, when you know that it's consistent. With transactions turned off, there's no way to know if the dn2id and other attribute indices are consistent. I suppose that can be fixed by running slapindex, but slapindex has its own problems.


By "incremental slapadd", I tried not to rule out a situation where the database is populated incrementally, for example day by day, using slapadd. If there is a system failure during the current slapadd run, transactions make it possible to avoid repopulating the portion of the directory that was loaded in previous runs. It may not be the common-case scenario, but it shows that it is desirable to maintain the transactional operation of slapadd, and I guess that's why the new quick slaptools mode is activated by a command-line option.

OK, now I understand. Yes...


- Jong-Hyuk

------------------------
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com
(phone) 914-945-3979    (fax) 914-945-4425   TL: 862-3979




*Howard Chu <hyc@symas.com>*

03/04/2005 01:13 PM

To: Jonghyuk Choi/Watson/IBM@IBMUS
cc: David Boreham <david_list@boreham.org>, openldap-devel@OpenLDAP.org
Subject: Re: back-bdb IDL limitations





We've tested ranges from 2 million to 20 million entries, and incremental adds seem rather inconvenient here. First, we would need to add a flag to slapadd telling it to skip the first N entries and start adding from some offset into the input LDIF, because manually splitting such a large input LDIF file is a pain. Second, we would need feedback from slapadd telling you exactly where it was in the input when it failed (the entry number may be enough; a byte offset would probably be better), but of course, if the add is interrupted because the entire machine crashed, you may never see that feedback. I guess one question to answer is what kind of failures you are protecting against, and what it costs either way. For the most part, it is simpler to just wipe out the database and start over.

Jonghyuk Choi wrote:

>
> Is it because the recovery overhead can be huge? Even so,
> transactions seem to be indispensable in situations like incremental
> slapadd.
> - Jong-Hyuk
>
> ------------------------
> Jong Hyuk Choi
> IBM Thomas J. Watson Research Center - Enterprise Linux Group
> P. O. Box 218, Yorktown Heights, NY 10598
> email: jongchoi@us.ibm.com
> (phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
>
>
>
>
> *"David Boreham" <david_list@boreham.org>*
> Sent by: owner-openldap-devel@OpenLDAP.org
>
> 03/04/2005 12:06 PM
>
> To: <openldap-devel@OpenLDAP.org>
> cc:
> Subject: Re: back-bdb IDL limitations
>
>
>
>
> > 2) there should be non-zero need for transaction protected operations
> > of slap tools to cope with a system failure during a lengthy directory
> > population.
>
> I think you will find that it's almost always faster to simply disable
> transactions and re-start the slapadd in the event of a failure.




--
 -- Howard Chu
 Chief Architect, Symas Corp.       Director, Highland Sun
 http://www.symas.com               http://highlandsun.com/hyc
 Symas: Premier OpenSource Development and Support



