
Faster updates

Hello, List!

Asking for some advice on a strategy for speeding up batch updates to a directory.

If you think my whole approach is "challenged" please feel free to provide advice on that subject, too.

But first, a request: I'm only a very amateur techie, so be gentle.


I'm running an OpenLDAP directory (installed by someone else) that's accessed via a Web page, so the LDAP client is the app server, which is Tomcat. I'm programming in Java, using the JNDI API.

The basic requirement is to maintain a directory fed by batch (FTP) updates from some 20 completely different organizations. The data typically originate in Exchange, Notes or various DBMSes. To minimize requirements for these orgs, we've set a policy that each sends a complete replacement file for its org, in LDIF format. Most people seem to be able to generate these from their systems without programming. We ask only for PERSON entries, so we're responsible for building appropriate parents (and removing sub-orgs that become empty, e.g. via name changes or re-organization). We're also responsible for determining what's an add, delete or modify.

Given that, I've coded an update program that:
(1) builds a "temporg" subtree in my directory from each new data file (including required parents.)
(2) for each entry in this tree, looks for a matching entry in the "permanent" tree. If not found, then add entry (and parent(s) if required.)
(3) If found, but attributes don't match (!oldBasicAttributes.equals(newBasicAttributes)), replace old attributes.
(4) Then, recursing through the permanent subtree for the updating org, look for a matching entry in the temp tree. If no match, then delete the entry (children first, taking advantage of the recursive approach of this method.)
(5) Finally, delete the temp subtree.
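
For illustration only, the add/modify/delete decision in steps (2) to (4) can also be sketched as an in-memory comparison, without touching the directory. This is a minimal sketch, not the program described above: the class BatchDiff, its Plan type and the diff method are hypothetical names, and it assumes both sets of entries have already been read into maps keyed by a normalized DN (consistent case, no escaped commas), with each entry's attributes held as a map from attribute name to a set of values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper: decides adds, modifies and deletes by comparing
// two in-memory snapshots. No LDAP calls are made here.
final class BatchDiff {

    /** DNs grouped by the operation they would need. */
    static final class Plan {
        final List<String> adds = new ArrayList<>();
        final List<String> modifies = new ArrayList<>();
        final List<String> deletes = new ArrayList<>();
    }

    /**
     * current  = entries now in the permanent subtree for one org
     * incoming = entries from the replacement file (parents included)
     * Keys are assumed to be normalized DNs (an assumption of this sketch).
     */
    static Plan diff(Map<String, Map<String, Set<String>>> current,
                     Map<String, Map<String, Set<String>>> incoming) {
        Plan plan = new Plan();

        // Steps (2) and (3): walk the incoming entries in a stable order.
        for (Map.Entry<String, Map<String, Set<String>>> e
                : new TreeMap<>(incoming).entrySet()) {
            Map<String, Set<String>> old = current.get(e.getKey());
            if (old == null) {
                plan.adds.add(e.getKey());          // not in permanent tree
            } else if (!old.equals(e.getValue())) {
                plan.modifies.add(e.getKey());      // attributes differ
            }
        }

        // Step (4): anything in the permanent tree but not in the file.
        for (String dn : new TreeMap<>(current).keySet()) {
            if (!incoming.containsKey(dn)) {
                plan.deletes.add(dn);
            }
        }

        // Parents must exist before children are added (shallow first),
        // and children must go before parents are deleted (deep first).
        plan.adds.sort((a, b) -> Integer.compare(depth(a), depth(b)));
        plan.deletes.sort((a, b) -> Integer.compare(depth(b), depth(a)));
        return plan;
    }

    // Counts RDN components; assumes no escaped commas in the DN.
    static int depth(String dn) {
        return dn.split(",").length;
    }
}
```

The resulting lists could then drive the existing JNDI add, replace and delete calls. Whether this would actually be faster depends on how cheaply the permanent subtree can be read in one pass, which is not something this sketch measures.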

Works great (mostly <g>), but takes approximately forever.

So, I was thinking, maybe I could maintain a separate database for the temp subtree, and then (a) just delete it to save time on step (5); (b) take it offline and batch build it with slapadd (with some mods to my program, to be sure!), or (c) both.

Also, can I assume that slapadd is LOTS faster than using the JNDI API? I guess I'd give up the relatively flexible error-handling I have now, right? Entries that failed for any reason would just fail and have to be actually looked at.

Anyhow: your advice appreciated.



PS--Why not just do a complete rebuild for every update rather than check for A/D/M? Good question. It would certainly be faster. The only reason is to preserve the option to have self-updateable attributes in the database (favorite drink, as opposed to salary.)