
Re: updating a tree from a full dataset



Bjorn Nordbo wrote...:
> Daniel Tiefnig wrote:
>> Bjorn Nordbo wrote...:
>> 
>> > I have a program which reads a bunch of router configs, extracts
>> > a lot of information and writes it to a file database every
>> > hour. To make the world a better place to live, I'd like to
>> > store this information in an LDAP directory instead. However,
>>
> The characteristics are:
> 
> - add/modify/delete every hour (approx 50 ops/day)

yes, that seems ok to me. we're getting up to about 2000 modifications 
per day and approx. 10M reads without any problem. update interval is 
30 minutes in our case, 700k entries, ~10 attributes. running on two sp2 
silver nodes, though.. :o) (50% idle)

> - applications reading data    (approx 1M ops/day)
> - entries in the database this year:   30.000
> - entries in the database next year: ~160.000
> - attributes per entry: 13
> 
> (I was completely off on my previous estimate of the number of
> entries) 
> 
>> > The approach which first comes to mind is to retrieve the entire
>> > branch from LDAP, compare this to the data recently extracted
>> > from the router configs and write only changed entries back to
>> > LDAP. 
>> 
>> you may retrieve entry by entry from LDAP and use ldapmodify to
>> write back attributes/values that changed.. improves update speed 
>> significantly if there aren't many changes in the router-configs,
>> and even a bit (or bunch) if there are.. 
> 
> Mm, I like that a lot better (programming wise) than my previous
> retrieve-everything-then-compare approach. But how will this
> improve the performance?

mmm, first, when you're just updating one attribute of an entry, this 
will be faster than deleting the whole entry and re-adding it. besides, 
assuming you're indexing one or another attribute, you may not have to 
update all the indexes if you're just modifying one (maybe not even 
indexed) attribute.
also note that retrieving everything from LDAP at once will be faster 
than doing several "single" searches. (but i think that'll make your 
script ugly :o)
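
to make the "write back only what changed" idea a bit more concrete, here is
a rough python sketch. it is not taken from any real script: the function
names (diff_entry, to_ldif, diff_tree) are made up for illustration, and it
assumes both the data fetched from LDAP and the freshly extracted router
data are already in memory as plain dicts of the form {dn: {attr: [values]}}.
it does no LDAP calls itself, it only prints LDIF that could be fed to
ldapmodify. it also assumes DNs and attribute names are spelled identically
on both sides and that values are plain ASCII (no base64 handling).

```python
def diff_entry(old, new):
    """Return a list of (op, attr, values) changes turning old into new.

    old and new map attribute names to lists of string values.
    """
    changes = []
    for attr in sorted(set(old) | set(new)):
        old_vals = sorted(old.get(attr, []))
        new_vals = sorted(new.get(attr, []))
        if old_vals == new_vals:
            continue  # untouched attribute: nothing to write back
        if not new_vals:
            changes.append(("delete", attr, []))
        elif not old_vals:
            changes.append(("add", attr, new_vals))
        else:
            changes.append(("replace", attr, new_vals))
    return changes


def to_ldif(dn, changes):
    """Format the changes for one entry as an LDIF modify record."""
    lines = ["dn: " + dn, "changetype: modify"]
    for op, attr, values in changes:
        lines.append(op + ": " + attr)
        lines.extend(attr + ": " + v for v in values)
        lines.append("-")
    return "\n".join(lines) + "\n"


def diff_tree(ldap_entries, fresh_entries):
    """Compare two {dn: {attr: [values]}} dicts.

    Returns (to_add, to_delete, to_modify), where to_modify maps
    each changed DN to its list of attribute changes.
    """
    to_add = sorted(set(fresh_entries) - set(ldap_entries))
    to_delete = sorted(set(ldap_entries) - set(fresh_entries))
    to_modify = {}
    for dn in set(ldap_entries) & set(fresh_entries):
        changes = diff_entry(ldap_entries[dn], fresh_entries[dn])
        if changes:
            to_modify[dn] = changes
    return to_add, to_delete, to_modify


if __name__ == "__main__":
    # hypothetical sample data, not from any real directory
    in_ldap = {"cn=r1,dc=example,dc=com": {"cn": ["r1"], "description": ["old"]}}
    fresh = {"cn=r1,dc=example,dc=com": {"cn": ["r1"], "description": ["new"]}}
    _, _, modified = diff_tree(in_ldap, fresh)
    for dn, changes in modified.items():
        print(to_ldif(dn, changes))
```

entries whose attributes all match produce no output at all, so an hourly
run with few config changes only sends a handful of small modify records.
the same diff_entry function works whether you fetch the branch in one
search or entry by entry.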

hth,
daniel