
Re: need suggestion on indexing big directory





--On Monday, June 07, 2004 10:02 PM +0200 vadim tarassov <vadim.tarassov@swissonline.ch> wrote:

Hi Quanah,

Keeping in mind your suggestions and the documentation you've provided, I
successfully added all my data. Thanks a lot indeed.

You know what I've found in addition? I am afraid there is no common
understanding of what hardware one should use to deploy a directory, or
at least I did not have any. Funny, isn't it? What I mean is that
populating my Berkeley DB backend on a Sun 480 connected to an ESS
took about 1 hour, whereas the same operation on a Sun 100 with an
ordinary hard disk takes about 50 hours.

It could be helpful for others if we collected some simple data to
show people what performance they should expect when choosing this or
that hardware.

In my case, as I have mentioned already, it was 80,000 entries with ca. 50
attributes, all of them indexed: most of them pres,eq,sub, a few of them
pres,eq,sub,approx, and the rest just pres,eq or even only pres.
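
In slapd.conf terms the setup looks roughly like the following (the
attribute names here are only examples, not my real schema):

    # illustrative index directives in slapd.conf
    index   objectClass          eq
    index   cn,sn,mail           pres,eq,sub
    index   givenName            pres,eq,sub,approx
    index   telephoneNumber      pres,eq
    index   createTimestamp      pres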

Thanks a lot again and best regards, vadim tarassov.

Our 300-330k-entry DB takes about 2 hours to index 67 or so attributes. The servers are Sun Fire V120s (650 MHz, single CPU) with 4 GB of RAM (2 GB memory cache for BDB). The things that impact a load from a slapcat dump the most are log file separation (although that isn't an issue with BDB 4.2.52, since you can turn log file generation off while doing a load) and the size of the memory cache you allocate.
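
For reference, both of those knobs live in the DB_CONFIG file in the
database directory. A minimal sketch for a 2 GB cache, with the
transaction logs pushed onto a separate disk (the log path here is made
up):

    # DB_CONFIG in the BDB database directory
    # 2 GB cache in one region: gbytes bytes ncache
    set_cachesize 2 0 1
    # keep the transaction logs on a separate spindle
    set_lg_dir /logs/bdb
    # skip the per-commit log flush while bulk loading
    set_flags DB_TXN_NOSYNC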


I'll note that the same load on a 2-CPU 2.4 GHz Debian Linux system took all of 15 minutes (with a 1.5 GB memory cache for BDB).
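
The load itself is nothing fancy; with OpenLDAP 2.2 you can also give
slapadd the -q flag to skip the consistency checks during a bulk load
(the file names below are made up):

    slapcat -f slapd.conf -l export.ldif
    slapadd -q -f slapd.conf -l export.ldif

Just remember to turn any durability settings you relaxed back on
before putting the server into service.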

So there's a lot to be said for CPU power. ;)

--Quanah

--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html