
Re: indexing & hardware questions -- again



>Hi, I sent these questions in last week, but haven't received any
>responses. I would appreciate hearing your thoughts about any of them.
>We have to buy the LDAP servers soon, and you understand that I can't
>afford to choose the wrong machines.
>Could somebody explain what information is put in the index files when
>substring indexes are generated? Or, more generally, what OpenLDAP's
>indexing strategy is?

I don't know the answer to that one.

>I'm wondering whether adding substring indices actually hurts the
>performance of other indices (aside from the usual costs associated
>with maintaining more information).

I've tested this and found no noticeable effect.  A search only seems to use
the index that matches its filter.
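
To illustrate why that could be the case, here is a toy sketch in Python.  It is
not OpenLDAP's code: the anchored-trigram scheme and the names build_indexes,
eq_search and sub_search are assumptions made up for this example.  Each index
type lives in its own table, so an equality lookup never reads the substring
table, while the substring table holds many more keys per value.

```python
from collections import defaultdict


def trigrams(value):
    """Return the 3-character substrings of value, anchored with ^ and $."""
    padded = "^" + value.lower() + "$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def build_indexes(entries, with_sub=True):
    """Build separate equality and substring tables (id -> value mapping in)."""
    eq_index = defaultdict(set)
    sub_index = defaultdict(set)
    for entry_id, value in entries.items():
        eq_index[value.lower()].add(entry_id)
        if with_sub:
            for gram in trigrams(value):
                sub_index[gram].add(entry_id)
    return eq_index, sub_index


def eq_search(eq_index, value):
    """Equality filter: consults the equality table only."""
    return set(eq_index.get(value.lower(), set()))


def sub_search(sub_index, entries, fragment):
    """Substring filter: intersect trigram candidates, then verify each one."""
    fragment = fragment.lower()
    grams = [fragment[i:i + 3] for i in range(len(fragment) - 2)]
    if grams:
        candidates = set.intersection(*(sub_index.get(g, set()) for g in grams))
    else:
        candidates = set(entries)  # fragment too short for the index: scan all
    return {e for e in candidates if fragment in entries[e].lower()}


# Hypothetical sample data.
entries = {1: "Barbara Jensen", 2: "Bjorn Jensen", 3: "Ann Smith"}
eq_index, sub_index = build_indexes(entries)
eq_only, _ = build_indexes(entries, with_sub=False)

print(eq_search(eq_index, "Bjorn Jensen"))
print(sorted(sub_search(sub_index, entries, "jens")))
print(len(eq_index), len(sub_index))
```

In this model the equality table is identical whether or not the substring
table is built, which matches what I saw in testing.  The extra cost shows up
only as a larger index and more work on each update.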

>I'm also concerned about the sheer size of the index files -- generating
>"sub" indices increases the size of the index files by an order of
>magnitude. My directory is quite large -- about 1,000,000 entries --
>so I have to make sure that the machine running LDAP can handle the
>memory requirements, which are in part determined by options like
>dbcachesize.
>Does anybody have some rough metrics for memory usage, e.g. with X
>number of entries, Y number of attributes per entry, Z indexes on
>these attributes, etc?

What is the "average" object size?  Are you storing binary information (photos,
etc.) in the directory?  I think the choice of backend is the most important
factor here, along with having sufficient resources.

>One option that I'm considering is to maintain different indices on
>different slave directories. This way, performance on the server used
>by my web application can be optimized for the search filters used by the
>web application, and performance on the server used by the service and
>support staff can degrade a bit more...
>Has anybody else tried this? Or do you have a better suggestion?

I haven't tried this.  Sounds like an interesting idea.

>Finally, is slurpd actually more efficient at handling modify
>operations? While a directory is handling a modify operation from a
>client, its performance is pretty bad. So, by "efficient" I mean, do
>the slaves take significantly LESS of a performance hit if the slurpd
>process sends modify operations than if a slave handles client
>requests directly?

slurpd doesn't perform modify operations itself.  As I understand it, it
forwards the modify operations made on the master slapd to the slave slapd, so
the slave slapd is still the one performing each modify op.
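
A toy model of that in Python, in case it helps.  This is only a sketch of my
understanding, not OpenLDAP code: the Slapd class, its replog list and the
slurpd_pass function are made-up names, and the DN and attribute values are
hypothetical.

```python
class Slapd:
    """Toy directory server: it applies every modify itself."""

    def __init__(self, name):
        self.name = name
        self.entries = {}
        self.replog = []           # change log, only used on the master
        self.modifies_applied = 0

    def modify(self, dn, attrs, log=False):
        """Apply a modify locally; optionally record it for replication."""
        self.entries.setdefault(dn, {}).update(attrs)
        self.modifies_applied += 1
        if log:
            self.replog.append((dn, dict(attrs)))


def slurpd_pass(master, slaves):
    """Forward each logged change to every slave, then clear the log."""
    forwarded = 0
    for dn, attrs in master.replog:
        for slave in slaves:
            slave.modify(dn, attrs)   # the slave does the actual work
            forwarded += 1
    master.replog.clear()
    return forwarded


master = Slapd("master")
slave = Slapd("slave")
dn = "cn=Barbara Jensen,o=Example"
master.modify(dn, {"mail": "bjensen@example.com"}, log=True)
master.modify(dn, {"telephoneNumber": "555-0100"}, log=True)

forwarded = slurpd_pass(master, [slave])
print(forwarded, slave.modifies_applied, slave.entries == master.entries)
```

In this model the slave applies exactly as many modifies as it would if the
clients had written to it directly, so I wouldn't expect the per-operation
cost on the slave to be lower just because slurpd is the sender.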

Systems and Network Administrator
Morrison Industries
1825 Monroe Ave NW.
Grand Rapids, MI. 49505