
Re: performance issue behind a load balancer 2.3.32

This is a large production environment (several hundred servers,
thousands of requests per minute), and the F5 LB is used to balance the
load and to take a node out of service for maintenance or any other
reason. With RR DNS, if a server is slow (for whatever reason:
backups, etc.) nothing compensates; the F5 notices that and adjusts the
connection distribution as needed, which RR DNS can't do.  As far as
indexes go, the environment had been performing extremely well until
recently, after a few hundred thousand more users were added along with
significantly higher activity, at which point we began seeing issues
behind the load balancers during peak times of day.  The LB vendor says
the issue is with OpenLDAP, and those settings,
conn_max_pending/conn_max_pending_auth, were the only ones that seemed
to stick out, though the documentation on them is rather ambiguous.
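For reference, a minimal slapd.conf sketch of the two directives in
question; the raised values below are illustrative assumptions, not
tested recommendations (the 2.3 defaults are 100 and 1000):

```
# slapd.conf (global section)
# Max pending requests per anonymous connection (default 100).
# Behind an LB, all traffic appears to come from a handful of
# source IPs, so per-connection limits are reached sooner.
conn_max_pending        512
# Same limit for authenticated connections (default 1000).
conn_max_pending_auth   4096
```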

 -- David J. Andruczyk

----- Original Message ----
From: Sean O'Malley <omalleys@msu.edu>
To: David J. Andruczyk <djandruczyk@yahoo.com>
Cc: openldap-software@openldap.org
Sent: Tuesday, July 21, 2009 2:24:58 PM
Subject: Re: performance issue behind a load balancer 2.3.32

Why bother with the load balancer? I am curious; I am sure there is a
reason, but it isn't making a lot of sense to me. You can either do round
robin DNS, or just pass out the 3 read servers' addresses to the clients
for failover (and change the order for real poor man's load balancing).

conn_max_pending is what I had to adjust to up the connections, but I
suspect you may have indexing issues, returning too many responses, etc.
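If unindexed searches are a suspect, the usual fix is explicit index
directives in slapd.conf for the attributes the clients filter on,
followed by a rebuild with slapindex. The attribute list below is a
guess, not taken from this thread; adjust it to your actual filters:

```
# slapd.conf (in the bdb database section)
# Index the attributes your clients actually search on.
index   objectClass     eq
index   uid,cn,mail     eq,sub

# After adding indexes, stop slapd and rebuild them:
#   slapindex -f /etc/openldap/slapd.conf
```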

On Tue, 21 Jul 2009, David J. Andruczyk wrote:

> I work at a place with a fairly large OpenLDAP setup (2.3.32).  We
> have 3 large read servers: Dell R900, 32 GB RAM, 2x quad-core, hw RAID1
> disks for the LDAP volume. The entire database takes about 13 GB of
> physical disk space (the BDB files) and has a few million entries.
> DB_CONFIG is configured to hold the entire DB in memory (for speed),
> and slapd.conf cachesize is set to a million entries, to make the most
> effective use of the 32 GB of RAM this box has.  We have them behind an
> F5 BigIP hardware load balancer (6400 series), and find that during
> peak times of the day we get "connection deferred: binding" in our
> slapd logs (loglevel set to "none", a misnomer), and a client request
> (or series of them) fails.  If we use round robin DNS instead, we
> rarely see those errors.  CPU usage is low, even during peak times,
> hovering at 20-50% of 1 core (the other 7 are idle).
> The interesting thing is that it seems to happen (the "connection
> deferred: binding") only after a certain load threshold is reached
> (busiest time of day), and only when behind the F5s.  I suspect it
> might be the "conn_max_pending" or "conn_max_pending_auth" defaults
> (100 and 1000 respectively), since behind the F5 all the connections
> appear to come from the F5 addresses, vs. RR DNS where they come from
> a wide range of sources (each of the servers, well over 100).
> We had tried experimenting with a higher number of threads previously,
> but that didn't seem to have a positive effect.  Can any OpenLDAP gurus
> suggest some things to set/look for, i.e. a higher number of threads,
> higher defaults for conn_max_pending, conn_max_pending_auth?
> Any ideas on what the theoretical performance limit of a machine of
> this caliber should be? i.e. how many reqs/sec, how far will it scale,
> etc.
> We have plans to upgrade to 2.4, but it's a "down the road" item, and
> mgmt is demanding answers to "how far can this design scale as it is"...
> Thanks!
> -- David J. Andruczyk
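[For completeness: "DB_CONFIG configured to have the entire DB in
memory" typically means a BDB cache at least as large as the database
files. A sketch for a ~13 GB database follows; the exact sizes are
assumptions derived from the figures quoted above, not the poster's
actual settings:]

```
# DB_CONFIG (in the BDB database directory)
# set_cachesize <gbytes> <bytes> <ncaches>
# ~16 GB cache so the whole ~13 GB database fits in memory.
set_cachesize 16 0 2
# Larger log buffer and lock tables help under heavy write/load
# (illustrative values, not tuned for this workload).
set_lg_bsize 2097152
set_lk_max_locks   3000
set_lk_max_objects 3000
```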

  Sean O'Malley, Information Technologist
  Michigan State University