
Re: FD_SETSIZE problems on a high-end server



Kurt D. Zeilenga wanted us to know:

>This question comes up enough that I added a FAQ answer for it:
>        http://www.openldap.org/faq/index.cgi?file=1126

I don't _think_ this is what I'm up against, but feel free to let me
know if it is in fact what I'm fighting.

I'm seeing this in my logs:
  Oct 19 09:17:34 smtp2 sm-mta[8116]: nss_ldap: reconnecting to LDAP server...
  Oct 19 09:17:34 smtp2 sm-mta[8116]: nss_ldap: reconnected to LDAP server after 1 attempt(s)
  Oct 19 09:40:35 smtp2 sm-mta[1787]: nss_ldap: reconnecting to LDAP server...
  Oct 19 09:40:35 smtp2 sm-mta[1787]: nss_ldap: reconnected to LDAP server after 1 attempt(s)
  Oct 19 09:43:37 smtp2 sm-mta[1787]: nss_ldap: reconnecting to LDAP server...
  Oct 19 09:43:37 smtp2 sm-mta[1787]: nss_ldap: reconnected to LDAP server after 1 attempt(s)
  Oct 19 10:26:28 smtp2 sm-mta[21019]: nss_ldap: reconnecting to LDAP server...
  Oct 19 10:26:28 smtp2 sm-mta[21019]: nss_ldap: reconnected to LDAP server after 1 attempt(s)

My smtp boxen are the most heavily loaded at the moment, so they show
the most of these messages:
  smtp2 root # grep "ldap.*reconnecting" /var/log/maillog | wc -l
  267
That is since yesterday at ~ 4:00 PM.  Before that point, this mail
cluster was handling about half of the mail load for our company.  At
4:00, we switched over DNS and configuration information so that the
cluster was handling 100% of the mail load.  Then these messages started
appearing.  There are no messages in syslog on the ldap machine itself.
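
To check whether the reconnects cluster around particular hours, the
total above can be broken down per hour.  This is only a rough sketch:
it assumes the classic syslog "Mon DD HH:MM:SS" prefix shown in the
excerpt, and the function name reconnects_per_hour is made up for
illustration.

```shell
# Count "nss_ldap: reconnecting" messages per hour from syslog lines on stdin.
# Assumption: each line starts with "Mon DD HH:MM:SS", as in the excerpt above.
# reconnects_per_hour is a hypothetical helper name, not an existing tool.
reconnects_per_hour() {
    awk '/nss_ldap: reconnecting/ {
             split($3, t, ":")                  # t[1] holds the hour
             key = $1 " " $2 " " t[1] ":00"
             if (!(key in n)) order[++k] = key  # remember first-seen order
             n[key]++
         }
         END { for (i = 1; i <= k; i++) print n[order[i]], order[i] }'
}

# Usage, with the log path from the grep command above:
#   reconnects_per_hour < /var/log/maillog
```

Each output line is a count followed by the month, day and hour, so a
jump at the switch-over time should stand out.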

I'm averaging around 120 concurrent connections.  This is far below the
1024 limit that the above FAQ item suggests.  I've tried increasing the
max threads up to 32 (from the implicit default of 16) in slapd.conf and
saw no change.
  ldap1 root # lsof | grep "\b7378\b" | wc -l
  171
  ldap1 root # lsof -i tcp | grep "\b7378\b" | wc -l
  121
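
The same lsof output can also be grouped by TCP state, which would show
whether those ~120 sockets are all established or whether some are
lingering in another state.  Again just a sketch: conn_states is a
made-up helper name, and it assumes lsof prints the state in parentheses
as the last field of each TCP line.

```shell
# Tally TCP connection states from "lsof -i tcp" style output on stdin.
# Assumption: the state is the last field, in parentheses, e.g. "(ESTABLISHED)".
# conn_states is a hypothetical helper name, not an existing tool.
conn_states() {
    awk '$NF ~ /^\(.*\)$/ {
             s = $NF
             gsub(/[()]/, "", s)   # strip the surrounding parentheses
             n[s]++
         }
         END { for (s in n) print n[s], s }' | sort -rn
}

# Usage, with the slapd PID from the commands above:
#   lsof -a -p 7378 -i tcp -n -P | conn_states
```

Here -a ANDs the -p and -i selections together, and -n -P skip host and
port name lookups.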

The ldap machine is not heavily loaded:
  ldap1 root # uptime
   10:27:59 up 48 days, 22:24,  2 users,  load average: 0.11, 0.13, 0.04
  ldap1 root # free
               total       used       free     shared    buffers    cached
  Mem:       1034004    1028208       5796          0     131716    718012
  -/+ buffers/cache:     178480     855524
  Swap:      1004052          0    1004052

Can someone point me in the direction where I need to be looking to fix
this?  It seems that I'm hitting a max limit of incoming connections on
the ldap server.  Could it be outgoing connections on the smtp servers?
I haven't given that much thought.
-- 
Regards...		Todd
They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety.       --Benjamin Franklin
Linux kernel 2.6.3-19mdkenterprise   2 users,  load average: 0.00, 0.03, 0.00