
Re: connection timeout?



At 09:25 AM 2001-10-08, Jure Pecar wrote:
>Hi list,
>
>we have an environment with ~230 client machines (Linux) and a couple of
>servers (Linux, Solaris) that are querying 4 LDAP replicas (2 Linux, 2
>Solaris) under DNS round robin. We have a lot of problems because we're
>constantly hitting the 1024 fd limit of select(). I'm sure that our
>environment does not produce more than 4000 connections to LDAP at the
>same time.

If you raise the number of descriptors per process above FD_SETSIZE,
you also need to rebuild with FD_SETSIZE raised to match, for example:

env CPPFLAGS="-DFD_SETSIZE=4096" ./configure
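
To see whether a given build is affected, one can compare the
per-process descriptor limit with the FD_SETSIZE the program was
compiled with. A minimal sketch, assuming a POSIX system; the function
name nofile_limit_exceeds_fd_setsize() is made up for illustration and
is not part of OpenLDAP:

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/select.h>

/* Hypothetical helper, not OpenLDAP code.
 * Returns 1 if the soft RLIMIT_NOFILE limit is larger than the
 * compile-time FD_SETSIZE, 0 if it is not, and -1 if getrlimit() fails. */
int nofile_limit_exceeds_fd_setsize(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unlimited descriptor count is certainly above FD_SETSIZE. */
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;

    return rl.rlim_cur > (rlim_t)FD_SETSIZE ? 1 : 0;
}
```

A result of 1 means the process may be handed descriptors that do not
fit in an fd_set as compiled, which is the mismatch described above.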

>I found out something interesting: lsof showed me that certain connections
>live something like forever on Linux replicas, but not on Solaris
>replicas. I.e., a certain client has 8 established connections on port 389
>and 3 in TIME_WAIT, but at the same time there are 17 connections from this
>same client on one Linux replica server, 15 on another, but only 1 on both
>Solaris replicas. Why don't these dead connections time out or something?

Maybe they aren't dead. OpenLDAP uses TCP keepalives to detect and
kill dead streams. OpenLDAP also supports killing idle clients; see
the idletimeout directive in slapd.conf(5).
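
For reference, TCP keepalive is a per-socket option. The sketch below
shows how a server can turn it on for a connection and read the setting
back. It is an illustration under POSIX sockets, not slapd's actual
code, and both function names are hypothetical. The probe timing is
controlled by the operating system, not by this call, and default
settings commonly wait on the order of hours before the first probe,
so a connection whose peer has vanished can stay visible in lsof for a
long time.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: enable TCP keepalive probes on a socket.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int enable_keepalive(int sockfd)
{
    int on = 1;
    return setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Hypothetical helper: report whether keepalive is enabled.
 * Returns 1 if enabled, 0 if disabled, -1 if getsockopt() fails. */
int keepalive_enabled(int sockfd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) != 0)
        return -1;
    return value != 0;
}
```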

>Could there be some interaction with some linux kernel syscall?
>
>So, what's happening is that both Linux replicas get bogged down, troubles
>begin, then both Solaris replicas get more and more connections, which
>means more troubles ... our short term solution is to move all replicas to
>Solaris until this problem is resolved, although OpenLDAP on Solaris is
>noticeably slower than on Linux.
>
>And, is there really no way to dump the select() and its 1024 file
>descriptors limit and use something else instead? poll() maybe?

select() itself doesn't have a 1024 descriptor limit. The limit comes
from FD_SETSIZE, which you just need to adjust to be greater than or
equal to the maximum number of descriptors.