
Re: file descriptors

Howard Chu writes:

> Probably a half-baked idea...  Since select() works with file
> descriptors starting from 0 going upward, it might be helpful to
> arrange so that syslog and DB file descriptors are opened with high
> descriptor numbers, leaving the low range exclusively for sockets.  On
> a server with many backends, or a database with many indices, those
> static descriptors can add up, and it's just wasted effort for select
> to walk through them.

Sounds interesting, but it does make me nervous.  It seems to get very
implementation-dependent, as described below.  I'll try to be a bit more
constructive at the end.

> e.g., at slapd startup time, before calling openlog(), something like:
>     while(( fd = dup(0)) > 0 )
>        if ( fd > slap_max_fd )
>           slap_max_fd = fd;

Put some upper limit on it.  This looks like it could eat a lot of
resources if a process may open a lot of descriptors, or if descriptors
are expensive.

Also, if a process opens a lot of descriptors and then closes a lot of
descriptors, could that slow down future descriptor handling?  I don't
have a clue about this, but I imagine an OS could optimize descriptor
handling for processes with fewer than e.g. (sizeof(int)*CHAR_BIT)
descriptors, or for processes where all descriptor numbers are less than
that value, and then invoke slower and more general code once the
process opens more descriptors than that, or a descriptor outside that
range.

>     close(slap_max_fd--);
>     openlog(...);

There is hopefully some minimum number of descriptors guaranteed to any
process, but if you exceed that: are you sure the number of descriptors
available to a process stays static during this code, so that close()
immediately makes a descriptor available to that process - instead of
e.g. making it available to the system, where any process could use it
up?

Also, unless POSIX or something forbids it, the openlog() call might
need several descriptors.  It could load a dynamic library and mmap()
it, open a config file (and still have it open when contacting syslogd),
open some lock file, or whatever.  So close more than one descriptor,
to be safe.

> Likewise during the backend's db_open processing.
>     close(slap_max_fd--);
>     DB->open(...);
> and open all of the DB index files in advance.

Same with this one, though I expect you can keep better track of
Sleepycat than of the various openlog() implementations out there :-)

> Then close all the remaining descriptors before the listener loop
> begins, to make them available for main processing.

I imagine some #ifdefs could handle the above at least for common cases.

But is it select() itself which can waste time, or just slapd's
FD_ISSET() & co?  (At the moment I don't see which slapd code this
change would optimize, but I haven't searched too hard.)

Would it help to more generally try to get different types of
descriptors to cluster together?  E.g. could select() or an #ifdef in
slapd skip zero bytes in an fd_set, so it would handle the fd_set with
bytes (00, ff) faster than (55, 55)?