Re: file descriptors
Howard Chu writes:
> Probably a half-baked idea... Since select() works with file
> descriptors starting from 0 going upward, it might be helpful to
> arrange so that syslog and DB file descriptors are opened with high
> descriptor numbers, leaving the low range exclusively for sockets. On
> a server with many backends, or a database with many indices, those
> static descriptors can add up, and it's just wasted effort for select
> to walk through them.
Sounds interesting, but it does make me nervous. It seems to get very
implementation-dependent, as described below. I'll try to be a bit more
constructive at the end.
> e.g., at slapd startup time, before calling openlog(), something like:
> while(( fd = dup(0)) > 0 )
> if ( fd > slap_max_fd )
> slap_max_fd = fd;
Put some upper limit on it. This loop could eat a lot of resources if
the process is allowed to open a lot of descriptors.
Also, if a process opens a lot of descriptors and then closes a lot of
descriptors, could that slow down future descriptor handling? I don't
have a clue about this, but I imagine an OS could optimize descriptor
handling for processes with fewer than e.g. (sizeof(int)*CHAR_BIT)
descriptors or for processes where all descriptor numbers are less than
that value. Then invoke slower and more general code once the process
opens more descriptors than that, or a descriptor outside that range.
There is hopefully some minimum number of descriptors guaranteed to any
process, but if you exceed that: are you sure the number of descriptors
available to the process stays static while this code runs, so that
close() immediately makes a descriptor available to it - rather than
e.g. making it available to the system, where any other process could
use it up?
Also, unless POSIX or something denies this, the openlog() call might
need several descriptors. It could load a dynamic library and mmap()
it, open a config file (and still have it open when contacting syslogd),
open some lock file, or whatever. So close more than one descriptor
afterwards, to be safe.
> Likewise during the backend's db_open processing.
> and open all of the DB index files in advance.
Same with this one, though I expect you can keep better track of
Sleepycat than of the various openlog()s out there:-)
> Then close all the remaining descriptors before the listener loop
> begins, to make them available for main processing.
I imagine some #ifdefs could handle the above at least for common cases.
But is it select() itself which can waste time, or just slapd's
FD_ISSET() & co? (At the moment I don't see which slapd code this
change would optimize, but I haven't searched too hard.)
Would it help to more generally try to get different types of
descriptors to cluster together? E.g. could select() or an #ifdef in
slapd skip zero bytes in an fd_set, so it would handle the fd_set with
bytes (00, ff) faster than (55, 55)?