Re: file descriptors
Hallvard B Furuseth wrote:
>> e.g., at slapd startup time, before calling openlog(), something like:
>>
>>     while (( fd = dup( 0 )) > 0 )
>>         if ( fd > slap_max_fd )
>>             slap_max_fd = fd;
>
> Put some upper limit on it. This looks like it could eat a lot of
> resources if a process may open a lot of descriptors, or if descriptors
> are otherwise expensive for the OS to track.
Sure, we already have the dtblsize global variable for the upper end. Also,
this discussion is only meaningful on Unix systems with traditional
select(). (E.g., winsock select() doesn't care about descriptor numbers.)
> Also, if a process opens a lot of descriptors and then closes a lot of
> descriptors, could that slow down future descriptor handling? I don't
> have a clue about this, but I imagine an OS could optimize descriptor
> handling for processes with fewer than e.g. (sizeof(int)*CHAR_BIT)
> descriptors, or for processes where all descriptor numbers are less
> than that value, and then invoke slower and more general code once the
> process opens more descriptors than that, or a descriptor outside that
> range.
I've never seen it in any BSD or SysV derived kernels, but anything's
possible.
> There is hopefully some minimum number of descriptors available to any
> process, but if you exceed that: are you sure the number of descriptors
> available to a process is static during this code, so that close()
> immediately makes a descriptor available to it - instead of e.g. making
> a descriptor available to the system, so any process could use it up?
File descriptors are strictly a per-process resource. Files may be a
system-wide resource, but since dup() only increments a reference count
on an open file, the total number of file resources in the system isn't
affected.
> Also, unless POSIX or something denies this, the openlog() call might
> need several descriptors. It could load a dynamic library and mmap()
> it, open a config file (and still have it open when contacting
> syslogd), open some lock file, or whatever. So close more than one
> descriptor before calling openlog().
That's fine. We could simply allocate up to dtblsize-100 or some
arbitrary number as a starting point.
Then close all the remaining descriptors before the listener loop
begins, to make them available for main processing.
> I imagine some #ifdefs could handle the above, at least for common
> cases. But is it select() itself which can waste time, or just slapd's
> FD_ISSET() & co? (At the moment I don't see which slapd code this
> change would optimize, but I haven't searched too hard.)
FD_SET etc. are relatively constant time, since they just require a mask
and shift. But the select() call has to iterate (linearly) from zero to
the highest numbered descriptor in use when polling for events to
return. So it's advantageous to keep the interesting descriptors in
low-numbered slots.
In general I don't think this is a significant concern. I was thinking
about it before and dismissed it, but recently we were examining a
mysterious bug (ITS#4159) and we noticed that the slapd process was
getting EMFILE and had run out of descriptors. It turns out the nfiles
ulimit on this process was only set at 256, and there were about 85
indexed attributes configured. So select() was being called with nfds=256
all the time, while more than a third of those descriptors were
uninteresting.
> Would it help to more generally try to get different types of
> descriptors to cluster together? E.g. could select() or an #ifdef in
> slapd skip zero bytes in an fd_set, so it would handle the fd_set with
> bytes (00, ff) faster than (55, 55)?
I don't believe there's any way to maintain this kind of clustering.
Data connections come and go, and we enable/disable selecting on them
continually. At best we could reserve the first 8 descriptors so that
that byte is always zero.
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/