
Re: (ITS#3665) Multi-Listener Thread Support

David Boreham wrote:

> Howard Chu wrote:
>
> > Overall I'm not totally convinced that this is the right place to be dividing the labor. It would make more sense to me to keep the single listener, and hand off each triggered event to the thread pool, so that reading and parsing is
>
> The reason you don't want to do that is the context switch overhead to get the operation from one thread to another. For optimal performance on an SMP system the same thread should handle all processing through from reading the operation off the socket to sending the response back (and that thread should stay on the same CPU because its cache will be hot with that connection's context).

On average, the number of context switches with my suggested approach will be the same. Consider a single-CPU system with two listener threads, using the provided patch. At most one listener thread can be running; if an event arrives for the listener that is not currently running, you get a context switch anyway. The relevant listener then reads the socket, parses the data, and pushes the operation into the thread pool - a second context switch. If neither listener thread was running (because an operation thread was), you get two context switches regardless of the number of listener threads, and on a busy server this is the most likely sequence of events.

With the proposed patch, the only time you save a context switch is when an event arrives for the listener that happens to be the currently running thread already. But that is such an unlikely occurrence (most likely because the system is otherwise idle) that its overall impact on server performance is negligible, and the analysis is the same regardless of the number of CPUs. If the system is busy, then as soon as the listener tries to do anything meaningful (i.e., invoke a system call, e.g. to read the socket) it will be switched out anyway. So you've really just created some additional scheduler overhead (and thread stacks are not necessarily cheap) without gaining any throughput.

On the other hand, if we keep just one listener thread, which hands each event to another thread for the reading and parsing, we prevent the current event loop code from getting even more complicated than it already is. All of the data structures involved in dispatching the operation are already mutex-protected, so making the reader/parser multithreaded doesn't impose any new overhead.

 -- Howard Chu
 Chief Architect, Symas Corp.       Director, Highland Sun
 http://www.symas.com               http://highlandsun.com/hyc
 Symas: Premier OpenSource Development and Support