
Re: (ITS#3665) Multi-Listener Thread Support



Emmanuel Lécharny wrote:
   On 8/2/10 5:26 AM, Howard Chu wrote:
<snip/>
Why would you have more than one select() ? Wouldn't it be better to
have one thread processing the select() and dispatching the operation to
a pool of threads ?

That's what we have right now, and as far as I can see it's a
bottleneck that prevents us from utilizing more than 12 cores. (I
could be wrong, and the bottleneck might actually be in the thread
pool manager. I haven't got precise enough measurements yet to know
for sure.)

Here's the situation: suppose you have thousands of clients connected
and active. Even if you have CPUs to spare, the number of connections
you can acknowledge and dispatch is limited by the speed of the single
thread that's processing select(). Even if all it does is walk thru
the list of active descriptors and dispatch a job to the thread pool
for each one, it's only possible to dispatch a fixed number of
ops/second, no matter how many other CPUs there are.
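
To make the two designs concrete, here is a minimal sketch in Python, not slapd's actual code. The names handle, listener and run are hypothetical, as is the "ok:" reply format. With n_listeners=1 it models the single select() thread feeding a thread pool; with a larger value the descriptors are split round-robin so that each listener thread runs its own select(). It assumes a POSIX system and, for simplicity, waits for the dispatched jobs before selecting again.

```python
import select
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(conn):
    # Worker job (hypothetical): read one request, send a fixed-format reply.
    data = conn.recv(1024)
    if data:
        conn.sendall(b"ok:" + data)


def listener(conns, pool, rounds=3, timeout=0.1):
    # One listener: select() over its own descriptors, then dispatch.
    # The walk over `ready` is serial within this thread.
    for _ in range(rounds):
        ready, _w, _x = select.select(conns, [], [], timeout)
        jobs = [pool.submit(handle, c) for c in ready]
        for job in jobs:
            job.result()  # simplification: wait before selecting again


def run(conns, n_listeners, n_workers=4):
    # n_listeners=1 models a single select() thread; >1 splits the
    # descriptors round-robin, one select() loop per listener thread.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        threads = [
            threading.Thread(target=listener, args=(conns[i::n_listeners], pool))
            for i in range(n_listeners)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```

Calling run(server_sockets, n_listeners=2) on four connected sockets gives each listener two descriptors to watch. Whether this helps in practice depends on where the real bottleneck is, which is exactly what still needs measuring.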

I'm a bit surprised that the select() processing *is* the bottleneck...
All in all, it's just -internally- a matter of processing a bit field to
see which bit is set to 1, and then get back the FD that is associated
with this bit. You must have some other tasks running that create this
bottleneck.
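
As a rough illustration of the bit-field walk described above, here is a sketch that models the ready set as a plain integer bitmask. The function name ready_fds is hypothetical, and a real fd_set is a C structure rather than a Python int, so this only shows the shape of the loop.

```python
def ready_fds(mask):
    # Walk the bit field from bit 0 upward; bit N set means fd N is ready.
    fds = []
    fd = 0
    while mask:
        if mask & 1:
            fds.append(fd)
        mask >>= 1
        fd += 1
    return fds
```

Note that the loop runs once per bit up to the highest ready descriptor, so its cost grows with the descriptor numbers in use and not only with how many are ready.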

I will have to check OpenLDAP code here...

Right, it only amounts to around a 10% difference. Still, this is an improvement. I guess we'll need to set up oprofile and get some more detailed numbers to find out where any remaining bottlenecks are.

Right now on a 24 core server I'm seeing 48,000 searches/second and
50% CPU utilization. Adding more clients only seems to increase the
overall latency, but CPU usage and throughput don't increase any further.

Have you tried to do something we did on ADS : remove all the processing
to simply have a mock LDAP server, where only the network part is studied ?

ie, we just send back a mock response when a request has been received.

It helps to focus only on the network layer.

Yes, we have back-null for that purpose.

(sadly, as the PDU decoding is a costly operation, you may have to take
that into account).

Sure, anything that's in the frontend is going to be included, but that's OK, there's no synchronization overhead in the decoders.
--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/