Re: slapd lightweight dispatcher
Quanah Gibson-Mount wrote:
> I've done a lot of testing in the last several days on a binary with
> the lightweight dispatcher code enabled (2.3 base), and I just want to
> say I'm a very big fan of it. It allows slapd to scale under high
> load at a rate I've not previously seen. I've also played with the
> multi-conn flag, and haven't seen it particularly affect the rates
> when only it is enabled, but I haven't tried the lightweight
> dispatcher without it enabled at the same time.
Some interesting observations from these tests... With the generic
2.3.21 code, throwing 26 SLAMD clients at a SunFire T2000 with a 1GHz
processor, we get around 6400 authentications per second on a 1 million
entry database. (Which is pretty respectable, really; SunOne only got
6233 for the same configuration.)
http://blogs.sun.com/roller/page/DirectoryManager/20060110
What's interesting is that we only see about 19% CPU usage on slapd (13%
user, 6% kernel), even though there are 32 virtual processors here and
16 worker threads (LWPs) and the database is 100% cached in RAM. Clearly
slapd isn't making as good use of the available CPUs as it could; if
it were keeping all 16 worker threads busy we should see right about
50% CPU usage.
With the lightweight dispatcher and multi-conn array code, we see CPU
utilization go up to 45% (20% user, 25% kernel) and throughput goes up
to about 8800 authentications per second. With the lightweight
dispatcher and the new code with no connections_mutex we get about 8900
authentications per second, and CPU use is at 49% (23% user, 26% kernel).
(We didn't collect numbers for just the lightweight dispatcher by itself
yet.) At this point it seems we've eliminated a bottleneck in the
frontend, only to find that there's probably a bottleneck in the backend
as well. (Otherwise, performance ought to keep improving as we add
worker threads, but it doesn't.) I'm puzzled about why we're seeing so
much more kernel time than user time. Hopefully spending some time with
dtrace will tell us what's going on there. Ideally we should see a tiny
percentage of kernel time, and we should see performance continue to
improve as more worker threads are added. (Why stop at just 42% faster
than The Other Guys, when we can potentially be 100% faster?)
It's worth noting that this SLAMD authentication job consists of a
search for a uid, followed by a Simple Bind on the returned entry's DN,
so it's really the equivalent of almost 18,000 searches per second. And
since we processed 6400 authentications/second using only 19% of the
CPU, we really ought to be able to get 12,800 auths/second at 40%, and
so on. (OK, it won't scale perfectly linearly because slapd only has one
dispatcher thread, and at some point we will probably saturate the
network interfaces on the server, or the gigabit switch. Also, we're
struggling to get any more requests out of the clients. I really wish
the SLAMD clients were written in C with libldap, this Java stuff is way
too slow. It shouldn't take 26 clients to generate this workload; it
should take at most 16 clients to keep the 16 server threads busy. Who
ever thought that writing performance-critical code in Java was a good
idea?)
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/