Re: better malloc strategies?

--On Monday, August 28, 2006 5:49 AM -0700 Howard Chu <hyc@symas.com> wrote:

This approach helped a fair amount. Pre-allocating large chunks of memory
to divvy up into the Entry and Attribute lists eliminates the per-alloc
malloc library overhead for these structures. Since glibc's malloc
performance decreases as the number of allocated objects increases, this
turns out to be an important win. But over the course of hundreds of
runs, the slapd process size continues to grow. Of course things as
innocuous as syslog() also contribute to the problem, as they malloc
stdio buffers for formatting their messages.

One downside is that right now it's a very simple-minded list with a
single mutex protecting the list head. So while malloc may have some
measure of thread scalability, this approach doesn't really. I guess the
saving grace here is that allocs and frees are extremely simple, so the
locks won't be held for long.
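
For readers following along, here is a minimal sketch of the kind of pool described above: large chunks carved into fixed-size objects, with one mutex guarding the free-list head. It is not the slapd code. The names obj, POOL_CHUNK, pool_grow, pool_alloc and pool_free are hypothetical, and the chunk size is an arbitrary assumption.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_CHUNK 1024   /* objects carved from each big allocation (assumed value) */

/* Hypothetical stand-in for a fixed-size object such as an Entry. */
typedef struct obj {
    struct obj *next;     /* free-list link, only meaningful while the object is free */
    int payload;
} obj;

static obj *free_head = NULL;
static pthread_mutex_t free_mutex = PTHREAD_MUTEX_INITIALIZER;
static size_t chunks_allocated = 0;

/* Refill the list: one malloc call yields POOL_CHUNK objects.
 * The caller must hold free_mutex. */
static int pool_grow(void)
{
    obj *chunk = malloc(POOL_CHUNK * sizeof(obj));
    if (chunk == NULL)
        return -1;
    for (size_t i = 0; i < POOL_CHUNK; i++) {
        chunk[i].next = free_head;
        free_head = &chunk[i];
    }
    chunks_allocated++;
    return 0;
}

/* Pop one object off the list head, growing the pool if it is empty. */
obj *pool_alloc(void)
{
    obj *o = NULL;
    pthread_mutex_lock(&free_mutex);
    if (free_head != NULL || pool_grow() == 0) {
        o = free_head;
        free_head = o->next;
    }
    pthread_mutex_unlock(&free_mutex);
    return o;
}

/* Push the object back onto the list head; nothing is returned to malloc. */
void pool_free(obj *o)
{
    pthread_mutex_lock(&free_mutex);
    o->next = free_head;
    free_head = o;
    pthread_mutex_unlock(&free_mutex);
}
```

Each alloc or free is a couple of pointer assignments under the lock, which is why the critical section stays short. Every thread still contends on the same mutex, though, and chunks are never handed back to the system in this sketch.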

The simplicity of the code has helped boost performance a few percent. It
remains to be seen whether this will scale beyond a few CPUs.

Another alternative that looks very promising is to use Sun's libumem,
which has been ported to Linux and Windows here
http://sourceforge.net/projects/umem/ . Unfortunately the code there is
not packaged and ready to use. It has some autoconf machinery, but none of
it bootstraps cleanly; it takes a lot of manual intervention even to get
automake through it. But the fair amount of hacking that's required appears
to be worth it; the library seems to suffer no degradation through
continuous querying over long periods of time. The catch is that it relies
on so many deep-system and CPU-dependent features that porting to anything
non-x86 will be a pain.

Comparing what the authors have accomplished here with the goals Jong had
for zone-malloc, it's very tempting to think about adopting the library
and using the umem-specific APIs for managing our object caches. But
given the porting issues I guess it's not realistic to consider that any
time soon.

Well, as a minimum to start, we could add a --with-umem flag to configure, and then build liblber with umem as a replacement for the standard memory allocator if it is found, right? ;) Because the improvements we got were enormous.
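
To make the idea concrete, here is a rough sketch of what the C side of such a switch might look like. It is an assumption on my part, not existing liblber code: HAVE_UMEM is a hypothetical macro that a --with-umem configure check could define, and wrap_malloc, wrap_free and mem_hdr are made-up names. The umem_alloc() and umem_free() calls are the general-purpose libumem entry points; since umem_free() must be given the allocation size, the wrapper stashes it in a small header.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef HAVE_UMEM            /* hypothetical macro a --with-umem check could define */
#include <umem.h>
#define RAW_ALLOC(n)    umem_alloc((n), UMEM_DEFAULT)
#define RAW_FREE(p, n)  umem_free((p), (n))
#else                       /* fall back to the standard allocator */
#define RAW_ALLOC(n)    malloc(n)
#define RAW_FREE(p, n)  free(p)
#endif

/* Header placed in front of every block. It records the total size
 * (needed by umem_free) and is padded to sizeof(max_align_t) so the
 * payload keeps the alignment of the underlying allocation. */
typedef union {
    size_t size;
    max_align_t align;
} mem_hdr;

void *wrap_malloc(size_t n)
{
    mem_hdr *h;

    if (n > (size_t)-1 - sizeof(mem_hdr))   /* header would overflow size_t */
        return NULL;
    h = RAW_ALLOC(sizeof(mem_hdr) + n);
    if (h == NULL)
        return NULL;
    h->size = sizeof(mem_hdr) + n;
    return h + 1;                            /* payload starts after the header */
}

void wrap_free(void *p)
{
    mem_hdr *h;

    if (p == NULL)
        return;
    h = (mem_hdr *)p - 1;                    /* step back to the header */
    RAW_FREE(h, h->size);
}
```

Built without HAVE_UMEM this is just malloc and free with a size header, so the same code path can be exercised on platforms where umem is not available.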


Quanah Gibson-Mount
Principal Software Developer
ITS/Shared Application Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html