
Thread-local malloc discussion summary






Howard's recent thread-local malloc commit is based on
a discussion thread about OpenLDAP locking and SMP performance issues.
This message briefly summarizes that thread for further discussion in
the community.

From the execution profile of slapd, it has been shown that
the locking primitives of libpthread account for around 12% of the
total execution time. This number is generally considered high and
is a sign of high lock contention.
A recent SMP performance analysis also confirmed the high lock
contention: slapd was shown to scale by a factor of less than two
on an 8-way SMP box.

An early consensus among us (Howard/Jong/Kurt) was that the locking
overhead would mainly come from the connection and thread management
subsystem. Through further analysis, however, it turned out that the
contention comes from the shared heap management of malloc.
Because most malloc calls actually allocate thread-private data
objects, this lock contention can be avoided by providing a
thread-local memory allocator.

The suggested design options are:
1) a general per-thread-heap malloc,
2) a per-thread / per-object preallocation
   for a small number of the most frequently allocated objects,
3) a special per-thread preallocation for an operation's lifetime
   where mallocs for selected objects of the operation are satisfied
   from the preallocated chunk. Freeing such objects can be a no-op.
   When the operation finishes, the preallocated chunk can either be
   reset for use in the next operation or be freed.
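
To make option 3 concrete, here is a minimal sketch of a per-thread,
per-operation chunk allocator. It is not the code in CVS HEAD; the
names (op_arena, op_arena_alloc, and so on), the bump-pointer strategy,
and the choice to return NULL when the chunk is exhausted are all
assumptions made for illustration. It relies on C11 _Thread_local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-operation arena; names are illustrative only. */
typedef struct op_arena {
    unsigned char *base;  /* start of the preallocated chunk */
    size_t size;          /* total chunk size in bytes */
    size_t used;          /* bytes handed out so far */
} op_arena;

/* One arena per thread, so allocating from it needs no lock. */
static _Thread_local op_arena tls_arena;

/* Preallocate the chunk for the calling thread. Returns 0 or -1. */
static int op_arena_init(size_t size)
{
    tls_arena.base = malloc(size);
    if (tls_arena.base == NULL)
        return -1;
    tls_arena.size = size;
    tls_arena.used = 0;
    return 0;
}

/* Bump-pointer allocation from the thread's chunk.
 * Returns NULL when the chunk cannot satisfy the request
 * (assumption: the caller then falls back to something else). */
static void *op_arena_alloc(size_t n)
{
    const size_t align = _Alignof(max_align_t);
    void *p;

    assert(tls_arena.base != NULL);  /* op_arena_init must come first */
    if (n > SIZE_MAX - (align - 1))
        return NULL;
    n = (n + align - 1) & ~(align - 1);  /* keep results aligned */
    if (n > tls_arena.size - tls_arena.used)
        return NULL;
    p = tls_arena.base + tls_arena.used;
    tls_arena.used += n;
    return p;
}

/* Freeing an individual object is a no-op. */
static void op_arena_free(void *p)
{
    (void)p;
}

/* At the end of an operation: reuse the chunk for the next one. */
static void op_arena_reset(void)
{
    tls_arena.used = 0;
}

/* Alternatively, release the chunk altogether. */
static void op_arena_destroy(void)
{
    free(tls_arena.base);
    tls_arena.base = NULL;
    tls_arena.size = tls_arena.used = 0;
}
```

A worker thread would call op_arena_init() once, use op_arena_alloc()
for the selected objects while it processes an operation, and call
op_arena_reset() when the operation finishes. Because the chunk has a
fixed size in this sketch, a request that does not fit simply fails.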

The current CVS HEAD implements the last approach
for the search operation. The most frequently executed malloc
calls are being investigated as candidates for a change
to the new thread-local malloc routine. For operations
such as add / modify (and also psearch), the amount of
memory allocated during the operation may grow large, since
the size of the entry / modlist is unbounded.

Design issues concerning the add/modify operations, the choice of
objects, and API changes require further discussion in the community.

------------------------
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com
(phone) 914-945-3979    (fax) 914-945-4425   TL: 862-3979