
ideas for improving slapd



[reposted]

Hi,

I have some ideas for improving slapd performance.

1.) Use thread specific memory pools for dynamic memory allocation.
There will be an implementation of `malloc' (`pool malloc') operating
on a pool allocated by standard malloc at the start of the thread
and expanded by additional calls to standard malloc as needed. The
pool is reclaimed at the end of the thread, either by invoking
standard free or by adding it to a global `pool pool' for reuse by
the next thread.
- No more memory leaks, because all dynamic memory is summarily
  reclaimed at the end of the thread.
- Standard malloc and free are fairly complex and, in particular,
  acquire and release mutexes in an MT environment, which degrades
  performance. The `pool malloc' can avoid this because the pool
  is used exclusively by one thread; no global data is involved.
- Actually it doesn't appear to be useful to implement a `pool free'
  at all, because most dynamically allocated data is needed until
  the end of the thread anyway. Additional benefits:
  o Even more lightweight implementation of `pool malloc', no
    search for gaps, no free lists.
  o No calls to `free' and structure destructors at all, much less
    clutter especially at error returns.
  o No need to strdup string constants just because the pointer
    will be freed in some structure destructor later. No need to
    strdup or copy a lot of other stuff just because the pointer
    might be freed twice.
- Data related to one thread resides mostly in a pool of contiguous
  memory instead of being spread all over the address space, as
  standard malloc is likely to do in an MT environment. This might
  improve VM utilisation and therefore performance, due to less
  paging and less workload for the VM manager.
The biggest problem is probably how to pass the pool address (or the
address of the pool management structure) down to the points where
the work will be done. One possibility is to allocate a thread
specific 'global' variable; however, this requires a function call
wherever it is accessed. I'd prefer to abuse the `Operation *op'
parameter passed to almost all functions and extend the Operation
structure appropriately.
Do you think this would be a good idea to implement? I might be
interested in doing it ..... ooooh noooo, never volunteer....it's
too late, I've almost committed myself....However, don't expect
it anytime soon, I have a full time job.
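
A minimal sketch of the pool idea from 1.), under stated assumptions
and not taken from slapd: every name here (Pool, Chunk, pool_malloc,
pool_destroy, op_strdup, the o_pool field, the 4096-byte chunk size)
is hypothetical, and the Operation struct is only a stand-in for the
real one. It shows a `pool malloc' without a `pool free', the summary
reclaim at the end of the thread, and the pool being reached through
`Operation *op'. The global `pool pool' for reuse is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define POOL_CHUNK 4096                     /* assumed chunk size */
#define POOL_ALIGN _Alignof(max_align_t)    /* C11 */

typedef struct Chunk {
    struct Chunk *next;   /* older chunk of the same pool */
    size_t used;          /* bytes already handed out */
    size_t size;          /* usable bytes after the header */
} Chunk;

typedef struct Pool { Chunk *head; } Pool;

/* Round n up to a multiple of POOL_ALIGN (a power of two). */
size_t round_up(size_t n)
{
    return (n + POOL_ALIGN - 1) & ~(size_t)(POOL_ALIGN - 1);
}

/* Bump allocation: no free lists, no search for gaps, no mutex,
 * because only the owning thread touches the pool. */
void *pool_malloc(Pool *pool, size_t n)
{
    Chunk *c;
    void *p;

    assert(pool != NULL);
    c = pool->head;
    n = round_up(n ? n : 1);
    if (c == NULL || c->size - c->used < n) {
        /* Expand the pool with one call to standard malloc. */
        size_t size = n > POOL_CHUNK ? n : POOL_CHUNK;
        c = malloc(round_up(sizeof *c) + size);
        if (c == NULL)
            return NULL;
        c->next = pool->head;
        c->used = 0;
        c->size = size;
        pool->head = c;
    }
    p = (unsigned char *)c + round_up(sizeof *c) + c->used;
    c->used += n;
    return p;
}

/* Called once at the end of the thread: reclaims everything. */
void pool_destroy(Pool *pool)
{
    Chunk *c = pool->head;
    while (c != NULL) {
        Chunk *next = c->next;
        free(c);
        c = next;
    }
    pool->head = NULL;
}

/* Stand-in for slapd's Operation; only the added field is shown. */
typedef struct Operation { Pool *o_pool; } Operation;

/* Example of a worker function reaching the pool through op. */
char *op_strdup(Operation *op, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = pool_malloc(op->o_pool, len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

The thread would set up a Pool, store its address in the Operation,
and call pool_destroy() once before exiting. Nothing allocated from
the pool is ever passed to free().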

2.) There are quite a few instances of code similar to the following
naive strdup implementation, usually somewhat more complicated:

    foo=malloc(strlen(bar)+1);
    strcpy(foo,bar);

A while ago, while implementing an EXPRESS compiler, which naturally
deals with quite a few strdups, I found that replacing my naive
strdup implemented in the style above by the following code improved
performance by nearly 10%:

   size_t len=strlen(bar)+1;
   foo=malloc(len);
   memcpy(foo,bar,len);

This is most likely because strcpy first scans the string for the
terminating zero and then uses a block move with known length for
the copy, which results in an extra scan of the string. I learned to
keep track of string lengths and to almost always avoid strcpy,
strcat and many strcmps, replacing them by memcpy (or bcopy for BSD
fans) and memcmp, and of course I used `sizeof' for string constants.
This proved very useful in a reimplementation of a real time data
processing program.
Putting length fields into the structures shouldn't be that hard,
though it will be a major overhaul of the whole system if applied
consistently.
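
A rough sketch of what keeping track of string lengths could look
like, assuming a counted-string structure; LenStr, LENSTR_CONST and
the lenstr_* functions are hypothetical names, not slapd types. It
shows memcpy and memcmp standing in for strcpy, strcat and strcmp,
and `sizeof' supplying the length of a string constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counted string: the length travels with the data. */
typedef struct LenStr {
    size_t len;   /* length without the terminating zero */
    char  *val;   /* zero-terminated data */
} LenStr;

/* Initializer for a string constant; sizeof gives the length at
 * compile time, so no strlen call is needed. */
#define LENSTR_CONST(s) { sizeof(s) - 1, s }

/* Replaces strdup: one malloc, one copy of known length. */
int lenstr_dup(LenStr *dst, const LenStr *src)
{
    assert(dst != NULL && src != NULL);
    dst->val = malloc(src->len + 1);
    if (dst->val == NULL)
        return -1;
    memcpy(dst->val, src->val, src->len + 1);
    dst->len = src->len;
    return 0;
}

/* Replaces strcpy + strcat: neither input is rescanned. */
int lenstr_concat(LenStr *dst, const LenStr *a, const LenStr *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
    dst->val = malloc(a->len + b->len + 1);
    if (dst->val == NULL)
        return -1;
    memcpy(dst->val, a->val, a->len);
    memcpy(dst->val + a->len, b->val, b->len + 1); /* copies the zero */
    dst->len = a->len + b->len;
    return 0;
}

/* Replaces strcmp for equality tests: differing lengths are
 * rejected before any byte is compared. */
int lenstr_equal(const LenStr *a, const LenStr *b)
{
    return a->len == b->len && memcmp(a->val, b->val, a->len) == 0;
}
```

Whether this gains anything in slapd would have to be measured; the
10% above comes from a different program.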

Enough for today...

Regards

J.Pietschmann