Re: entry_free() etc. bottlenecks
Howard Chu writes:
> The obvious fix is to adopt the same strategies that tcmalloc uses. (And
> unfortunately we can't simply rely on tcmalloc always being available, or
> always being stable in a given environment.)
Good, though I'd like to see these slapd re-implementations of system
features (like malloc) #ifdeffed with a fallback to the system feature.
Then one can compile with -D<revert to system feature> either when the
system feature is as good as or better than slapd's, or to simplify debugging.
Configure can guess about it too, e.g. it can detect tcmalloc.
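
To make the #ifdef idea concrete, here is a minimal sketch. Every name in it (EXAMPLE_USE_SYSTEM_MALLOC, example_alloc, example_free) is made up for illustration and is not an existing slapd or configure symbol. The "custom" branch is a deliberately trivial, non-thread-safe stand-in for a real cache.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef EXAMPLE_USE_SYSTEM_MALLOC
/* Fallback: map the wrappers straight onto the C library. */
#define example_alloc(n) malloc(n)
#define example_free(p)  free(p)
#else
/* Stand-in for a custom allocator: recycles at most one block.
 * Every block is at least EXAMPLE_OBJ_SIZE bytes, so a cached block
 * can satisfy any request up to that size. Not thread-safe. */
#define EXAMPLE_OBJ_SIZE 64
static void *example_cached;

static void *example_alloc(size_t n)
{
    if (n <= EXAMPLE_OBJ_SIZE && example_cached != NULL) {
        void *p = example_cached;
        example_cached = NULL;
        return p;
    }
    return malloc(n < EXAMPLE_OBJ_SIZE ? EXAMPLE_OBJ_SIZE : n);
}

static void example_free(void *p)
{
    if (p == NULL)
        return;
    if (example_cached == NULL)
        example_cached = p;   /* keep one block for reuse */
    else
        free(p);
}
#endif
```

Callers only ever use example_alloc()/example_free(), so building with -DEXAMPLE_USE_SYSTEM_MALLOC swaps the implementation without touching them.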
The new entry_free() plus tcmalloc may be better than plain tcmalloc,
I don't know. It retains the global mutex though, which presumably is
or someday will be a pessimization compared to _some_ malloc out there.
> I.e., use per-thread cached free
> lists. We maintain some small number of free objects per thread; this
> per-thread free list can be used without locking. When the number of free
> objects on a given thread exceeds a particular threshold
...or there is no thread key for the mutex (e.g. when the current
thread is not from the thread pool)...
Might be convenient to let slapd register init-thread and cleanup-thread
functions in the thread pool. These could create/destroy these mutexes,
and maybe some other per-thread slapd variables too.
(Preferably the init function would be able to fail and cause the pool
thread to die, but that'd mess up the pool logic which assumes once a
thread has been created it will be able to handle submitted tasks.
Except slapd often doesn't check for malloc/mutex_init success anyway,
so demanding success would be no worse than what slapd does now.)
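
A rough sketch of what such hooks could look like, using plain pthreads. The types and functions below (ex_pool_hooks, ex_worker, ex_run_one, the demo_* hooks) are hypothetical and are not the existing thread pool API; the "pool" here is just one thread running one task, enough to show the init/cleanup ordering and an init failure killing the thread.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical hook types; not the real thread pool API. */
typedef int  (*ex_thread_init_fn)(void **ctx);    /* 0 on success */
typedef void (*ex_thread_cleanup_fn)(void *ctx);
typedef void (*ex_task_fn)(void *ctx);

struct ex_pool_hooks {
    ex_thread_init_fn    init;
    ex_thread_cleanup_fn cleanup;
};

struct ex_worker_arg {
    const struct ex_pool_hooks *hooks;
    ex_task_fn task;
    int init_failed;
};

static void *ex_worker(void *p)
{
    struct ex_worker_arg *arg = p;
    void *ctx = NULL;

    /* If init fails, the thread dies without running any task. */
    if (arg->hooks->init != NULL && arg->hooks->init(&ctx) != 0) {
        arg->init_failed = 1;
        return NULL;
    }
    arg->task(ctx);
    if (arg->hooks->cleanup != NULL)
        arg->hooks->cleanup(ctx);
    return NULL;
}

/* Runs one task on a fresh thread; returns 0, or -1 if the thread
 * could not be started or its init hook failed. */
static int ex_run_one(const struct ex_pool_hooks *hooks, ex_task_fn task)
{
    struct ex_worker_arg arg = { hooks, task, 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, ex_worker, &arg) != 0)
        return -1;
    pthread_join(tid, NULL);
    return arg.init_failed ? -1 : 0;
}

/* Demo hooks: the per-thread state is a single mutex. */
static int demo_tasks_run, demo_cleanups_run;

static int demo_init(void **ctx)
{
    pthread_mutex_t *m = malloc(sizeof *m);
    if (m == NULL)
        return -1;
    if (pthread_mutex_init(m, NULL) != 0) {
        free(m);
        return -1;
    }
    *ctx = m;
    return 0;
}

static int demo_init_fail(void **ctx) { (void)ctx; return -1; }

static void demo_task(void *ctx)
{
    pthread_mutex_t *m = ctx;
    pthread_mutex_lock(m);
    demo_tasks_run++;
    pthread_mutex_unlock(m);
}

static void demo_cleanup(void *ctx)
{
    pthread_mutex_t *m = ctx;
    pthread_mutex_destroy(m);
    free(m);
    demo_cleanups_run++;
}
```

A real pool would call init once per thread and cleanup at thread exit rather than around each task, and would have to decide what to do with tasks already queued for a thread whose init failed.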
> then we obtain the
> global lock to return some number of objects to the global list.
> In practice this threshold can be very small - any given thread typically
> needs no more than 4 entries at a time. (ModDN is the worst case at 3 entries
> locked at once. LDAP TXNs would distort this figure but not in any critical
> fashion.) For attributes the typical usage is much more variable, but any
> number we pick will be an improvement over the current code.
Add a few more for overlays, in particular syncrepl. Otherwise even a
single overlay doing entry_dup() reduces performance.
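
For reference, the scheme discussed above could be sketched roughly as follows. This is only an illustration under assumptions: the names (ex_obj, ex_cache, ex_alloc, ex_free) and the numbers (a per-thread limit of 8, returning 4 at a time) are made up, and the per-thread cache is passed in explicitly instead of being looked up through a thread key. A NULL cache stands for the "no thread key" case and goes straight to the global list under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define EX_CACHE_MAX    8   /* assumed per-thread limit, with room for overlays */
#define EX_RETURN_BATCH 4   /* assumed number handed back once over the limit */

struct ex_obj { struct ex_obj *next; /* payload would follow */ };

/* One per pool thread; only its owner touches it, so no locking. */
struct ex_cache { struct ex_obj *head; int count; };

static struct ex_obj *ex_global_head;
static int ex_global_count;
static pthread_mutex_t ex_global_mutex = PTHREAD_MUTEX_INITIALIZER;

static struct ex_obj *ex_alloc(struct ex_cache *tc)
{
    struct ex_obj *o = NULL;

    if (tc != NULL && tc->head != NULL) {      /* lock-free fast path */
        o = tc->head;
        tc->head = o->next;
        tc->count--;
        return o;
    }
    pthread_mutex_lock(&ex_global_mutex);
    if (ex_global_head != NULL) {
        o = ex_global_head;
        ex_global_head = o->next;
        ex_global_count--;
    }
    pthread_mutex_unlock(&ex_global_mutex);
    if (o == NULL)
        o = malloc(sizeof *o);                 /* both lists were empty */
    return o;
}

static void ex_free(struct ex_cache *tc, struct ex_obj *o)
{
    int i;

    assert(o != NULL);
    if (tc == NULL) {                          /* no per-thread cache */
        pthread_mutex_lock(&ex_global_mutex);
        o->next = ex_global_head;
        ex_global_head = o;
        ex_global_count++;
        pthread_mutex_unlock(&ex_global_mutex);
        return;
    }
    o->next = tc->head;
    tc->head = o;
    tc->count++;
    if (tc->count > EX_CACHE_MAX) {            /* over threshold: give some back */
        pthread_mutex_lock(&ex_global_mutex);
        for (i = 0; i < EX_RETURN_BATCH; i++) {
            struct ex_obj *r = tc->head;
            tc->head = r->next;
            tc->count--;
            r->next = ex_global_head;
            ex_global_head = r;
            ex_global_count++;
        }
        pthread_mutex_unlock(&ex_global_mutex);
    }
}
```

The global mutex is taken only when a thread's list is empty, when it overflows, or when there is no per-thread list at all, so a thread that stays within its limit never contends for it.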