Re: write-scaling problems in LMDB
On Mon, Oct 20, 2014 at 1:53 PM, Howard Chu <email@example.com> wrote:
>> then it would be possible to make a direct comparison (against the
>> figures you just sent), against the e.g. 32-threads case. 32 readers,
>> 2 writers. 32 readers, 4 writers. 32 readers, 8 writers and so on.
>> keeping the number of threads (write plus read) at or below the
>> total number of cores avoids any unnecessary context-switching
> We can do that by running two instances of the benchmark program
> concurrently; one doing a read-only job with a fixed number of threads (32)
> and one doing a write-only job with the increasing number of threads.
ohh, ok - great. saves a job doing some programming at least.
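
a rough sketch of how that pairing could be driven (python, stdlib only). `plan_runs`, `run_pair` and the command lines passed to them are made-up names for illustration, not part of any existing benchmark program; the doubling of writers (2, 4, 8, ...) and the cap at the core count just follow the scheme quoted above:

```python
import subprocess


def plan_runs(total_cores, readers=32, first_writers=2):
    """Return (readers, writers) pairs, doubling writers each step.

    Stops once readers + writers would exceed total_cores, so the
    combined thread count never goes above the number of cores.
    """
    runs = []
    writers = first_writers
    while readers + writers <= total_cores:
        runs.append((readers, writers))
        writers *= 2
    return runs


def run_pair(read_cmd, write_cmd):
    """Start a read-only and a write-only job together and wait for both.

    read_cmd and write_cmd are argument lists for whatever benchmark
    binary is in use (hypothetical here). Returns both exit codes.
    """
    reader = subprocess.Popen(read_cmd)
    writer = subprocess.Popen(write_cmd)
    return reader.wait(), writer.wait()
```

on, say, a 64-core box `plan_runs(64)` gives 32 readers against 2, 4, 8, 16 and 32 writers, and each pair would then be handed to `run_pair` with the real benchmark command lines.
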
>> the hypothesis being tested is that the writers performance overall
>> remains the same, as only one may perform writes at a time.
>> i know it sounds silly to do that: it sounds so obvious that yeah it
>> really should not make any difference given that no matter how many
>> writers there are they will always do absolutely nothing (except one
>> of them), and the context switching when one finishes should also be
>> negligible, but i know there's something wrong and i'd like to help
>> find out what it is.
> My experience from benchmarking OpenLDAP over the years is that mutexes
> scale only up to a point. When you have threads grabbing the same mutex from
> across socket boundaries, things go into the toilet. There's no fix for
> this; that's the nature of inter-socket communication.
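
to pin down what the hypothesis actually says, here is a toy model (python, stdlib only): a fixed amount of write work is split across N threads that all take one lock. `run_writers` and its parameters are names invented for this sketch, and a plain `threading.Lock` stands in for the writer mutex; it is not LMDB's locking code and, because of python's GIL, it is no use as a benchmark. it only shows the serialisation: however many writers there are, at most one is ever inside the locked region and the total amount of work done is unchanged.

```python
import threading


def run_writers(num_writers, total_writes=8000):
    """Split total_writes across num_writers threads sharing one lock.

    Returns (writes_done, max_threads_seen_inside_the_lock).
    """
    lock = threading.Lock()  # stand-in for the single writer mutex
    state = {"done": 0, "inside": 0, "max_inside": 0}

    def writer(count):
        for _ in range(count):
            with lock:
                # Everything in here is serialised by the lock.
                state["inside"] += 1
                state["max_inside"] = max(state["max_inside"], state["inside"])
                state["done"] += 1
                state["inside"] -= 1

    # Hand out the work as evenly as possible.
    per_thread, extra = divmod(total_writes, num_writers)
    threads = [
        threading.Thread(target=writer, args=(per_thread + (1 if i < extra else 0),))
        for i in range(num_writers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["done"], state["max_inside"]
```

`run_writers(1)` and `run_writers(8)` both return `(8000, 1)`. what the model cannot show is the cost of handing the lock from one socket to another, which is the part described above.
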
argh. ok. so... actually.... accidentally, the design where i used
a single LMDB (one env) shared amongst (20 to 30) processes using
db_open to create (10 or so) databases would mitigate that...
taking a quick look at mdb.c the mutex lock is done on the env not on
sooo compared to the previous design there would only be a 20/30-to-1
mutex contention whereas previously there were *10 sets* of 20 or 30
to 1 mutexes all competing... and if mutexes use sockets underneath
that would explain why the inter-process communication (which also
used sockets) was so dreadful.
huh, how about that.
do you happen to have access to a straight 8-core SMP system, or is it
relatively easy to turn off the NUMA architecture?