
Re: write-scaling problems in LMDB



On Mon, Oct 20, 2014 at 1:00 PM, Howard Chu <hyc@symas.com> wrote:
> Howard Chu wrote:
>>
>> Luke Kenneth Casson Leighton wrote:
>>>
>>> http://symas.com/mdb/inmem/scaling.html
>>>
>>> can i make the suggestion that, whilst i am aware that it is generally
>>> not recommended for production environments to run more processes than
>>> there are cores, you try running 128, 256 and even 512 processes all
>>> hitting that 64-core system, and monitor its I/O usage (iostats) and
>>> loadavg whilst doing so?
>>>
>>> the hypothesis to test is that the performance, which should scale
>>> reasonably linearly downwards as a ratio of the number of processes to
>>> the number of cores, instead drops like a lead balloon.
>

> Looks to me like the system was reasonably well behaved.

 and it looks like the writer rate is approximately halving with each
doubling of the process count from 64 onwards.

 ok, so that didn't show anything up... but wait... there's only one
writer, right?  the scenario where i am seeing difficulties is when
there are multiple writers and readers (actually, multiple writers and
readers to multiple envs simultaneously).

 so to duplicate that scenario, it would be necessary either to modify
the benchmark to do multiple writer threads (knowing that they are
going to have contention, but that's ok) or, to be closer to the
scenario where i have observed difficulties, to run the test several
times *simultaneously* on the same database.

 *thinks*.... actually in order to ensure that the reads and writes
are approximately balanced, it would likely be necessary to modify the
benchmark code to allow multiple writer threads and distribute the
workload amongst them whilst at the same time keeping the number of
reader threads the same as it was previously.
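
 a rough sketch of the shape such a modified benchmark might take.
this is python, not the actual benchmark code, and it does not touch
LMDB at all: a plain dict stands in for the database and a single
threading.Lock stands in for the one-writer-at-a-time rule.  the names
split_workload and run_benchmark are made up for illustration.

```python
import threading
import time


def split_workload(total_ops, writers):
    """Divide total_ops as evenly as possible among the writer threads."""
    base, extra = divmod(total_ops, writers)
    return [base + (1 if i < extra else 0) for i in range(writers)]


def run_benchmark(readers, writers, total_writes):
    """Run a fixed number of readers against a variable number of writers.

    Assumption: the dict and the lock are stand-ins only; a real test
    would open an LMDB environment and use write transactions instead.
    """
    store = {}                      # stand-in for the database
    write_lock = threading.Lock()   # stand-in for the single-writer lock
    stop = threading.Event()
    read_counts = [0] * readers

    def writer(wid, ops):
        for i in range(ops):
            with write_lock:        # only one writer proceeds at a time
                store[(wid, i)] = i

    def reader(rid):
        while not stop.is_set():
            store.get((0, 0))       # readers never take the write lock
            read_counts[rid] += 1

    reader_threads = [threading.Thread(target=reader, args=(r,))
                      for r in range(readers)]
    writer_threads = [threading.Thread(target=writer, args=(w, ops))
                      for w, ops in enumerate(split_workload(total_writes, writers))]

    start = time.perf_counter()
    for t in reader_threads + writer_threads:
        t.start()
    for t in writer_threads:        # the run ends when all writes are done
        t.join()
    elapsed = time.perf_counter() - start

    stop.set()
    for t in reader_threads:
        t.join()

    return {"writes": len(store), "reads": sum(read_counts), "seconds": elapsed}
```

 the total number of writes stays fixed while the writer count varies,
so the elapsed time of one run can be compared directly against another.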

 then it would be possible to make a direct comparison against the
figures you just sent, e.g. against the 32-threads case: 32 readers,
2 writers; 32 readers, 4 writers; 32 readers, 8 writers; and so on.
keeping the number of threads (write plus read) at or below the
total number of cores avoids any unnecessary context-switching.
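
 for concreteness, the list of runs could be generated like this.  it
assumes the writer count keeps doubling (the "and so on" above) and
uses the 64-core figure from the earlier test; writer_matrix is a
made-up name.

```python
def writer_matrix(readers=32, cores=64, first_writers=2):
    """List (readers, writers) pairs, doubling writers each step.

    Stops before readers + writers would exceed the core count, so that
    no configuration forces extra context-switching.
    """
    configs = []
    writers = first_writers
    while readers + writers <= cores:
        configs.append((readers, writers))
        writers *= 2
    return configs


for readers, writers in writer_matrix():
    print(f"{readers} readers, {writers} writers")
```

 with the defaults that gives 2, 4, 8, 16 and 32 writers, the last one
using exactly 64 threads in total.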

 the hypothesis being tested is that the writers' overall performance
remains the same, as only one may perform writes at a time.

 i know it sounds silly to do that: it sounds so obvious that yeah it
really should not make any difference given that no matter how many
writers there are they will always do absolutely nothing (except one
of them), and the context switching when one finishes should also be
negligible, but i know there's something wrong and i'd like to help
find out what it is.

l.