
Re: LMDB random writes really slow for large data



Chuntao HONG wrote:
I am testing LMDB performance with the benchmark given at http://www.lmdb.tech/bench/ondisk/, and I noticed that LMDB random writes become really slow once the data grows beyond memory.

I am using a machine with 4GB of DRAM and an Intel PCIe SSD. The key size is 10 bytes and the value size is 1KB. The benchmark code is given at http://www.lmdb.tech/bench/ondisk/, and the command line I used is "./db_bench_mdb --benchmarks=fillrandbatch --threads=1 --stats_interval=1024 --num=10000000 --value_size=1000 --use_existing_db=0".

For the first 1GB of data written, the average write rate is 140MB/s. It then drops sharply, averaging 40MB/s over the first 2GB. By the end of the test, after all 10M values have been written, the average rate is just 3MB/s and the instantaneous rate is 1MB/s. I know LMDB is not optimized for writes, but I didn't expect it to be this slow, given that I have a really high-end Intel SSD.

Any flash SSD will get bogged down by a continuous write workload, since it must do wear-leveling and compaction in the background, and under a sustained load "the background" gets too busy to keep up.

I also notice that the way LMDB accesses the SSD is really strange. At the beginning of the test, it writes to the SSD at around 400MB/s and performs no reads, which is expected. But as more and more data is written, LMDB starts to read from the SSD. As time goes on, the read throughput rises while the write throughput drops significantly. At the end of the test, LMDB is constantly reading at around 190MB/s, while occasionally issuing 100MB bursts of writes at roughly 10-20 second intervals.

1. Is it normal for LMDB to have such low write throughput (1MB/s at the end of the test) for data stored on an SSD?

2. Why is LMDB reading more data than it is writing (about 20MB read for every 1MB written) at the end of the test?

To my understanding, although we have more data than the DRAM can hold, the branch pages of the B-tree should still be in DRAM. So for every write, the only pages we need to fetch from the SSD are the leaf pages, and when we write a leaf page we might also need to write its parents. There should therefore be more writes than reads, yet LMDB is reading much more than it is writing. I think this is why it is so slow at the end, but I really cannot understand why.
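As a rough back-of-envelope that supports this premise (assuming LMDB's default 4KB page size; the per-page entry counts below are estimates, not measurements):

    ~1KB values + 10-byte keys -> roughly 3-4 entries per 4KB leaf page
    10M entries                -> roughly 2.5-3.5M leaf pages, i.e. 10GB+ of leaves
    branch entries (~10-byte keys) -> a couple hundred per page
    -> on the order of 10-20K branch pages, well under 100MB total

So the branch level should indeed fit comfortably in 4GB of RAM, and a random write should in principle only need to fault in one leaf page.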

Rerun the benchmark with --readahead=0. The kernel does 16-page readahead by default, and on a random-access workload 15 of those 16 pages are wasted effort. They also cause useful pages to be evicted from RAM. This is where the majority of the excess reads come from.
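If you drive LMDB directly rather than through the benchmark driver, the corresponding knob is the MDB_NORDAHEAD environment flag. A minimal sketch; the database path, map size, and error handling here are illustrative placeholders, not taken from the benchmark code:

    #include <stdio.h>
    #include <lmdb.h>

    int main(void)
    {
        MDB_env *env;
        int rc = mdb_env_create(&env);
        if (rc) {
            fprintf(stderr, "mdb_env_create: %s\n", mdb_strerror(rc));
            return 1;
        }

        /* Map size must cover the whole data set; ~12GB here (assumption). */
        mdb_env_set_mapsize(env, (size_t)12 << 30);

        /* MDB_NORDAHEAD asks the OS not to read ahead on the memory map,
         * so a random access faults in only the page it actually needs. */
        rc = mdb_env_open(env, "./testdb", MDB_NORDAHEAD, 0664);
        if (rc) {
            fprintf(stderr, "mdb_env_open: %s\n", mdb_strerror(rc));
            return 1;
        }

        /* ... run the write workload here ... */

        mdb_env_close(env);
        return 0;
    }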

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/