[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: LMDB vs FastDB



Am 21.03.2013 21:58, schrieb Howard Chu:
Tobias Oberstein wrote:
Hello,

I have read the - very interesting - performance comparison
http://symas.com/mdb/microbench/

I'd like to ask if someone did benchmark LMDB (and/or the others)
against http://www.garret.ru/fastdb.html

FastDB is an in-memory ACID database that works via shadow paging, and
without a transaction log.

OK, like LMDB it uses shadow root pages. I think the similarity ends there.

Ah. Ok.

It is a relational database with an ASCII query language, while LMDB is
strictly a key/value store. That automatically means for simple get/put
operations LMDB will be orders of magnitude faster (just as it is so
much faster than SQLite3 and SQLite4).

Mmh. The overhead of parsing a "SELECT value FROM kv WHERE key = ?" or executing a prepared version of the former versus a direct "kv.get(key)" is there, sure, but _orders_ of magnitude larger?


It makes the flawed assumption that many main-memory databases do, that
all RAM is the same speed. It uses T-trees as its underlying data
structure. LMDB uses B+trees because time and again research shows that
B+trees are more efficient, even when all of the data fits into main
memory. This is the same mistake the MemSQL guys make, as well as MySQL
NDBCluster.

See these papers, for example:
http://resnet.uoregon.edu/~gurney_j/jmpc/skiplist.html
http://www.vldb.org/dblp/db/conf/vldb/RaoR99.html

The reality is that computer architectures are hierarchical in nature. A
CPU has L1, L2, and possibly L3 cache, each of which is progressively
slower than the previous. Then it gets to local RAM, which is slower
still. Then in a multiprocessor machine, there is NUMA, memory residing
on remote nodes, which is again slower still. B+trees are inherently
hierarchical and naturally suited to the reality of system design. All
other approaches, including the recently popular LSM trees (as used in
e.g. Google LevelDB, SQLite4, and various other recent databases) are
all inherently inferior because of this fundamental fact of computer
architecture.

Makes sense .. main-memory access patterns optimized for locality will have an advantage vs patterns that assume O(1) for any (random) pattern.

Btw: could LMDB be used as a backend of sqlite4? I.e. does LMDB support
ordered access?

From the sqlite4 docs:

"SQLite4 needs to be able to seek to the nearest key for a given probe key, then step through keys in lexicographical order in either the ascending or the descending direction. "

.. where ordering is:

"Keys sort in lexicographical order (as if sorted using the memcmp() library function). When one key is a prefix of another, the shorter key occurs first."


FastDB also appears to use locking, while LMDB is MVCC and readers
require no locks, so even with all of the other disadvantages out of the
way, LMDB will scale better across multiple CPUs.

Ah, cool. This is definitely a very good thing: no locks, and writers don't block readers.


FastDB is also written in C++ which is another inherent disadvantage
compared to LMDB which is pure C.

Yes, though I'm a C++ fan, I agree with this point. E.g., here is a nice Python wrapper

https://github.com/dw/py-lmdb/

which interfaces using Cython and wouldn't be possible if LMDB would be C++.


You could adapt the LevelDB microbenchmarks to test it but ultimately I
believe it would be a waste of time.


Thanks for your detailed answer and sharing information! It seems LMDB deserves much more visibility in the community.

- Tobias