Re: LMDB crash consistency, again

Howard Chu wrote:
Hallvard Breien Furuseth wrote:
On 15/11/14 02:57, Howard Chu wrote:
Ah good point. If you check out their slides, #103 of 106 asks the
question; the only failure they found in LMDB occurred on ext3 (and not
on XFS) so we may just chalk this up to a flaw in ext3 instead.

Given that ext3 has already been superseded by ext4, this result of
theirs may not be all that useful in the real world. We have already
recommended against ext3 for performance reasons; perhaps we should
just note this and move on.

No, ext4 breaks too.  Their paper, page 459, section 5.1.2 (LightningDB):

     The fact that the journal commit block (op#402) is
     flushed with the next pwrite64 in the same thread
     means fdatasync on ext3 does not wait for the
     completion of journaling (similar behavior has been
     observed on ext4).

I guess O_DSYNC and fdatasync() should not be the lmdb
defaults yet, at least not on Linux:-(

We need a bug report to Linux or to the distro they were
using, noting that the power faults are simulated.  And is
this O_DSYNC, fdatasync or both?  The problem might be with
only one of them.  I'm not reading the paper in detail yet.
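
For reference, the two mechanisms in question look roughly like this
in plain POSIX C. This is only a sketch to make the distinction
concrete, not LMDB's actual code: the function names
write_then_fdatasync and write_with_odsync are made up for
illustration, and it assumes Linux with glibc.

```c
#define _XOPEN_SOURCE 700   /* expose pwrite, fdatasync, O_DSYNC */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Path A: ordinary write, followed by an explicit fdatasync().
 * Durability depends on fdatasync() really waiting for the device. */
int write_then_fdatasync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    int rc = 0;
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        rc = -1;
    if (rc == 0 && fdatasync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Path B: file opened with O_DSYNC, so each write is supposed to
 * return only once the data is reported to be on stable storage. */
int write_with_odsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0644);
    if (fd < 0)
        return -1;

    int rc = 0;
    if (pwrite(fd, buf, len, 0) != (ssize_t)len)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Running a fault-injection test against each path separately would
show whether only one of them is affected.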

This appears to be quite old news.


It has references going back to at least 2008.

The LKML thread indicates that this bug was already fixed. The Zheng et al. paper says they used RHEL6, which shipped with kernel 2.6.32, so it was apparently too old to have the fix.

All in all, a bunch of bogus reporting: claiming that all DBs are broken when in fact LMDB is perfectly correct.

  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/