Re: MDB v2: Replace meta pages with "meta position" word

Howard Chu writes:
> I don't see this approach reducing seek overhead. It may be able to reduce 
> sync overhead, but only if you accept the possibility of delayed syncs 
> failing. Overall I don't see it as any improvement for ACID compliance.
> What is the real benefit?

I thought I said under results, but maybe I went at it backwards again...

[From the other message]
> Sounds to me like you want to make MDB fully multi-version.

No, I didn't think of that at all.


Default mode offers full ACID if no outside interruptions interfere, so
that can hardly be improved.  MDB with some of the speedup flags does not:

In terms of ACID, this is an ACI-safe variant of using MDB_NOSYNC + some
mdb_env_sync()s, or MDB_MAPASYNC.  That improves sync and seek overhead,
but those flags allow a system crash to corrupt the database.

If you are willing to lose transactions but want ACI, MDB supports
syncing once per txn (MDB_NOMETASYNC) instead of twice, but not fewer
than that.  This suggestion allows fewer.
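
To make the sync counts concrete, here is a toy model in C.  It only
does arithmetic and calls nothing in MDB.  The enum names and the
`group` parameter are hypothetical and exist just for this
illustration; they are not MDB flags or API.

```c
#include <stddef.h>

/* Hypothetical sync policies for illustration; these are not MDB flags. */
enum sync_policy {
    SYNC_DATA_AND_META,  /* default mode: two syncs per commit */
    SYNC_ONCE_PER_TXN,   /* like MDB_NOMETASYNC: one sync per commit */
    SYNC_EVERY_N_TXNS    /* the suggestion: one sync per group of commits */
};

/* Count the syncs issued while committing `txns` write transactions.
 * `group` is only used by SYNC_EVERY_N_TXNS; 0 is treated as 1. */
size_t syncs_needed(enum sync_policy policy, size_t txns, size_t group)
{
    switch (policy) {
    case SYNC_DATA_AND_META:
        return 2 * txns;
    case SYNC_ONCE_PER_TXN:
        return txns;
    case SYNC_EVERY_N_TXNS:
        if (group == 0)
            group = 1;
        /* Round up: a trailing partial group still needs a final sync. */
        return (txns + group - 1) / group;
    }
    return 0;
}
```

Under this model, 100 commits cost 200 syncs in default mode, 100 with
one sync per txn, and 13 when syncing every 8 txns.  The price of the
last variant is that a crash can lose the commits since the last sync.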

Also syncing after unlock might speed up some programs since the next
write txn can start sooner.  That would be for whoever would use
MDB_MAPASYNC if it were ACI-safe - i.e. sync busily but don't wait.

BTW, I wonder if the MDB_MAPASYNC flag is a good idea.  It's almost as
busy as MDB_NOMETASYNC, yet guarantees nothing.  I don't know why
anyone would use it.  It might be more useful as an option to
mdb_env_sync(), which a user of MDB_NOSYNC could call now and then.

One thing about ACID - IIRC there are some potential data races which my
single-word meta info in the header will fix.  E.g. if the user does ^Z
at an unfortunate time, and with WRITEMAP updating the meta page while a
txn is reading it.  I could page back through the IRC discussions to check.
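
A minimal sketch of the single-word idea, using C11 atomics.  The
struct layout, field names and function names are all assumptions made
up for this example and do not match MDB's real meta page.  It also
assumes only one writer runs at a time.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified meta record; not MDB's real layout. */
struct meta {
    uint64_t txnid;
    uint64_t root_page;
};

struct env_header {
    struct meta slots[2];       /* two alternating meta records */
    _Atomic uint32_t meta_pos;  /* the single "meta position" word: 0 or 1 */
};

/* Writer: fill the slot that is not currently published, then publish
 * it with one atomic store.  Assumes a single writer at a time. */
void publish_meta(struct env_header *h, uint64_t txnid, uint64_t root)
{
    uint32_t next =
        1u - atomic_load_explicit(&h->meta_pos, memory_order_relaxed);
    h->slots[next].txnid = txnid;
    h->slots[next].root_page = root;
    /* Release: the slot contents become visible before the new position. */
    atomic_store_explicit(&h->meta_pos, next, memory_order_release);
}

/* Reader: one atomic load selects the slot, then the slot is copied out. */
struct meta read_meta(struct env_header *h)
{
    uint32_t pos = atomic_load_explicit(&h->meta_pos, memory_order_acquire);
    return h->slots[pos];
}
```

The point is that a reader never has to compare two half-updated meta
records to pick the newer one; it reads one word.  With only two slots,
this sketch still leaves a window if a reader is suspended between the
load and the copy while two further commits complete, so a real design
would need more than this.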

> It may be worthwhile. I just want the actual specific advantages spelled out. 
> Other DB systems use delayed/group commit to reduce sync overhead.

Delayed sync, yes.

> It's worth doing, when your application can tolerate that type of
> behavior. But this can't be the default behavior.

Indeed, default mode should be safe.  Or almost-safe - I wonder if
MDB_NOMETASYNC is the best default since it's ACI-safe and will
normally lose nothing.