
Re: MDB v2: Replace meta pages with "meta position" word



Hallvard Breien Furuseth wrote:
I think MDB v2 should move the variable parts of MDB_meta into the
data pages.  The datafile header would retain a word with the position
of the last *synced* MDB_meta, or of the last meta when MDB_NOSYNC.
The lockfile header would hold the position of the *last* MDB_meta.

Sounds to me like you want to make MDB fully multi-version. I don't see any benefit for OpenLDAP/slapd in doing that.

If that's not what you're trying to do, then you need to specify the algorithm for allocating meta pages such that versions don't accumulate endlessly.

All transactions start from the lockfile->metapos commit.  Write txns
do not reuse free pages younger than the datafile->metapos commit.
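
The reuse rule above can be sketched as a predicate. This is only an illustration, not MDB code: page_reusable, freed_txnid and synced_txnid are hypothetical names, and the sketch reads "younger" as "freed by a commit with a higher txnid than the commit at datafile->metapos".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of MDB.
 * freed_txnid:  txnid of the commit that put the page on the free list.
 * synced_txnid: txnid of the meta at datafile->metapos (last synced commit).
 */
static bool page_reusable(uint64_t freed_txnid, uint64_t synced_txnid)
{
    /* A page freed after the last synced commit may still be referenced
     * by that commit, so it has to survive until a later sync. */
    return freed_txnid <= synced_txnid;
}
```

Any restriction imposed by active readers would apply in addition to this check; it is left out of the sketch.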

mdb_env_sync() called by the user does roughly:
   size_t lastpos = lockfile->metapos;
   sync;
# define pos2id(env, pos) ((MDB_meta*)((env)->me_map+(pos)))->mt_txnid
   if (pos2id(env, lastpos) > pos2id(env, datafile->metapos))
     write lastpos to &datafile->metapos;
Called from mdb_txn_commit(), this may need lastpos as an argument.
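
For illustration, the outline above might look roughly like this in self-contained C. Everything here is an assumption made for the sketch: meta_t and env_t are simplified stand-ins rather than the real MDB_meta and MDB_env, flush_data() stands in for whatever msync()/fsync() call the environment would use, and the two metapos fields model the datafile and lockfile header words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for illustration only. */
typedef struct { uint64_t mt_txnid; } meta_t;

typedef struct {
    unsigned char *map;       /* mapped data file */
    size_t map_size;
    size_t datafile_metapos;  /* position of the last synced meta */
    size_t lockfile_metapos;  /* position of the last committed meta */
    int flush_calls;          /* counts calls to the stand-in flush */
} env_t;

/* Read the txnid of the meta stored at byte offset pos in the map. */
static uint64_t pos2id(const env_t *env, size_t pos)
{
    meta_t m;
    assert(pos + sizeof m <= env->map_size);
    memcpy(&m, env->map + pos, sizeof m);
    return m.mt_txnid;
}

/* Stand-in for flushing the data pages to disk. */
static int flush_data(env_t *env)
{
    env->flush_calls++;
    return 0;
}

static int env_sync_sketch(env_t *env)
{
    /* Snapshot the latest commit before flushing, so that only a commit
     * known to be covered by this flush gets published. */
    size_t lastpos = env->lockfile_metapos;
    int rc = flush_data(env);
    if (rc != 0)
        return rc;
    /* Never move the synced position back to an older commit. */
    if (pos2id(env, lastpos) > pos2id(env, env->datafile_metapos))
        env->datafile_metapos = lastpos;
    return 0;
}
```

The sketch ignores locking and does not flush the header word itself, both of which a real implementation would have to handle.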


Results, if I'm keeping this straight:

Setting the latest commit becomes atomic: Just change metapos.
(Field MDB_txninfo.mti_txnid goes away.)

The latest commit is already atomic. mti_txnid is updated atomically in the current code.

No sync issues with copying 'MDB_db's from the meta, since the meta
will not be overwritten during the txn.

There are no sync issues in the current code.

Users can sync infrequently yet preserve consistency, a generalization
of MDB_NOMETASYNC.  An application crash will then lose unsynced
commits, since resetting the lockfile must reset lockfile->metapos.
MDB cannot know if a system crash left those commits unsynced.
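
A minimal sketch of that reset, under the assumption that each header holds just the metapos word: datafile_hdr_t, lockfile_hdr_t and lockfile_reset_sketch are hypothetical names, and the returned flag exists only to make the lost commits visible in the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical header layouts for illustration only. */
typedef struct { size_t metapos; } datafile_hdr_t;  /* last synced meta */
typedef struct { size_t metapos; } lockfile_hdr_t;  /* last committed meta */

/* Reset the lockfile position to the last synced commit.
 * Returns true if the two positions differed, i.e. the reset
 * dropped at least one commit that had not been synced. */
static bool lockfile_reset_sketch(lockfile_hdr_t *lock,
                                  const datafile_hdr_t *data)
{
    bool discarded = (lock->metapos != data->metapos);
    lock->metapos = data->metapos;
    return discarded;
}
```

After the reset, new transactions start from the last synced commit, so anything committed after it is no longer reachable.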

OK, preserving consistency is potentially a win vs what we have now. But it's also more of a crapshoot - you're only providing ACI, not D, and the application won't hear about the loss of D until long after the fact.

Some applications can probably tolerate this. But is it something we want to deal with?

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/