[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: Freeing unused pages between two revisions...

David Barbour wrote:
I wonder if an intermediate variation on this idea might be easier to

We can describe a subset of MDB_RDONLY transactions as 'long running',
e.g. when copying the database. Normal read-only transactions might then
be called 'ephemeral'. We might compute long-running transactions
heuristically, e.g. ((currentTxnID - rdOnlyTxnId) > 100) and/or we might
allow developers to mark them.

Interesting. I think, rather than picking an arbitrary age, we could simply look for the largest gap between two reader txnIDs and work from there. As long as the gap is >2 then there should be some pages to recover.

When deciding which transactions to free, we compute three numbers with
a one-pass scan of the reader lock table:

txnO = the oldest active transaction
txnE = the oldest active ephemeral transaction
txnL = the newest active long-running transaction

Normally, LMDB just computes txnO, and conservatively assumes that
readers might be operating anywhere between txnO and the current
transaction. Emmanuel's proposal for precise analysis is somewhat
daunting. But I think we can use this simple three-point analysis to
find some sweet spots between precise and conservative.

We know by definition that txnO <= txnL, and txnO <= txnE.

Under normal conditions, i.e. assuming that long-running transactions
are relatively infrequent, we also know that txnL < txnE will be true
more frequently than not.  Conservatively, we must protect the ranges
from [txnO,txnL] and [txnE,currentTxnId].  But in the common case, there
will be a non-empty range of transactions (txnL,txnE) that can be GC'd.
And of course anything below txnO can still be GC'd.

I'm not sure whether or not we can easily identify which pages are used
only within that gap, and may thus be collected. I haven't studied that
aspect of LMDB yet. But supposing that aspect is addressed, this scheme
should greatly mitigate the impact of infrequent long-running read-only
transactions. A lot more space from recent write transactions would be

Yes, finding that gap is the tricky part now. I'm guessing we'll need a freeDB format change to have enough info to do it, though maybe not. Currently we record that in txn X, some list of pageIDs was freed. In LMDB 1.0 we will also have per-page txnID stamps, to tell us when a page was written.

Preserving the range from [txnO,txnL] pretty much means we cannot reuse any pages up to txnL's free list. I.e., the page's stamp must be > txnL, it must have been written at a point no txn in [txnO,txnL] could have seen it. In [txnL+1,txnE] we can reuse all the free pages whose stamp is > txnL. Perhaps we don't need a freeDB change.

  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/