I wonder if an intermediate variation on this idea might be easier to implement:
We can describe a subset of MDB_RDONLY transactions as 'long running', e.g. when copying the database. Normal read-only transactions might then be called 'ephemeral'. We might compute long-running transactions heuristically, e.g. ((currentTxnID - rdOnlyTxnId) > 100) and/or we might allow developers to mark them.
When deciding which transactions to free, we compute three numbers with a one-pass scan of the reader lock table:
txnO = the oldest active transaction
txnE = the oldest active ephemeral transaction
txnL = the newest active long-running transaction
Normally, LMDB just computes txnO, and conservatively assumes that readers might be operating anywhere between txnO and the current transaction. Emmanuel's proposal for precise analysis is somewhat daunting. But I think we can use this simple three-point analysis to find some sweet spots between precise and conservative.
We know by definition that txnO <= txnL, and txnO <= txnE.
Under normal conditions, i.e. assuming that long-running transactions are relatively infrequent, we also know that txnL < txnE will be true more frequently than not. Conservatively, we must protect the ranges from [txnO,txnL] and [txnE,currentTxnId]. But in the common case, there will be a non-empty range of transactions (txnL,txnE) that can be GC'd. And of course anything below txnO can still be GC'd.
I'm not sure whether or not we can easily identify which pages are used only within that gap, and may thus be collected. I haven't studied that aspect of LMDB yet. But supposing that aspect is addressed, this scheme should greatly mitigate the impact of infrequent long-running read-only transactions. A lot more space from recent write transactions would be reusable.