[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#8174) mdb_drop(,MAIN_DBI,) leaks pages of named DBs



Full_Name: Hallvard B Furuseth
Version: LMDB_0.9.14
OS: 
URL: 
Submission from: (NULL) (81.191.45.5)
Submitted by: hallvard


mdb_drop(,MAIN_DBI,) leaks pages of named DBs

It does not drop the pages of named DBs, but it does
drop the mainDB pages which refer to these DBs.

3 possible fixes + 1 hybrid (from old IRC chats):

(1) Fail if there are named DBs. The minimal fix.
(2) Reset both metapages. Reject new write txns while old
    readers remain.  Needs MDB_DATA_{VERSION/FORMAT} > 1.
(3) Recursive: Drop named DBs too.
(4) Try (3), but fail when that gets complicated.
    
What to do with open DBIs of named databases:

(A) Fail if there are any.
(B) Close them, like a recursive mdb_drop() would do.
    The user must likely "stop the world" first.
(C) Keep them open but useless, as when another process
    dropped the DB.  All the user can do is close them.

API:

I think all variants but "fail" can be dangerous and/or
possibly surprising, and should all be options - even if
we only implement one choice.  I.e. fail if there are
named DBs/DBIs and no options say what to do about it.

Would that be a new mdb_drop2() function? It's unfortunate
to change the meaning of mdb_drop() param 'del' > 1. OTOH
'del' > 1 has been an error since 0.9.7, so I'm OK with it.

Details:

(2) Reset both metapages:

mdb_drop() merely sets a txn flag so mdb_txn_commit() will
reset the metapages. The flag also prevents most operations,
like MDB_TXN_ERROR does. We can include it in MDB_TXN_BLOCKED
from my "mdb/safer" branch.

An MDB_meta flag then tells mdb_txn_renew0() to re-commit
the metapage if Commit failed to write the 2nd metapage,
and to reject writers when too old readers exist.

mdb_txn_renew0() needs a flags parameter, so it can react
to the new txn flags MDB_NO[META]SYNC.  The caller cannot
set them before mdb_txn_renew0(), since txn == me_txn0.

(3) Recursive drop:

mdb_drop0() must locate and use the up-to-date mt_dbs[]
record when it sees a named DB record in the mainDB.  Either:
- Search mt_dbxs[].  Then drop is O(n*n) unless we make
  mt_dbxs[] a tree/hash first.  OK for most users, but a
  few people have said they use lots of DBs.
- Or mdb_drop() can first save dirty named DBs, like Commit
  does.  Wasteful, but less new code.

To make mdb_drop0() recursive: Factor out the DB-walking code
in mdb_env_cwalk(), at least for non-stale DBIs?
If searching mdb_dbxs[] for DBIs, it can also call mdb_drop().
Except it should not close any DBIs until everything succeeds.

(4) Recursive or fail:

Fail if there are dirty named DB (so mdb_drop0 need not searchA0Amt_dbxs), or
maybe non-DB_STALE ones (so it need not reset DB
records), or if called in a nested txn with DB_NEW DBIs or
stuff like that (because it makes my head hurt to try to
figure it out).  Well, it looks much safer now than before we
had mt_dbiseqs[].  Maybe dirty DBs are the only problem.

(B) Close DBIs:

Simple, but makes drop(MAIN DBI) a "stop the world" operation.

(C) Keep DBIs open:

Reset mt_dbs[] contents and mark the DBIs as DB_STALE, then
attempts to use them will fail.  Probably just a TODO item
until there is a general "drop DB without closing it" option.

Related issues:

There is no way to fully reset the main DB (including flags)
and close it.  I think that should be an option too - someday.
Would need the main DBI to not be so special.  E.g. currently
DB_NEW doesn't apply to it.
Until then, that's just something to think about with the API
for the rest of this stuff.  The cleanest API would be if the
main DBI is just like other DBIs, instead of the default "do
not close it" being different.

IIRC it's possible for the user to overwrite/delete a named
DB as plain data and maybe use plain data as a named DB.  I
have an old branch to catch that, which we were a bit nervious
about but should be fine.