[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#7825) LMDB: misleading error message

On Sat, 2014-03-22 at 22:49 +0000, timur.kristof@gmail.com wrote:
> When you operate on a dbi using a cursor and accidentally close the dbi before
> committing the transaction of the cursor, mdb_txn_commit will return
> MDB_BAD_VALSIZE. This is quite misleading because the problem doesn't actually
> have anything to do with what the error message suggests. ("Too big key/data,
> key is empty, or wrong DUPFIXED size")
> I suggest to either add a new kind of error code for this situation or to extend
> the description of MDB_BAD_VALSIZE.

Let's see -- here are the options I can see (after an IRC chat):

1) Extending the MDB_BAD_VALSIZE description. Simple. However:

This is documented as a user error, and it's one which lmdb
won't always catch.  If you also called mdb_dbi_open() and the
DBI got reused, lmdb silently updates the wrong named database.

So I'm a bit wary about both of these fixes: Documenting this
case can be misleading if it makes users expect it to be caught.

Unless people know better - it corresponds to EBADF for using a
closed file descriptor, not caught if open() reused it.

2) Catch this case and instead return a generic "something is
And maybe assert(did not happen) to teach users to avoid this.

Needs to be done where md_name is used: Commit/drop(named DB),
page_search(stale named DB), cursor_touch(clean named DB).  Or
pass F_SUBDATA inwards and tweak the "return MDB_BAD_VALSIZE"
statements to detect this.

After playing around a bit, I guess this is my favorite.

3) Leave the misleading message alone, which can send the user
off to a wild goose chase (as described on IRC).

Or we can add code to also catch close followed by open:

4) Put hash(DB name) in MDB_db.md_pad and verify it before
overwriting a named DB.  Costs execution time even when there's
no error.  Must be done in commit, drop and maybe cursor_touch.
(I've pushed a demo patch for just commit.)

5) Add field MDB_dbx.md_dbiseq = DBI usage sequence number,
incremented when dbi_open creates (not reuses) a DBI and in
dbi_close.  Copy it to a new malloced array txn->mt_dbiseqs[]
in write txns, verify it in at least commit and drop.
This means close()-open(same DB) will also be caught, which
seems a good thing since with (4) it will work unreliably:
Only if open() reuses the same DBI for the same DB.