Re: (ITS#8062) Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()

g@wenarab.com wrote:
> Full_Name: Gawen Arab
> Version: 0.9.14
> URL: http://gawen.me/pub/lmdb_crash/gawen-150223.tar.xz
> Submission from: (NULL) (
> Hello,
> I'm currently developing a virtual filesystem whose metadata are stored in an
> LMDB 0.9.14 database. This software currently runs on Android, iOS, Linux,
> OSX and Windows.
> This software is currently used by 50 users and I received 11 crash reports of
> SIGSEGV raised in LMDB. All reports come from OSX computers.

Thanks for the report. Unfortunately, it appears the DB was constructed incorrectly at some time in the past, so even updating to later revisions of the code won't avoid the crash. It's more important now to find the sequence of steps that recreates the invalid DB in the first place, or verify that the same sequence doesn't crash using the 0.9.15 release candidate.

> mdb.c:5382: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
> This comes from this assertion
> https://gitorious.org/mdb/mdb/source/2f587ae081d076e3707360c5db086520c219d3ea:libraries/liblmdb/mdb.c#L5382
> This happens when the software iterates over keys in the database, the same way
> mdb_dump does, calling mdb_cursor_get(..., MDB_NEXT) in a loop.
> I managed to recover the LMDB db folder from one user who runs 64-bit OSX.
> I ran mdb_dump on it on my 64-bit Linux machine. mdb_dump crashes the same
> way. I tried with the latest mdb_dump from the mdb.master branch (commit
> 3368d1f5e243225cba4d730fba19ff600798ebe3), but I get the same assertion:
> mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
> I attached to this issue the faulty database 'db/'.
> The output of mdb_dump on my machine is attached to this issue as the file
> 'mdb_dump.log'. I reran mdb_dump with MDB_DEBUG defined to 2 and mdb_debug
> true. I attached the output of this new dump as the file 'mdb_dump.debug.log'.
> The interesting part:
> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
> mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 48
> b5f1ff0700000000
> 0700000000000000bc10000000000000000e000000000000860d000000000000e308000000000000ee07000000000000a5070000000000000307000000000000
> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
> mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 49
> b6f1ff0700000000
> 0700000000000000820e000000000000310e000000000000770d000000000000660c000000000000cb0a0000000000002a080000000000000008000000000000
> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
> mdb_cursor_next:5579 =====> move to next sibling page
> mdb_cursor_pop:5095 popped page 1946 off db 1 cursor 0x1897350
> mdb_cursor_sibling:5503 parent page is page 4643, index 16
> mdb_cursor_sibling:5521 just moving to right index key 17
> mdb_cursor_push:5104 pushing page 704 on db 1 cursor 0x1897350
> mdb_cursor_next:5585 next page is 704, key index 0
> mdb_cursor_next:5591 ==> cursor points to page 704 with 7 keys, key index 0
> mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
> I do not know much about internal LMDB design for now, so I'm struggling to
> understand the debug lines...
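
Reading the debug lines above: the cursor reaches the last key of page 1946 (index 49 of 50), pops that page, steps from index 16 to index 17 in parent page 4643, and pushes page 704. Page 704 is not flagged as a leaf, so the IS_LEAF check fails. The toy model below only illustrates that invariant. It is not LMDB's code or page layout; the struct and function names (toy_page, toy_next_sibling, toy_is_leaf) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page kinds. TOY_OTHER stands for "anything that is not a leaf". */
enum toy_kind { TOY_BRANCH, TOY_LEAF, TOY_OTHER };

/* Toy page: NOT LMDB's on-disk layout, just enough to show the invariant. */
struct toy_page {
    unsigned pgno;               /* page number, as printed in the log */
    enum toy_kind kind;
    size_t nkeys;                /* number of keys (children, for a branch) */
    struct toy_page **children;  /* used by branch pages only */
};

/* Step from the child at parent->children[*idx] to its right sibling.
 * Returns NULL when the parent has no further child; a real tree walk
 * would then climb one more level, which this sketch leaves out. */
struct toy_page *toy_next_sibling(const struct toy_page *parent, size_t *idx)
{
    if (*idx + 1 >= parent->nkeys)
        return NULL;
    ++*idx;
    return parent->children[*idx];
}

/* The condition the failing assertion checks: after moving to a sibling
 * at the bottom of the cursor stack, the page must be a leaf. */
int toy_is_leaf(const struct toy_page *p)
{
    return p != NULL && p->kind == TOY_LEAF;
}
```

With a parent whose next slot refers to a non-leaf page, toy_next_sibling() returns that page and toy_is_leaf() reports 0 for it, which corresponds to the point where mdb_dump aborts on the attached DB.
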
> Unfortunately, I do not know how to reproduce this bug, but it is a recurring
> one.
> Maybe the following information can help you,
> - My software opens the database in MDB_NOSYNC mode, and performs a
>    mdb_env_sync(env, 1) every second or less. An mdb_env_sync can happen at the
>    same time as any other operation (e.g. mdb_txn_commit). I didn't add any
>    locking mechanism. Could this explain such a situation?
> - mapsize is currently 512MiB. Previous versions of the software set it to
>    smaller values (the database has not been rebuilt since).
> I'd be happy to provide more information if you need.
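
On the mdb_env_sync question: whether the unserialized sync calls are related to the bad DB is not established here. If you want to rule them out while searching for a reproducing sequence, one option is to serialize the periodic sync against your commits with an application-level mutex. A minimal sketch, assuming POSIX threads: db_lock, run_serialized and example_op are hypothetical names, and example_op is only a placeholder for your own wrappers around mdb_txn_commit() and mdb_env_sync(env, 1), so the snippet does not depend on LMDB itself.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical application-level lock; not part of LMDB. */
static pthread_mutex_t db_lock = PTHREAD_MUTEX_INITIALIZER;

/* One database operation, e.g. a wrapper around a commit or a sync. */
typedef int (*db_op)(void *ctx);

/* Run one operation while holding the lock, so that a commit and the
 * periodic sync can never execute at the same time. Returns whatever
 * the operation returned. */
int run_serialized(db_op op, void *ctx)
{
    int rc;

    pthread_mutex_lock(&db_lock);
    rc = op(ctx);
    pthread_mutex_unlock(&db_lock);
    return rc;
}

/* Placeholder operation: counts calls through ctx and reports success.
 * Replace it with wrappers around your real commit and sync calls. */
int example_op(void *ctx)
{
    int *calls = ctx;

    ++*calls;
    return 0;
}
```

Both the committing threads and the timer thread would go through run_serialized(). If the invalid DB still appears with this in place, the concurrent sync can be crossed off the list.
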
> Thank you for your support and this amazing IP.
> Regards

