
(ITS#8062) Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()



Full_Name: Gawen Arab
Version: 0.9.14
OS: OSX
URL: http://gawen.me/pub/lmdb_crash/gawen-150223.tar.xz
Submission from: (NULL) (77.151.59.7)


Hello,

I'm currently developing a virtual filesystem whose metadata is stored in an
LMDB 0.9.14 database. The software currently runs on Android, iOS, Linux,
OSX and Windows.

The software currently has 50 users, and I have received 11 crash reports of a
SIGSEGV raised in LMDB. All reports come from OSX computers.

mdb.c:5382: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()

This comes from this assertion:
https://gitorious.org/mdb/mdb/source/2f587ae081d076e3707360c5db086520c219d3ea:libraries/liblmdb/mdb.c#L5382

This happens when the software iterates over the keys in the database the same
way mdb_dump does, by calling mdb_cursor_get(..., MDB_NEXT) in a loop.

I managed to recover the LMDB db folder from one user running 64-bit OSX.
I ran mdb_dump on it on my 64-bit Linux machine, and mdb_dump crashes the same
way. I also tried the latest mdb_dump from the mdb.master branch (commit
3368d1f5e243225cba4d730fba19ff600798ebe3), but I get the same assertion:

mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()

I attached to this issue the faulty database 'db/'.

The output of mdb_dump on my machine is attached to this issue as the file
'mdb_dump.log'. I then re-ran mdb_dump with MDB_DEBUG defined to 2 and
mdb_debug set to true, and attached the output of this new dump as the file
'mdb_dump.debug.log'. The interesting part:

mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 48
b5f1ff0700000000
0700000000000000bc10000000000000000e000000000000860d000000000000e308000000000000ee07000000000000a5070000000000000307000000000000
mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 49
b6f1ff0700000000
0700000000000000820e000000000000310e000000000000770d000000000000660c000000000000cb0a0000000000002a080000000000000008000000000000
mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
mdb_cursor_next:5579 =====> move to next sibling page
mdb_cursor_pop:5095 popped page 1946 off db 1 cursor 0x1897350
mdb_cursor_sibling:5503 parent page is page 4643, index 16
mdb_cursor_sibling:5521 just moving to right index key 17
mdb_cursor_push:5104 pushing page 704 on db 1 cursor 0x1897350
mdb_cursor_next:5585 next page is 704, key index 0
mdb_cursor_next:5591 ==> cursor points to page 704 with 7 keys, key index 0
mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
I do not know much about LMDB's internal design yet, so I'm struggling to
understand the debug lines...
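
For readers following the trace, here is a minimal C sketch of what the last
debug lines appear to describe: the cursor pops leaf page 1946, advances the
parent (page 4643) from index 16 to 17, pushes the child found there (page
704), and then checks that it is a leaf. This is a toy model, not LMDB's code:
all toy_* names are hypothetical, the parent's key count of 18 is assumed (the
log only shows indices 16 and 17), and the log does not say what kind of page
704 actually is, only that it is not a leaf.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a B-tree page and a cursor stack (all names hypothetical). */
enum { TOY_BRANCH = 1, TOY_LEAF = 2 };
enum { TOY_OK = 0, TOY_NOTFOUND = -1, TOY_NOT_LEAF = -2 };
#define TOY_MAX_DEPTH 8

typedef struct toy_page {
    unsigned pgno;
    int kind;                    /* TOY_BRANCH or TOY_LEAF */
    unsigned nkeys;
    struct toy_page **children;  /* nkeys entries on a branch, NULL on a leaf */
} toy_page;

typedef struct {
    toy_page *pg[TOY_MAX_DEPTH]; /* pages from the root down to the current page */
    unsigned ki[TOY_MAX_DEPTH];  /* key index within each of those pages */
    int top;                     /* index of the deepest entry */
} toy_cursor;

/* Move the cursor to the right sibling of its current page. */
static int toy_sibling(toy_cursor *c)
{
    if (c->top < 1)
        return TOY_NOTFOUND;            /* the root has no sibling */
    c->top--;                           /* "popped page ... off" */
    if (c->ki[c->top] + 1 >= c->pg[c->top]->nkeys) {
        /* Parent is exhausted too: move the parent to its own sibling. */
        if (toy_sibling(c) != TOY_OK) {
            c->top++;                   /* restore the stack and give up */
            return TOY_NOTFOUND;
        }
    } else {
        c->ki[c->top]++;                /* "just moving to right index" */
    }
    toy_page *parent = c->pg[c->top];
    c->top++;                           /* "pushing page ..." */
    c->pg[c->top] = parent->children[c->ki[c->top - 1]];
    c->ki[c->top] = 0;
    return TOY_OK;
}

/* Advance by one key. Returns TOY_NOT_LEAF where LMDB would assert. */
static int toy_next(toy_cursor *c)
{
    assert(c->top >= 0 && c->top < TOY_MAX_DEPTH);
    if (c->ki[c->top] + 1 >= c->pg[c->top]->nkeys) {
        if (toy_sibling(c) != TOY_OK)
            return TOY_NOTFOUND;
    } else {
        c->ki[c->top]++;
    }
    /* The page at the bottom of the stack is expected to be a leaf. */
    return c->pg[c->top]->kind == TOY_LEAF ? TOY_OK : TOY_NOT_LEAF;
}

/* Rebuild the situation from the log: cursor on page 1946, key index 49. */
int toy_replay_trace(int kind_of_704)
{
    toy_page leaf_1946 = { 1946, TOY_LEAF, 50, NULL };
    toy_page page_704  = { 704, kind_of_704, 7, NULL };
    toy_page *kids[18] = { NULL };      /* 18 entries is an assumption */
    kids[16] = &leaf_1946;
    kids[17] = &page_704;
    toy_page parent_4643 = { 4643, TOY_BRANCH, 18, kids };

    toy_cursor c = { { &parent_4643, &leaf_1946 }, { 16, 49 }, 1 };
    return toy_next(&c);
}
```

In this model, toy_replay_trace(TOY_BRANCH) returns TOY_NOT_LEAF, which is the
point where LMDB asserts instead, while toy_replay_trace(TOY_LEAF) returns
TOY_OK. In other words, the trace suggests that the parent's entry at index 17
refers to a page that is not the leaf the cursor expects at that depth.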

Unfortunately, I do not know how to reproduce this bug, but it is a recurring
one.

Maybe the following information can help:

- My software opens the database in MDB_NOSYNC mode, and performs a
  mdb_env_sync(env, 1) every second or less. A mdb_env_sync can happen at the
  same time as any other operation (e.g. mdb_txn_commit). I didn't put any
  locking mechanism in place. Could this explain such a situation?

- mapsize is currently 512MiB. Previous versions of the software set it to
  smaller values (the database has not been rebuilt since).
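
To check whether the unsynchronized mdb_env_sync calls play a role, one option
is to route every commit and every sync through a single mutex. The sketch
below assumes POSIX threads; run_serialized and fake_db_op are hypothetical
names, and fake_db_op merely stands in for the real mdb_txn_commit or
mdb_env_sync call. Whether LMDB actually requires this locking is not
established here; the sketch only shows how the overlap could be ruled out.

```c
#include <assert.h>
#include <pthread.h>

/* Signature for one database operation (hypothetical wrapper type). */
typedef int (*db_op_fn)(void *arg);

/* One process-wide lock shared by the committing code and the sync timer. */
static pthread_mutex_t db_write_lock = PTHREAD_MUTEX_INITIALIZER;

/* Run a single commit or sync while holding the lock, so the two never overlap. */
int run_serialized(db_op_fn op, void *arg)
{
    int lock_rc = pthread_mutex_lock(&db_write_lock);
    if (lock_rc != 0)
        return lock_rc;                 /* could not take the lock */
    int rc = op(arg);
    pthread_mutex_unlock(&db_write_lock);
    return rc;
}

/* Stand-in for a real LMDB call; it counts calls and checks for overlap. */
static int ops_in_flight = 0;

int fake_db_op(void *arg)
{
    ops_in_flight++;
    assert(ops_in_flight == 1);         /* would fire if two operations overlapped */
    (*(int *)arg)++;                    /* record that the operation ran */
    ops_in_flight--;
    return 0;
}
```

In the real program, the one-second sync timer and every committing thread
would each pass their own small wrapper around mdb_env_sync or mdb_txn_commit
to run_serialized.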

I'd be happy to provide more information if you need it.

Thank you for your support and this amazing IP.

Regards