Full_Name: Steven Lang
Version: LMDB 0.9.16
OS: Ubuntu 14.04
URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
Submission from: (NULL) (199.255.44.5)


While doing some testing using randomized key and data lengths and random puts
and deletes, I found a situation in which mdb_cursor_del(...) followed by
mdb_cursor_get(..., MDB_NEXT) causes an error:

mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
Aborted

After a little debugging I was able to isolate what the cause was and produce a
simple program to reproduce the issue; during the re-balance following a delete,
it would try to move a node from the first child of the root page to the second.
If the node it moves has a key longer than the second node in the root page, and
the new key doesn't fit, it will split the root page. As a result, by deleting a
node, the tree becomes deeper.

However, the cursor only checks and compensates for the tree becoming shallower.
So now the cursor was pointing to a branch page rather than a leaf page. Any
further cursor operations fail. (Trying to do a MDB_SET, MDB_SET_RANGE, etc will
silently fail at this point, while MDB_NEXT and possibly MDB_PREV will assert.)

The latest head with the fix for ITS#8221 changes the behavior slightly, due to
the different arrangement of the tree with the less aggressive merging. However,
it still fails.

Attached is a program which fills a DB, then deletes keys until it asserts.
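The failure mode described above can be illustrated with a toy model. This is a minimal sketch and not LMDB's code: the Page and Cursor types, the field names, and every function below are hypothetical, and the only assumption taken from the report is that a root split adds a level above the cursor's page stack while the cursor's own position is resynchronized only when the tree gets shallower.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEPTH 8

/* Toy page: only the leaf/branch distinction matters here. */
typedef struct { int is_leaf; } Page;

/* Toy cursor (hypothetical layout, not LMDB's MDB_cursor). */
typedef struct {
    const Page *pg[MAX_DEPTH]; /* page stack, pg[0] is the root */
    int snum;                  /* number of pages on the stack */
    int top;                   /* index the cursor reads from */
} Cursor;

/* Model a root split: every page moves one level down and the new root
 * is placed at index 0. The cursor's top index is left to the caller. */
static void push_new_root(Cursor *c, const Page *new_root) {
    assert(c->snum < MAX_DEPTH);
    memmove(&c->pg[1], &c->pg[0], (size_t)c->snum * sizeof c->pg[0]);
    c->pg[0] = new_root;
    c->snum++;
}

/* Resync that only compensates for a tree that became shallower. */
static void resync_shallower_only(Cursor *c, int old_snum) {
    if (c->snum < old_snum)
        c->top = c->snum - 1;
}

/* Resync that compensates for any change in depth. */
static void resync_any_change(Cursor *c, int old_snum) {
    if (c->snum != old_snum)
        c->top = c->snum - 1;
}

/* Returns 1 if the cursor still reads from a leaf after a root split. */
static int leaf_after_root_split(int handle_deeper) {
    static const Page root = {0}, new_root = {0}, leaf = {1};
    Cursor c = { { &root, &leaf }, 2, 1 }; /* depth 2, positioned on the leaf */
    int old_snum = c.snum;

    push_new_root(&c, &new_root);          /* the "delete" made the tree deeper */
    if (handle_deeper)
        resync_any_change(&c, old_snum);
    else
        resync_shallower_only(&c, old_snum);
    return c.pg[c.top]->is_leaf;
}
```

In this model, leaf_after_root_split(0) leaves the cursor on the old root, which is a branch page, which is the condition the IS_LEAF assertion rejects. leaf_after_root_split(1) leaves it on the leaf.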
steven.lang@hgst.com wrote:
> Full_Name: Steven Lang
> Version: LMDB 0.9.16
> OS: Ubuntu 14.04
> URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
> Submission from: (NULL) (199.255.44.5)
>
>
> While doing some testing using randomized key and data lengths and random puts
> and deletes, I found a situation in which mdb_cursor_del(...) followed by
> mdb_cursor_get(..., MDB_NEXT) causes an error:
>
> mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
> Aborted
>
> After a little debugging I was able to isolate what the cause was and produce a
> simple program to reproduce the issue; during the re-balance following a delete,
> it would try to move a node from the first child of the root page to the second.
> If the node it moves has a key longer than the second node in the root page, and
> the new key doesn't fit, it will split the root page. As a result, by deleting a
> node, the tree becomes deeper.
>
> However, the cursor only checks and compensates for the tree becoming shallower.
> So now the cursor was pointing to a branch page rather than a leaf page. Any
> further cursor operations fail. (Trying to do a MDB_SET, MDB_SET_RANGE, etc will
> silently fail at this point, while MDB_NEXT and possibly MDB_PREV will assert.)
>
> The latest head with the fix for ITS#8221 changes the behavior slightly, due to
> the different arrangement of the tree with the less aggressive merging. However,
> it still fails.
>
> Attached is a program which fills a DB, then deletes keys until it asserts.

Thanks for the report. Our alternate approach to solve #8221 was to check the
page fill factor including the (usually omitted) key #0 size; we didn't go
that way because it was slightly more processing. But given this bug, it seems
that's the only way to go.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
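As a rough illustration of what counting the key #0 size in the fill factor could mean, here is a small sketch. It is not taken from mdb.c: the BranchPage struct, its fields, and both helper functions are hypothetical, and the per-mille scale is just one convenient way to express a fill ratio in integers.

```c
#include <stddef.h>

/* Hypothetical summary of a branch page for this sketch. */
typedef struct {
    size_t usable;    /* bytes available for nodes on the page */
    size_t used;      /* bytes occupied now, with key #0 stored as empty */
    size_t key0_size; /* bytes key #0 would need if stored explicitly */
} BranchPage;

/* Fill ratio in parts per thousand, ignoring the omitted key #0. */
static unsigned fill_without_key0(const BranchPage *p) {
    return (unsigned)(p->used * 1000 / p->usable);
}

/* Fill ratio in parts per thousand, charging the page for key #0 too. */
static unsigned fill_with_key0(const BranchPage *p) {
    return (unsigned)((p->used + p->key0_size) * 1000 / p->usable);
}
```

With usable = 4000, used = 900 and key0_size = 200, the first function gives 225 and the second 275. A rebalance decision made against a threshold between those two values would come out differently depending on which figure is used.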
hyc@symas.com wrote:
> steven.lang@hgst.com wrote:
>> Full_Name: Steven Lang
>> Version: LMDB 0.9.16
>> OS: Ubuntu 14.04
>> URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
>> Submission from: (NULL) (199.255.44.5)
>>
>>
>> While doing some testing using randomized key and data lengths and random puts
>> and deletes, I found a situation in which mdb_cursor_del(...) followed by
>> mdb_cursor_get(..., MDB_NEXT) causes an error:
>>
>> mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>> Aborted

>> Attached is a program which fills a DB, then deletes keys until it asserts.
>
> Thanks for the report. Our alternate approach to solve #8221 was to check the
> page fill factor including the (usually omitted) key #0 size; we didn't go
> that way because it was slightly more processing. But given this bug, it seems
> that's the only way to go.
>
Fixed now in git, thanks.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
fixed in mdb.master, mdb.RE/0.9
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
changed state Test to Closed