Issue 8258 - LMDB cursor_del followed by MDB_NEXT can trigger assert crash
Summary: LMDB cursor_del followed by MDB_NEXT can trigger assert crash
Status: VERIFIED FIXED
Alias: None
Product: LMDB
Classification: Unclassified
Component: liblmdb (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-29 23:17 UTC by steven.lang@hgst.com
Modified: 2020-03-12 15:55 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description steven.lang@hgst.com 2015-09-29 23:17:28 UTC
Full_Name: Steven Lang
Version: LMDB 0.9.16
OS: Ubuntu 14.04
URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
Submission from: (NULL) (199.255.44.5)


While doing some testing using randomized key and data lengths and random puts
and deletes, I found a situation in which mdb_cursor_del(...) followed by
mdb_cursor_get(..., MDB_NEXT) causes an error:

mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
Aborted

After a little debugging I was able to isolate what the cause was and produce a
simple program to reproduce the issue; during the re-balance following a delete,
it would try to move a node from the first child of the root page to the second.
If the node it moves has a key longer than the second node in the root page, and
the new key doesn't fit, it will split the root page. As a result, by deleting a
node, the tree becomes deeper.

However, the cursor only checks and compensates for the tree becoming shallower.
So now the cursor was pointing to a branch page rather than a leaf page. Any
further cursor operations fail. (Trying to do a MDB_SET, MDB_SET_RANGE, etc will
silently fail at this point, while MDB_NEXT and possibly MDB_PREV will assert.)

The latest head with the fix for ITS#8221 changes the behavior slightly, due to
the different arrangement of the tree with the less aggressive merging. However,
it still fails.

Attached is a program which fills a DB, then deletes keys until it asserts.
Comment 1 Howard Chu 2015-09-29 23:29:03 UTC
steven.lang@hgst.com wrote:
> Full_Name: Steven Lang
> Version: LMDB 0.9.16
> OS: Ubuntu 14.04
> URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
> Submission from: (NULL) (199.255.44.5)
>
>
> While doing some testing using randomized key and data lengths and random puts
> and deletes, I found a situation in which mdb_cursor_del(...) followed by
> mdb_cursor_get(..., MDB_NEXT) causes an error:
>
> mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
> Aborted
>
> After a little debugging I was able to isolate what the cause was and produce a
> simple program to reproduce the issue; during the re-balance following a delete,
> it would try to move a node from the first child of the root page to the second.
> If the node it moves has a key longer than the second node in the root page, and
> the new key doesn't fit, it will split the root page. As a result, by deleting a
> node, the tree becomes deeper.
>
> However, the cursor only checks and compensates for the tree becoming shallower.
> So now the cursor was pointing to a branch page rather than a leaf page. Any
> further cursor operations fail. (Trying to do a MDB_SET, MDB_SET_RANGE, etc will
> silently fail at this point, while MDB_NEXT and possibly MDB_PREV will assert.)
>
> The latest head with the fix for ITS#8221 changes the behavior slightly, due to
> the different arrangement of the tree with the less aggressive merging. However,
> it still fails.
>
> Attached is a program which fills a DB, then deletes keys until it asserts.

Thanks for the report. Our alternate approach to solve #8221 was to check the 
page fill factor including the (usually omitted) key #0 size; we didn't go 
that way because it was slightly more processing. But given this bug, it seems 
that's the only way to go.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Howard Chu 2015-10-04 01:15:51 UTC
hyc@symas.com wrote:
> steven.lang@hgst.com wrote:
>> Full_Name: Steven Lang
>> Version: LMDB 0.9.16
>> OS: Ubuntu 14.04
>> URL: ftp://ftp.openldap.org/incoming/steven-lang-150929.c
>> Submission from: (NULL) (199.255.44.5)
>>
>>
>> While doing some testing using randomized key and data lengths and random puts
>> and deletes, I found a situation in which mdb_cursor_del(...) followed by
>> mdb_cursor_get(..., MDB_NEXT) causes an error:
>>
>> mdb.c:5726: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>> Aborted

>> Attached is a program which fills a DB, then deletes keys until it asserts.
>
> Thanks for the report. Our alternate approach to solve #8221 was to check the
> page fill factor including the (usually omitted) key #0 size; we didn't go
> that way because it was slightly more processing. But given this bug, it seems
> that's the only way to go.
>
Fixed now in git, thanks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 3 OpenLDAP project 2015-10-04 01:16:27 UTC
fixed in mdb.master, mdb.RE/0.9
Comment 4 Howard Chu 2015-10-04 01:16:27 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 5 Quanah Gibson-Mount 2015-11-30 18:28:02 UTC
changed state Test to Closed