Issue 8264 - LMDB mdb_del in a cursor causes data loss in certain situations
Summary: LMDB mdb_del in a cursor causes data loss in certain situations
Status: VERIFIED FIXED
Alias: None
Product: LMDB
Classification: Unclassified
Component: liblmdb
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-06 00:03 UTC by malyn@strangegizmo.com
Modified: 2020-03-12 15:55 UTC

Description malyn@strangegizmo.com 2015-10-06 00:03:40 UTC
Full_Name: Michael Alyn Miller
Version: LMDB 0.9.14, Git head
OS: Windows 8.1 x64, NixOS 14.12 x64
URL: ftp://ftp.openldap.org/incoming/michael-alyn-miller-151005.c
Submission from: (NULL) (96.251.78.237)


I have a helper library that sits on top of LMDB, and testing some of the
abstractions in that library has exposed a bug in LMDB related to deleting items
during a cursor scan.  I have reduced the repro to a test case that operates
directly against LMDB.

The test in question adds three groups of keys to the database (each with a
different prefix) and then deletes the third group of keys.  The deletion is
done by scanning all of the keys with that prefix using a cursor. 
mdb_cursor_del works correctly and deletes only the targeted keys, whereas
mdb_del causes nine extra keys (not in that group/prefix) to be deleted.
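
The prefix scan described above depends on a byte-wise prefix check on each key
the cursor returns.  A minimal sketch of that check in plain C follows.  The
names key_has_prefix, demo_key, and count_group are hypothetical (they are not
part of LMDB or of the attached test), and the sorted array only stands in for
the database.  In real LMDB code the same check would typically sit in a loop
that starts with mdb_cursor_get(..., MDB_SET_RANGE) and advances with MDB_NEXT.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if `key` starts with `prefix`. */
int key_has_prefix(const void *key, size_t key_len,
                   const void *prefix, size_t prefix_len)
{
    return key_len >= prefix_len && memcmp(key, prefix, prefix_len) == 0;
}

/* Stand-in for one stored key; a sorted array of these plays the DB. */
struct demo_key {
    const unsigned char *data;
    size_t len;
};

/* Count the keys carrying `prefix`.  Because the array is sorted, all
 * matching keys are adjacent, so the scan can stop at the first
 * non-matching key after the group. */
size_t count_group(const struct demo_key *keys, size_t n,
                   const void *prefix, size_t prefix_len)
{
    size_t i = 0, count = 0;

    assert(keys != NULL || n == 0);
    /* Linear stand-in for a range seek to the first key of the group. */
    while (i < n &&
           !key_has_prefix(keys[i].data, keys[i].len, prefix, prefix_len))
        i++;
    /* Walk forward while the prefix still matches. */
    while (i < n &&
           key_has_prefix(keys[i].data, keys[i].len, prefix, prefix_len)) {
        count++;
        i++;
    }
    return count;
}
```

Counting the group before and after a deletion pass is one way to confirm that
only the targeted prefix was affected.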

Note that most of my test runs (with different test data) do not exhibit this
behavior: the targeted keys, and only the targeted keys, are deleted as
expected.  Something about these specific keys and values triggers the bug.  I
assume that other keys will also trigger the bug, but these are the ones that I
have converted into an LMDB-based test.

The referenced test runs twice: once with mdb_cursor_del and once with mdb_del.
You can mdb_dump the data before and after each deletion to see the extra keys
that are being removed.  The incorrectly deleted keys are all consecutive and
begin with 02000000000000003a: they are the nine keys starting at 02...3aa4...
(inclusive).  Only keys beginning with 03000000000000003a should have been
deleted, so the incorrectly deleted keys are the nine immediately preceding the
targeted keys.

Is it valid to delete keys with mdb_del during a cursor scan?  If not, then
ignore this bug report and I will modify my helper library.  That said, I
expose more than just delete to my callers during cursor scans (they can also
mdb_put new items during a scan), which is why my library uses mdb_del and not
mdb_cursor_del in this scenario.
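
One defensive pattern for deleting by key during a scan is to copy the key,
delete it, and then re-seek from the copied key instead of reusing the old
position.  The sketch below models that pattern on a small in-memory sorted
array.  It is a toy stand-in, not LMDB's implementation and not the attached
test; toy_store, toy_seek, toy_del, and delete_group are hypothetical names.
In LMDB terms the analogous calls would be mdb_del followed by mdb_cursor_get
with MDB_SET_RANGE.  Whether re-seeking would have avoided the behavior
reported here is not established by this report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_KEYS 64
#define TOY_KEY_LEN  4   /* fixed-size keys keep the toy model short */

/* Toy stand-in for a sorted key store (NOT LMDB's B+tree). */
struct toy_store {
    unsigned char keys[TOY_MAX_KEYS][TOY_KEY_LEN];
    size_t count;
};

/* Index of the first key >= `key` (the role MDB_SET_RANGE plays in LMDB). */
size_t toy_seek(const struct toy_store *s, const unsigned char *key)
{
    size_t i = 0;

    while (i < s->count && memcmp(s->keys[i], key, TOY_KEY_LEN) < 0)
        i++;
    return i;
}

/* Delete by key; returns 1 if a key was removed, 0 if it was absent. */
int toy_del(struct toy_store *s, const unsigned char *key)
{
    size_t i;

    assert(s != NULL && key != NULL);
    i = toy_seek(s, key);
    if (i == s->count || memcmp(s->keys[i], key, TOY_KEY_LEN) != 0)
        return 0;
    if (i + 1 < s->count)   /* close the gap left by the removed key */
        memmove(s->keys[i], s->keys[i + 1],
                (s->count - i - 1) * TOY_KEY_LEN);
    s->count--;
    return 1;
}

/* Delete every key whose first byte equals `group`, deleting by key and
 * re-seeking after each delete instead of trusting the old position. */
size_t delete_group(struct toy_store *s, unsigned char group)
{
    unsigned char probe[TOY_KEY_LEN] = { 0 };
    size_t removed = 0;

    probe[0] = group;       /* smallest possible key in the group */
    for (;;) {
        unsigned char copy[TOY_KEY_LEN];
        size_t pos = toy_seek(s, probe);

        if (pos == s->count || s->keys[pos][0] != group)
            break;          /* ran past the end of the group */
        memcpy(copy, s->keys[pos], TOY_KEY_LEN);  /* copy before deleting */
        removed += (size_t)toy_del(s, copy);
        memcpy(probe, copy, TOY_KEY_LEN);         /* resume from deleted key */
    }
    return removed;
}
```

Copying the key before the delete matters because a pointer into the store may
no longer refer to the same key once entries have shifted.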

I have tested this code against LMDB 0.9.14 and Git head 8b46dcc.  Please let me
know if you need any additional information.  Thank you for your time!
Comment 1 Howard Chu 2015-10-06 04:44:46 UTC
malyn@strangeGizmo.com wrote:
> Full_Name: Michael Alyn Miller
> Version: LMDB 0.9.14, Git head
> OS: Windows 8.1 x64, NixOS 14.12 x64
> URL: ftp://ftp.openldap.org/incoming/michael-alyn-miller-151005.c
> Submission from: (NULL) (96.251.78.237)
>
>
> I have a helper library that sits on top of LMDB and testing some of the
> abstractions in that library has exposed a bug in LMDB related to deleting items
> while inside of a cursor.  I have reduced the repro down to a test case that
> operates directly against LMDB.
>
> The test in question adds three groups of keys to the database (each with a
> different prefix) and then deletes the third group of keys.  The deletion is
> done by scanning all of the keys with that prefix using a cursor.
> mdb_cursor_del works correctly and deletes only the targeted keys, whereas
> mdb_del causes nine extra keys (not in that group/prefix) to be deleted.

> I have tested this code against LMDB 0.9.14 and Git head 8b46dcc.  Please let me
> know if you need any additional information.  Thank you for your time!

Unable to reproduce.

violino:~/OD/mdb/libraries/liblmdb> gcc -g -c michael-alyn-miller-151005.c
violino:~/OD/mdb/libraries/liblmdb> gcc -o mic michael-alyn-miller-151005.o -pthread liblmdb.a
violino:~/OD/mdb/libraries/liblmdb> ./mic
Num items after deletion using mdb_cursor_del: 44 (correct)
Num items after deletion using mdb_del: 44 (correct)
violino:~/OD/mdb/libraries/liblmdb> git log
commit 8b46dcc26d1e9897ab1da3a4a164cad4a4479a52
Author: Howard Chu <hyc@openldap.org>
Date:   Sun Oct 4 01:56:25 2015 +0100

     ITS#8258 fix rebalance/split

     The tree height can also increase during rebalance, not just shrink.
     This can happen if update_key needs to split a parent branch page.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Howard Chu 2015-10-06 07:04:46 UTC
hyc@symas.com wrote:

> Unable to reproduce.

Ah, the error goes away if you rerun against an existing DB. After deleting
the DB, the problem showed up. Fixed now in mdb.master.


Comment 3 OpenLDAP project 2015-10-06 07:05:24 UTC
fixed in mdb.master
Comment 4 Howard Chu 2015-10-06 07:05:24 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 5 Quanah Gibson-Mount 2015-11-30 18:28:13 UTC
changed state Test to Closed