Issue 8299 - LMDB mdb_del data loss when used in a cursor (new repro)
Summary: LMDB mdb_del data loss when used in a cursor (new repro)
Status: VERIFIED FIXED
Alias: None
Product: LMDB
Classification: Unclassified
Component: liblmdb (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-11-03 22:02 UTC by malyn@strangegizmo.com
Modified: 2020-03-12 15:55 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description malyn@strangegizmo.com 2015-11-03 22:02:29 UTC
Full_Name: Michael Alyn Miller
Version: Git head
OS: Windows 8.1 x64, NixOS 14.12 x64
URL: ftp://ftp.openldap.org/incoming/michaelalynmiller-151103.c
Submission from: (NULL) (96.251.78.237)


I have a new repro of an ITS#8264-like situation, but with different input data
and with slightly different results.  Note that ITS#8264 was fixed with the
dataset provided in that bug; this is a different dataset and produces different
results.

The referenced test runs twice: once with mdb_cursor_del and once with mdb_del. 
Just like before, mdb_cursor_del performs correctly whereas mdb_del corrupts the
database.  In this case, mdb_del incorrectly removes one extra key (a key
prefixed with 04000000000000003a...), and incorrectly leaves 21 keys in place
that should have been deleted (those keys begin with 05000000000000003a...). 
This last item is the primary difference between these two bugs.

Unlike ITS#8264, you do not need to remove the database in between test runs in
order to repro this issue, although the results are still incorrect and will
eventually cause a bus error.
Comment 1 Howard Chu 2015-11-04 17:06:39 UTC
malyn@strangeGizmo.com wrote:
> Full_Name: Michael Alyn Miller
> Version: Git head
> OS: Windows 8.1 x64, NixOS 14.12 x64
> URL: ftp://ftp.openldap.org/incoming/michaelalynmiller-151103.c
> Submission from: (NULL) (96.251.78.237)
>
>
> I have a new repro of an ITS#8264-like situation, but with different input data
> and with slightly different results.  Note that ITS#8264 was fixed with the
> dataset provided in that bug; this is a different dataset and produces different
> results.
>
> The referenced test runs twice: once with mdb_cursor_del and once with mdb_del.
> Just like before, mdb_cursor_del performs correctly whereas mdb_del corrupts the
> database.  In this case, mdb_del incorrectly removes one extra key (a key
> prefixed with 04000000000000003a...), and incorrectly leaves 21 keys in place
> that should have been deleted (those keys begin with 05000000000000003a...).
> This last item is the primary difference between these two bugs.
>
> Unlike ITS#8264, you do not need to remove the database in between test runs in
> order to repro this issue, although the results are still incorrect and will
> eventually cause a bus error.

Thanks again for the excellent bug reports. Fixed now in git.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 OpenLDAP project 2015-11-04 17:09:18 UTC
fixed in mdb.master, mdb.RE/0.9
Comment 3 Howard Chu 2015-11-04 17:09:18 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 4 Howard Chu 2015-11-04 17:16:26 UTC
Howard Chu wrote:
> malyn@strangeGizmo.com wrote:
>> Full_Name: Michael Alyn Miller
>> Version: Git head
>> OS: Windows 8.1 x64, NixOS 14.12 x64
>> URL: ftp://ftp.openldap.org/incoming/michaelalynmiller-151103.c
>> Submission from: (NULL) (96.251.78.237)
>>
>>
>> I have a new repro of an ITS#8264-like situation, but with different input data
>> and with slightly different results.  Note that ITS#8264 was fixed with the
>> dataset provided in that bug; this is a different dataset and produces
>> different
>> results.
>>
>> The referenced test runs twice: once with mdb_cursor_del and once with mdb_del.
>> Just like before, mdb_cursor_del performs correctly whereas mdb_del corrupts
>> the
>> database.  In this case, mdb_del incorrectly removes one extra key (a key
>> prefixed with 04000000000000003a...), and incorrectly leaves 21 keys in place
>> that should have been deleted (those keys begin with 05000000000000003a...).
>> This last item is the primary difference between these two bugs.
>>
>> Unlike ITS#8264, you do not need to remove the database in between test runs in
>> order to repro this issue, although the results are still incorrect and will
>> eventually cause a bus error.
>
> Thanks again for the excellent bug reports. Fixed now in git.

I should also note that this is not a sensible use case. mdb_del has to 
construct a cursor to perform its work; if you already have a cursor open you 
should just use mdb_cursor_del. mdb_del is meant for one-shot use and is much 
slower than mdb_cursor_del when used in an iteration.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 5 malyn@strangegizmo.com 2015-11-04 18:46:38 UTC
> >> I have a new repro of an ITS#8264-like situation, but with different
> >> input data and with slightly different results.  Note that ITS#8264
> >> was fixed with the dataset provided in that bug; this is a different
> >> dataset and produces different results.
> >>
> >> The referenced test runs twice: once with mdb_cursor_del and once with
> mdb_del.
> >> Just like before, mdb_cursor_del performs correctly whereas mdb_del
> >> corrupts the database.  In this case, mdb_del incorrectly removes one
> >> extra key (a key prefixed with 04000000000000003a...), and
> >> incorrectly leaves 21 keys in place that should have been deleted
> >> (those keys begin with 05000000000000003a...).
> >> This last item is the primary difference between these two bugs.
> >>
> >> Unlike ITS#8264, you do not need to remove the database in between
> >> test runs in order to repro this issue, although the results are
> >> still incorrect and will eventually cause a bus error.
> >
> > Thanks again for the excellent bug reports. Fixed now in git.
> 
> I should also note that this is not a sensible use case. mdb_del has to
> construct a cursor to perform its work; if you already have a cursor open you
> should just use mdb_cursor_del. mdb_del is meant for one-shot use and is
> much slower than mdb_cursor_del when used in an iteration.

Thanks for this information.  I will make the change to use mdb_cursor_del when possible.  There are a couple of circumstances where I do not currently have access to the cursor though and will need to use mdb_del (or mdb_put, for updates) inside of the cursor.  I will look into passing the cursor up to my higher-level API callers though...

Note that I just reported ITS#8300, which was found after making the switch to mdb_cursor_del (and after applying the fix for ITS#8299).

Thanks,
Michael Alyn Miller

Comment 6 Quanah Gibson-Mount 2015-11-30 18:28:17 UTC
changed state Test to Closed