
Re: (ITS#8062) Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()



>
> Thanks for the report. Unfortunately, it appears the DB was constructed
> incorrectly at some time in the past, so even updating to later revisions
> of the code won't avoid the crash. It's more important now to find the
> sequence of steps that recreates the invalid DB in the first place, or
> verify that the same sequence doesn't crash using the 0.9.15 release
> candidate.


OK. I'll try to reproduce the steps.

Thanks.

G
http://gawen.me/ - +33 (0) 6 65 42 31 83


On Tue, Feb 24, 2015 at 1:49 AM, Howard Chu <hyc@symas.com> wrote:

> g@wenarab.com wrote:
>
>> Full_Name: Gawen Arab
>> Version: 0.9.14
>> OS: OSX
>> URL: http://gawen.me/pub/lmdb_crash/gawen-150223.tar.xz
>> Submission from: (NULL) (77.151.59.7)
>>
>>
>> Hello,
>>
>> I'm currently developing a virtual filesystem whose metadata is stored in an
>> LMDB 0.9.14 database. This software currently runs on Android, iOS, Linux,
>> OSX and Windows.
>>
>> This software is currently used by 50 users, and I have received 11 crash
>> reports of SIGSEGV raised in LMDB. All reports come from OSX computers.
>>
>
> Thanks for the report. Unfortunately, it appears the DB was constructed
> incorrectly at some time in the past, so even updating to later revisions
> of the code won't avoid the crash. It's more important now to find the
> sequence of steps that recreates the invalid DB in the first place, or
> verify that the same sequence doesn't crash using the 0.9.15 release
> candidate.
>
>
>> mdb.c:5382: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>>
>> This comes from this assertion:
>> https://gitorious.org/mdb/mdb/source/2f587ae081d076e3707360c5db086520c219d3ea:libraries/liblmdb/mdb.c#L5382
>>
>> This happens when the software iterates over keys in the database the same
>> way mdb_dump does, calling mdb_cursor_get(..., MDB_NEXT) in a loop.
>>
>> I managed to retrieve the LMDB db folder from one user running 64-bit OSX.
>> I ran mdb_dump on it on my 64-bit Linux machine; mdb_dump crashes the same
>> way. I tried the latest mdb_dump from the mdb.master branch (commit
>> 3368d1f5e243225cba4d730fba19ff600798ebe3), but I get the same assertion.
>>
>> mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>>
>> I attached to this issue the faulty database 'db/'.
>>
>> The output of mdb_dump on my machine is attached to this issue as the file
>> 'mdb_dump.log'. I reran mdb_dump with MDB_DEBUG defined to 2 and mdb_debug
>> set to true, and attached the output of this new dump as the file
>> 'mdb_dump.debug.log'. The interesting part:
>>
>> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
>> mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 48
>> b5f1ff0700000000
>> 0700000000000000bc10000000000000000e000000000000860d000000000000e308000000000000ee07000000000000a5070000000000000307000000000000
>> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
>> mdb_cursor_next:5591 ==> cursor points to page 1946 with 50 keys, key index 49
>> b6f1ff0700000000
>> 0700000000000000820e000000000000310e000000000000770d000000000000660c000000000000cb0a0000000000002a080000000000000008000000000000
>> mdb_cursor_next:5574 cursor_next: top page is 1946 in cursor 0x1897350
>> mdb_cursor_next:5579 =====> move to next sibling page
>> mdb_cursor_pop:5095 popped page 1946 off db 1 cursor 0x1897350
>> mdb_cursor_sibling:5503 parent page is page 4643, index 16
>> mdb_cursor_sibling:5521 just moving to right index key 17
>> mdb_cursor_push:5104 pushing page 704 on db 1 cursor 0x1897350
>> mdb_cursor_next:5585 next page is 704, key index 0
>> mdb_cursor_next:5591 ==> cursor points to page 704 with 7 keys, key index 0
>> mdb.c:5599: Assertion 'IS_LEAF(mp)' failed in mdb_cursor_next()
>>
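
The last lines of the trace show the cursor leaving leaf page 1946, stepping
from index 16 to index 17 in parent page 4643, and landing on page 704, which
is not a leaf. The sketch below is a toy model of that step, not LMDB's code:
the toy_ names, the struct layout and the flag values are all hypothetical,
and marking page 704 as a branch page is an assumption (the trace only shows
that it fails the leaf check).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page-type bits for this toy model only. */
#define TOY_P_BRANCH 0x01u
#define TOY_P_LEAF   0x02u

typedef struct toy_page {
    uint64_t pgno;                           /* page number, as printed in the trace */
    unsigned flags;                          /* TOY_P_BRANCH or TOY_P_LEAF */
    size_t nkeys;                            /* number of keys (children for a branch) */
    const struct toy_page *const *children;  /* child pages; NULL for a leaf */
} toy_page;

/* Non-zero when the page is marked as a leaf. */
int toy_is_leaf(const toy_page *p)
{
    assert(p != NULL);
    return (p->flags & TOY_P_LEAF) != 0;
}

/* Move from child `idx` of `parent` to its right neighbour,
 * or return NULL when `idx` is already the last child. */
const toy_page *toy_right_sibling(const toy_page *parent, size_t idx)
{
    assert(parent != NULL);
    if (parent->children == NULL || idx + 1 >= parent->nkeys)
        return NULL;
    return parent->children[idx + 1];
}

/* Returns 1 if the right sibling is a leaf, 0 if it is not (the case in
 * which the real code asserts), and -1 if there is no right sibling. */
int toy_next_sibling_is_leaf(const toy_page *parent, size_t idx)
{
    const toy_page *next = toy_right_sibling(parent, idx);
    if (next == NULL)
        return -1;
    return toy_is_leaf(next);
}
```

In a consistent tree, every child reached this way from the parent of a leaf
is itself a leaf, so a result of 0 means the tree was already inconsistent
before the iteration started.
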
>> I do not know much about LMDB's internal design yet, so I'm struggling to
>> understand the debug lines...
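
The hex lines that follow each "cursor points to" entry in the trace appear
to be the key and data of the record at that index, as printed by mdb_dump.
A minimal sketch for reading them, assuming each run of 16 hex digits is one
little-endian 64-bit integer (that encoding is an assumption about the
application's data, and hex_nibble and decode_le64_words are hypothetical
helper names):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode `hex` as consecutive little-endian 64-bit words into `out`.
 * Returns the number of words written, or -1 if the input contains a
 * non-hex character, its length is not a multiple of 16, or it holds
 * more than `max_words` words. */
int decode_le64_words(const char *hex, uint64_t *out, size_t max_words)
{
    assert(hex != NULL && out != NULL);
    size_t len = strlen(hex);
    if (len % 16 != 0 || len / 16 > max_words)
        return -1;
    for (size_t w = 0; w < len / 16; w++) {
        uint64_t value = 0;
        for (size_t b = 0; b < 8; b++) {
            int hi = hex_nibble(hex[w * 16 + b * 2]);
            int lo = hex_nibble(hex[w * 16 + b * 2 + 1]);
            if (hi < 0 || lo < 0)
                return -1;
            /* Byte b is the b-th least significant byte of the word. */
            value |= (uint64_t)((hi << 4) | lo) << (8 * b);
        }
        out[w] = value;
    }
    return (int)(len / 16);
}
```

Under that assumption, the key b5f1ff0700000000 from the first record decodes
to 0x07fff1b5, and its data starts with the words 7 and 0x10bc.
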
>>
>> Unfortunately, I do not know how to reproduce this bug, but it is a
>> recurring one.
>>
>> Maybe the following information can help you,
>>
>> - My software opens the database in MDB_NOSYNC mode and performs an
>>   mdb_env_sync(env, 1) every second or less. An mdb_env_sync can happen at
>>   the same time as any other operation (e.g. mdb_txn_commit). I didn't put
>>   any locking mechanism in place. Could this explain such a situation?
>>
>> - mapsize is currently 512MiB. Previous versions of the software set it to
>>   smaller values (the database has not been rebuilt since).
>>
>> I'd be happy to provide more information if you need it.
>>
>> Thank you for your support and this amazing IP.
>>
>> Regards
>>
>>
>>
>>
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
