
Re: (ITS#8898) mdb_get on corrupted LMDB file crash the process



Hi Howard, and thanks for the info.

I've read your analysis of the corruption, and I'd like to write some sort
of integrity check to run before I try to open the file. Could you point me
to a description of the file format: where the meta pages are located, and
at which offsets the used pages can be found? Basically, I want to read the
latest meta page and check whether the number of pages it claims matches
the actual file size.
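
To make the idea concrete, here is a rough sketch of the check I have in
mind. The header offsets, the magic value, and the assumption that a meta
page "uses" last page number + 1 pages are my guesses for a 64-bit
little-endian build; none of them is confirmed in this thread, and the
function name is my own. Please correct whatever is wrong.

```python
import os
import struct

# ASSUMPTIONS (not confirmed in this thread): on-disk layout of a
# 64-bit little-endian LMDB data file.
PAGE_HEADER_SIZE = 16                    # assumed per-page header size
META_MAGIC = 0xBEEFC0DE                  # assumed meta magic number
OFF_MAGIC = PAGE_HEADER_SIZE             # uint32 magic
OFF_PAGE_SIZE = PAGE_HEADER_SIZE + 24    # uint32 page size
OFF_LAST_PGNO = PAGE_HEADER_SIZE + 120   # uint64 last used page number
META_BYTES = PAGE_HEADER_SIZE + 136      # through the uint64 txn id


def parse_meta(buf):
    """Return (txnid, last_pgno, page_size), or None if buf is not a meta page."""
    if len(buf) < META_BYTES:
        return None
    (magic,) = struct.unpack_from("<I", buf, OFF_MAGIC)
    if magic != META_MAGIC:
        return None
    (page_size,) = struct.unpack_from("<I", buf, OFF_PAGE_SIZE)
    # The txn id is assumed to follow the last page number directly.
    last_pgno, txnid = struct.unpack_from("<QQ", buf, OFF_LAST_PGNO)
    if page_size < META_BYTES:
        return None
    return txnid, last_pgno, page_size


def latest_meta_fits(path):
    """Return (fits, pages_claimed, pages_in_file) for the newest meta page.

    Meta pages are assumed to be pages 0 and 1 of the file; the one with
    the higher txn id is treated as the latest.
    """
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        meta0 = parse_meta(f.read(META_BYTES))
        if meta0 is None:
            raise ValueError("meta page 0 is missing or invalid")
        page_size = meta0[2]
        f.seek(page_size)
        meta1 = parse_meta(f.read(META_BYTES))
    candidates = [m for m in (meta0, meta1) if m is not None]
    txnid, last_pgno, _ = max(candidates, key=lambda m: m[0])
    pages_claimed = last_pgno + 1        # assumed meaning of "pages used"
    pages_in_file = file_size // page_size
    return pages_claimed <= pages_in_file, pages_claimed, pages_in_file
```

I would call latest_meta_fits() on the data file before opening the
environment and refuse to open it (or restore a backup) when the first
element of the result is False.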

Second, as I mentioned, my target operating system is macOS. Are you aware
of any known issues with file write atomicity on this particular OS? It
would also help to know how LMDB achieves data consistency under the hood
when the database is opened with MDB_NOMETASYNC, so that I can investigate
this further and report it to Apple if necessary.

Irad

On Mon, Sep 10, 2018 at 8:44 PM Howard Chu <hyc@symas.com> wrote:

> iradization@gmail.com wrote:
> > Full_Name: Irad.k
> > Version: recent from GitHub.
> > OS: macOS
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (82.81.84.130)
> >
> >
> > Hi,
> >
> > I'm working with LMDB with the following flags: MDB_NOMETASYNC |
> > MDB_NOSUBDIR | MDB_NOTLS.
> >
> > Somehow, even though the DB should be ACI (without the D), it got
> > corrupted after recovering from a kernel panic, and it crashes my
> > process when trying to access one of the records (see crash log below).
> >
> > Here's a link to the file:
> >
> > https://drive.google.com/file/d/12Q3KYYrapiJOgiaccnDL3tQACkqrM3C5/view?usp=sharing
>
> This file is only 6 pages long, but the latest meta page claims that it's
> using 11 pages. Looking at the previous meta page, it claims that 9 pages
> were used.
>
> Clearly your OS failed to write the complete contents out before it
> panicked. At the very least the file should have been 9 pages long.
> Sounds like you have an OS bug.
>
> > According to the crash log from the process, the invalid address
> > clearly resides inside the mapped file region, which is the LMDB mapped
> > file, but I still get KERN_MEMORY_ERROR on that address.
> >
> > From what I know, an attempt to access an address within the mapped
> > range can either retrieve the page contents directly from memory (if
> > they're already there), or trigger a page fault that eventually leads
> > to reading the missing data from disk and returning it to the process.
> >
> > One thing that raises some concern is that the file size is only 24k
> > while the mapping spans 256M. However, the file's metadata seems to be
> > consistent with the file contents.
> >
> > Any idea how this happened, and what exactly in the file causes this
> > corruption?
> >
> >
> > CRASH LOG :
> > ----------------------------------------------------------------------
> > Exception Type:        EXC_BAD_ACCESS (SIGBUS)
> > Exception Codes:       KERN_MEMORY_ERROR at 0x000000010648800a
> > Exception Note:        EXC_CORPSE_NOTIFY
> >
> > Termination Signal:    Bus error: 10
> > Termination Reason:    Namespace SIGNAL, Code 0xa
> > Terminating Process:   exc handler [0]
> >
> > VM Regions Near 0x10648800a:
> >
> > __LINKEDIT             0000000106464000-000000010647f000 [  108K] r--/rwx SM=COW
> >  ^Z^C [/usr/lib/dyld]
> > --> mapped file            000000010647f000-000000011647f000 [256.0M] r--/rwx SM=PRV  Object_id=9034edd9
> > STACK GUARD            000070000f2b3000-000070000f2b4000 [    4K] ---/rwx SM=NUL
> >  stack guard for thread 1
> >
> > Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
> > 0   myprog                     0x0000000101666756 mdb_page_search_root + 39
> > 1   myprog                     0x00000001016660f7 mdb_page_search + 182
> > 2   myprog                     0x00000001016614de mdb_cursor_set + 88
> > 3   myprog                     0x0000000101661476 mdb_get + 134
> >
> >
> >
>
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
