Re: (ITS#8898) mdb_get on corrupted LMDB file crash the process
- To: openldap-its@OpenLDAP.org
- Subject: Re: (ITS#8898) mdb_get on corrupted LMDB file crash the process
- From: hyc@symas.com
- Date: Wed, 12 Sep 2018 11:55:56 +0000
- Auto-submitted: auto-generated (OpenLDAP-ITS)
Irad K wrote:
> Hi Howard, and thanks for the info.
>
> I've read your insights about the nature of the corruption, and I'd like
> to see if I can write some sort of integrity check before I try to open
> the file.
> Perhaps you could guide me where to look for the file format, where I can
> find all the meta-pages, and at which offset I can find the used pages?
Read the source yourself if you want to know the underlying format.
Otherwise, just use mdb_env_info().
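
A minimal sketch of the arithmetic such a check could use, in plain C with no
LMDB dependency. It assumes the last page number would be taken from
MDB_envinfo.me_last_pgno (filled in by mdb_env_info()) and the page size from
MDB_stat.ms_psize (filled in by mdb_env_stat()), and that the environment can
still be opened far enough to query them. The function names
pages_fit_in_file and fault_page_index are hypothetical helpers, not part of
LMDB.

```c
#include <stdint.h>

/* Returns 1 if a file of file_size bytes is long enough to hold every page
 * up to and including last_pgno, and 0 otherwise.
 * Assumption: in a real check, last_pgno would come from
 * MDB_envinfo.me_last_pgno and page_size from MDB_stat.ms_psize. */
int pages_fit_in_file(uint64_t last_pgno, uint64_t page_size, uint64_t file_size)
{
    if (page_size == 0)
        return 0;
    /* Pages are numbered from 0, so last_pgno + 1 pages are in use. */
    uint64_t pages_in_file = file_size / page_size;
    return last_pgno < pages_in_file;
}

/* Index of the page that an address falls on, counted from the start of the
 * memory mapping. Useful for relating a faulting address to the file size. */
uint64_t fault_page_index(uint64_t fault_addr, uint64_t map_base, uint64_t page_size)
{
    return (fault_addr - map_base) / page_size;
}
```

With the numbers from this thread (a 24k file, i.e. 6 pages of 4096 bytes, and
a meta page claiming 11 pages in use, taken here to mean a last page number of
10), pages_fit_in_file(10, 4096, 24576) returns 0. Likewise, the faulting
address 0x10648800a in the crash log quoted below, relative to the mapping base
0x10647f000, lands on page 9, which is past the end of a 6-page file.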
> Basically, I wish to check the latest meta-page and see if the number of
> pages corresponds with the actual file size.
>
> Second, as I've stated, my target operating system is macOS. Are you
> aware of any known issues with file atomicity in this particular OS?
> Perhaps you could tell me how data consistency is achieved under the hood
> when I'm using the database with MDB_NOMETASYNC, so I can further
> investigate this issue and report it to Apple if necessary.
I've seen other reports of corruptions on MacOSX. Since they're in SQLite
I haven't investigated further.
http://sqlite.1065341.n5.nabble.com/SQLITE-vs-OSX-mmap-inevitable-catalog-corruption-td85620.html
http://sqlite.1065341.n5.nabble.com/Re-Database-corruption-and-PRAGMA-fullfsync-on-macOS-td95366i20.html
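
For context on the "fullfsync" topic in the second link: on macOS, fsync()
alone does not ask the drive to flush its own write cache, while
fcntl(fd, F_FULLFSYNC) does. The sketch below shows that general technique
for an application's own file descriptors. It is not a description of what
LMDB does internally, and flush_to_disk is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Flush the data of fd to stable storage.
 * Returns 0 on success and -1 on error (errno is set by the failing call). */
int flush_to_disk(int fd)
{
#ifdef F_FULLFSYNC
    /* macOS: also ask the drive to flush its write cache. Not every
     * filesystem supports this, so fall through to fsync() on failure. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
#endif
    /* Portable fallback for platforms without F_FULLFSYNC. */
    return fsync(fd);
}
```

On platforms that do not define F_FULLFSYNC the function reduces to a plain
fsync() call, so the same code builds on Linux and macOS.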
>
> Irad
>
> On Mon, Sep 10, 2018 at 8:44 PM Howard Chu <hyc@symas.com> wrote:
>
> iradization@gmail.com wrote:
> > Full_Name: Irad.k
> > Version: recent from GitHub.
> > OS: macOS
> > URL: ftp://ftp.openldap.org/incoming/
> > Submission from: (NULL) (82.81.84.130)
> >
> >
> > Hi,
> >
> > I'm working with LMDB with the following flags: MDB_NOMETASYNC | MDB_NOSUBDIR
> > | MDB_NOTLS.
> >
> > Somehow, even though the DB should be ACI (without the D), it got corrupted
> > after recovering from a kernel panic, and it crashes my process when trying to
> > access one of the records (see crash log below).
> >
> > Here's a link to the file:
> >
> > https://drive.google.com/file/d/12Q3KYYrapiJOgiaccnDL3tQACkqrM3C5/view?usp=sharing
>
> This file is only 6 pages long, but the latest meta page claims that it's using 11 pages.
> Looking at the previous meta page, it claims that 9 pages were used.
>
> Clearly your OS failed to write the complete contents out before it panicked. At the
> very least the file should have been 9 pages long. Sounds like you have an OS bug.
>
> > According to the crash log from the process, it can clearly be seen that the
> > invalid address resides inside the mapped file region, which is the LMDB mapped
> > file, but I still get KERN_MEMORY_ERROR on that address.
> >
> > From what I know, an attempt to access an address within the mapped range can
> > either retrieve the page contents directly from memory (if it's already there),
> > or trigger a page fault trap that eventually leads to reading the missing data from
> > disk and returning it to the process as well.
> >
> > One thing that raises some concerns is that the file size is only 24k and the
> > mapping spans over 256M. However, the file's metadata seems to be consistent with
> > the file contents.
> >
> > Any idea how it happened, and what exactly in the file caused this corruption?
> >
> >
> > CRASH LOG :
> > ----------------------------------------------------------------------
> > Exception Type:        EXC_BAD_ACCESS (SIGBUS)
> > Exception Codes:       KERN_MEMORY_ERROR at 0x000000010648800a
> > Exception Note:        EXC_CORPSE_NOTIFY
> >
> > Termination Signal:    Bus error: 10
> > Termination Reason:    Namespace SIGNAL, Code 0xa
> > Terminating Process:   exc handler [0]
> >
> > VM Regions Near 0x10648800a:
> >
> > __LINKEDIT             0000000106464000-000000010647f000 [  108K] r--/rwx SM=COW
> >  ^Z^C [/usr/lib/dyld]
> > --> mapped file        000000010647f000-000000011647f000 [256.0M] r--/rwx
> > SM=PRV  Object_id=9034edd9
> > STACK GUARD            000070000f2b3000-000070000f2b4000 [    4K] ---/rwx SM=NUL
> >  stack guard for thread 1
> >
> > Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
> > 0   myprog                     0x0000000101666756 mdb_page_search_root + 39
> > 1   myprog                     0x00000001016660f7 mdb_page_search + 182
> > 2   myprog                     0x00000001016614de mdb_cursor_set + 88
> > 3   myprog                     0x0000000101661476 mdb_get + 134
> >
> >
> >
>
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
-- 
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/