Full_Name: Pierre Chambart Version: OS: URL: Submission from: (NULL) (185.194.234.133) We have been trying LMDB compiled with the MDB_VL32 flag and encountered some MDB_CORRUPTED error that didn't occur without MDB_VL32. We could produce a minimal reproduction case: https://raw.githubusercontent.com/chambart/lmdb/vl32_readonly_fix/libraries/liblmdb/vl32_failure.c This is basically writing a key, reading it, writing something else then reading it again. After some investigation, this is how I understand what is happening: * The chunk being mapped for the first read contains 3 pages instead of the 16 pages that would be mapped without readonly. * The second write extend the size of the base to 5 pages * Finally the last read retrieve the mapped chunk from the first read, which has size 3, but assumes that it has size 5, hence reading some random value in memory past the end of the pages. It can be verified by asserting that id3.mcnt is larger than rem at the end of mdb_rpage_get, around this line: ok: p = (MDB_page *)((char *)id3.mptr + rem * env->me_psize); This does not seem to be the case. I made a dirty fix that seems to work. It's simply to remove those lines: /* don't map past last written page in read-only envs */ if ((env->me_flags & MDB_RDONLY) && pgno + MDB_RPAGE_CHUNK-1 > txn->mt_last_pgno) id3.mcnt = txn->mt_last_pgno + 1 - pgno; else This seems to be some optimization that avoids mapping the 16 pages of the chunk. But given that those pages are initialized, there doesn't seem to be much harm in mapping a few more pages in that relatively rare situation. The patch is provided in https://github.com/chambart/lmdb/commit/52e57a392e91935efe6d57337da09f23e4076924 If you want to reproduce the bug you can use the https://github.com/chambart/lmdb/tree/vl32_readonly_fix which also contains the patch to the makefile to build in 32bits with MDB_VL32. Thanks a lot, hoping that this can be of some help.
pierre.chambart@ocamlpro.com wrote: > Full_Name: Pierre Chambart > Version: > OS: > URL: > Submission from: (NULL) (185.194.234.133) > > > We have been trying LMDB compiled with the MDB_VL32 flag and encountered some > MDB_CORRUPTED error that didn't occur without MDB_VL32. > We could produce a minimal reproduction case: > https://raw.githubusercontent.com/chambart/lmdb/vl32_readonly_fix/libraries/liblmdb/vl32_failure.c > This is basically writing a key, reading it, writing something else then reading > it again. > > After some investigation, this is how I understand what is happening: > > * The chunk being mapped for the first read contains 3 pages instead of the 16 > pages that would be mapped without readonly. > * The second write extend the size of the base to 5 pages > * Finally the last read retrieve the mapped chunk from the first read, which has > size 3, but assumes that it has size 5, hence reading some random value in > memory past the end of the pages. > > It can be verified by asserting that id3.mcnt is larger than rem at the end of > mdb_rpage_get, around this line: > > ok: > p = (MDB_page *)((char *)id3.mptr + rem * env->me_psize); > > This does not seem to be the case. > > I made a dirty fix that seems to work. It's simply to remove those lines: > > /* don't map past last written page in read-only envs */ > if ((env->me_flags & MDB_RDONLY) && pgno + MDB_RPAGE_CHUNK-1 > > txn->mt_last_pgno) > id3.mcnt = txn->mt_last_pgno + 1 - pgno; > else > > This seems to be some optimization that avoids mapping the 16 pages of the > chunk. But given that those pages are > initialized, there doesn't seem to be much harm in mapping a few more pages in > that relatively rare situation. Please check git history before assuming the code is an optimization. These lines prevent a crash on Windows. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
On 19/09/2018 17:35, Howard Chu wrote: > pierre.chambart@ocamlpro.com wrote: >> Full_Name: Pierre Chambart >> Version: >> OS: >> URL: >> Submission from: (NULL) (185.194.234.133) >> >> >> We have been trying LMDB compiled with the MDB_VL32 flag and encountered some >> MDB_CORRUPTED error that didn't occur without MDB_VL32. >> We could produce a minimal reproduction case: >> https://raw.githubusercontent.com/chambart/lmdb/vl32_readonly_fix/libraries/liblmdb/vl32_failure.c >> This is basically writing a key, reading it, writing something else then reading >> it again. >> >> After some investigation, this is how I understand what is happening: >> >> * The chunk being mapped for the first read contains 3 pages instead of the 16 >> pages that would be mapped without readonly. >> * The second write extend the size of the base to 5 pages >> * Finally the last read retrieve the mapped chunk from the first read, which has >> size 3, but assumes that it has size 5, hence reading some random value in >> memory past the end of the pages. >> >> It can be verified by asserting that id3.mcnt is larger than rem at the end of >> mdb_rpage_get, around this line: >> >> ok: >> p = (MDB_page *)((char *)id3.mptr + rem * env->me_psize); >> >> This does not seem to be the case. >> >> I made a dirty fix that seems to work. It's simply to remove those lines: >> >> /* don't map past last written page in read-only envs */ >> if ((env->me_flags & MDB_RDONLY) && pgno + MDB_RPAGE_CHUNK-1 > >> txn->mt_last_pgno) >> id3.mcnt = txn->mt_last_pgno + 1 - pgno; >> else >> >> This seems to be some optimization that avoids mapping the 16 pages of the >> chunk. But given that those pages are >> initialized, there doesn't seem to be much harm in mapping a few more pages in >> that relatively rare situation. > Please check git history before assuming the code is an optimization. > > These lines prevent a crash on Windows. > Sorry, about that. Also I'm not suggesting that this is a proper fix, rather that this is some clue to understand the problem. I'm happy producing a patch if you indicate what kind of fix you would like. -- Pierre