(ITS#7789) Unreliable mdb_env_set_mapsize()

Full_Name: Hallvard B Furuseth
Version: LMDB 0.9.11
OS: Linux x86_64
Submission from: (NULL) (
Submitted by: hallvard

Mapsize changes do not work as described, do not reliably store the
mapsize in the map, and it's hard to see how it is supposed to work.
- Open an environment twice, in processes X and Y.
- X grows the map and writes (commits) something to the DB.  The
  MDB_meta written by that commit gets the new mapsize.
- Y writes to the DB.  It does not get MDB_MAP_RESIZED like the doc
  says, nor does it carry forward X's MDB_meta.mm_mapsize change.
- Process Z opens the environment without doing set_mapsize(),
  and gets the original mapsize from the MDB_meta written by Y.
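
The sequence above can be replayed in a deliberately simplified model.
This is a sketch, not LMDB's code: every toy_* name is hypothetical, and
the model assumes two meta slots chosen by txnid % 2, with each commit
recording the committing environment's own cached mapsize.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only (hypothetical names, not LMDB internals). */
typedef struct { size_t txnid; size_t mapsize; } toy_meta;
typedef struct { toy_meta metas[2]; } toy_file;
typedef struct { toy_file *file; size_t mapsize; } toy_env;

/* Return the most recently committed meta slot. */
const toy_meta *toy_newest(const toy_file *f)
{
    return f->metas[0].txnid > f->metas[1].txnid ? &f->metas[0] : &f->metas[1];
}

/* Open: adopt the mapsize recorded in the newest meta. */
void toy_open(toy_env *e, toy_file *f)
{
    e->file = f;
    e->mapsize = toy_newest(f)->mapsize;
}

/* Commit: write the env's cached mapsize into the older slot,
 * without looking at the mapsize in the newest meta. */
void toy_commit(toy_env *e)
{
    size_t next;
    toy_meta *slot;

    assert(e->file != NULL);
    next = toy_newest(e->file)->txnid + 1;
    slot = &e->file->metas[next % 2];
    slot->txnid = next;
    slot->mapsize = e->mapsize;
}

/* Replay the X / Y / Z steps; returns the mapsize Z sees on open. */
size_t toy_scenario(size_t original, size_t grown)
{
    toy_file f = { { {0, original}, {1, original} } };
    toy_env x, y, z;

    toy_open(&x, &f);
    toy_open(&y, &f);
    x.mapsize = grown;   /* X grows the map ...                  */
    toy_commit(&x);      /* ... txn 2 records the grown size     */
    toy_commit(&y);      /* txn 3 records Y's stale size         */
    toy_open(&z, &f);    /* Z reads the newest meta (txn 3)      */
    return z.mapsize;
}
```

In this model, toy_scenario(original, grown) returns the original size:
X's change survives only until Y's next commit.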

For that matter, from reading the doc I'd expect a mapsize change to
commit a txn with the new mapsize.  There's no mention that the change
(and the MDB_MAP_RESIZED) will wait for something to be committed.

mdb_txn_commit() writes nothing if the txn didn't change anything.
It needs to notice that there is a mapsize change to write.

The doc talks about shrinking the map, but reduced mapsizes are not
written to the datafile.  Only increases are written.

All in all, it looks to me like _changing_ the mapsize should be an
operation on a write transaction or invoke a write transaction, while
setting the size or catching up with a mapsize change can be an
environment operation.  That way it would be possible to make sense of
it.  A txn can do it when it has no cursors and no dirty WRITEMAP
pages (or WRITEMAP could spill all pages first).

BTW, I don't see the point of mdb_env_write_meta() conditionally
skipping the mapsize write when the full page gets written to disk
anyway - as long as txn_begin() stashes the mapsize from the original
meta so it knows what to write.  (It need not obey the mapsize at that
point, but it must carry a change forward.)
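
A minimal sketch of that carry-forward idea, again in a simplified model
rather than LMDB's code.  All cf_* names, the resize_pending flag and the
dirty flag are assumptions made for illustration: txn begin stashes the
mapsize of the meta it starts from, and commit always writes a mapsize,
even for an otherwise empty txn when a resize is pending.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only (hypothetical names, not LMDB internals). */
typedef struct { size_t txnid; size_t mapsize; } cf_meta;
typedef struct { cf_meta metas[2]; } cf_file;
typedef struct { cf_file *file; size_t mapsize; int resize_pending; } cf_env;
typedef struct { cf_env *env; size_t txnid; size_t meta_mapsize; int dirty; } cf_txn;

const cf_meta *cf_newest(const cf_file *f)
{
    return f->metas[0].txnid > f->metas[1].txnid ? &f->metas[0] : &f->metas[1];
}

/* Record a requested size; the next commit writes it (grow or shrink). */
void cf_set_mapsize(cf_env *e, size_t size)
{
    e->mapsize = size;
    e->resize_pending = 1;
}

/* Begin: stash the mapsize from the meta this txn starts from. */
cf_txn cf_txn_begin(cf_env *e)
{
    const cf_meta *m = cf_newest(e->file);
    cf_txn t = { e, m->txnid + 1, m->mapsize, 0 };
    return t;
}

/* Commit: skip only if there is neither data nor a mapsize change to
 * write.  Otherwise write unconditionally: the env's own size if it asked
 * for a change, else the stashed size, so a change is carried forward. */
void cf_txn_commit(cf_txn *t)
{
    cf_env *e = t->env;
    cf_meta *slot;

    assert(e != NULL && e->file != NULL);
    if (!t->dirty && !e->resize_pending)
        return;
    slot = &e->file->metas[t->txnid % 2];
    slot->txnid = t->txnid;
    slot->mapsize = e->resize_pending ? e->mapsize : t->meta_mapsize;
    e->resize_pending = 0;
}

/* X requests a size and commits an empty txn; Y then commits real data
 * without knowing about the change.  Returns the size a new opener sees. */
size_t cf_scenario(size_t original, size_t requested)
{
    cf_file f = { { {0, original}, {1, original} } };
    cf_env x = { &f, original, 0 }, y = { &f, original, 0 };
    cf_txn tx, ty;

    cf_set_mapsize(&x, requested);
    tx = cf_txn_begin(&x);
    cf_txn_commit(&tx);          /* empty txn, still writes the new size */
    ty = cf_txn_begin(&y);
    ty.dirty = 1;
    cf_txn_commit(&ty);          /* carries X's size forward */
    return cf_newest(&f)->mapsize;
}
```

Here Y never adopts the new size for its own mapping; it only copies the
stashed value into the meta it writes, which matches "need not obey, but
must carry forward".  Because the write is unconditional, a reduced size
is recorded the same way as an increase.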