(ITS#8179) LMDB: Feature Request: mdb_env_check_mapsize(env)

Full_Name: Scott Kevill
Version: LMDB 0.9.14
OS: CentOS 5-7
URL: ftp://ftp.openldap.org/incoming/scott-kevill-150628.patch
Submission from: (NULL) (

mdb_txn_begin / mdb_txn_renew will return MDB_MAP_RESIZED for the current
process if another process has changed the mapsize AND the data has grown past
the current process mapsize. However, by that point, in a multithreaded
environment it may be difficult to wind back to a "safe" point of no active txns
in order to call mdb_env_set_mapsize(env, 0) to update the mapsize.

I would like to request a new function:

    int  mdb_env_check_mapsize(MDB_env *env);

which returns either MDB_SUCCESS or MDB_MAP_RESIZED. It should be lightweight
enough to call frequently, so that a process can decide whether to initiate a
"safe" point at which to call mdb_env_set_mapsize(env, 0). It would return
MDB_SUCCESS if the environment has not been opened yet.
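
To make the requested semantics concrete, here is a minimal toy model. It is
not the attached patch and does not touch LMDB internals: toy_env,
toy_env_check_mapsize and the TOY_* constants are hypothetical names, and the
"shared" mapsize is modelled as a plain variable rather than a value read from
the environment. It also assumes that any difference between the two sizes
counts as "changed".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the LMDB API. All names here are hypothetical. */
enum {
    TOY_SUCCESS     = 0,
    TOY_MAP_RESIZED = -30785  /* mirrors the value of LMDB's MDB_MAP_RESIZED */
};

typedef struct toy_env {
    int           opened;          /* nonzero once the environment is open */
    size_t        local_mapsize;   /* size this process mapped the file with */
    const size_t *shared_mapsize;  /* size most recently published by a writer */
} toy_env;

/* Cheap check: compares two sizes, takes no locks, needs no "safe" point. */
int toy_env_check_mapsize(const toy_env *env)
{
    assert(env != NULL);
    if (!env->opened)
        return TOY_SUCCESS;        /* nothing mapped yet, nothing to refresh */
    /* Assumption: any difference counts as "changed". */
    return (*env->shared_mapsize != env->local_mapsize)
        ? TOY_MAP_RESIZED
        : TOY_SUCCESS;
}
```

The point of the sketch is only that the check is a comparison and can be
called at any time, even while other threads hold active txns.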

Example scenario:

- Process W is a writer
- Process R is a reader with multiple reader threads (read-only)
- Both processes are running and using a 2GB mapsize MDB
- Data is currently 1GB
- Ops decide W should increase mapsize to 3GB allowing for more expansion
- W calls mdb_env_set_mapsize(env, 3GB)
- R's mapsize remains at 2GB
- Data grows
- R:1 thread begins a moderately long read-only txn
- Data grows past 2GB
- R:2 thread tries to begin a read-only txn, but fails with MDB_MAP_RESIZED
- R:2 thread cannot call mdb_env_set_mapsize(env, 0), because of R:1's txn
- A major control flow issue now exists, as R:2 may be deep into work but can
neither continue nor back out

With mdb_env_check_mapsize(), R could periodically test it and if needed,
initiate a "safe" point where none of R's threads have active txns. This could
be done long before the data size actually reaches R's original mapsize (i.e.
when data is ~1GB rather than >2GB).

- Enforcing synchronisation to a "safe" point might be relatively expensive, so
it would be important to know first whether the mapsize has changed, i.e. making
mdb_env_set_mapsize(env, 0) lightweight for the unchanged mapsize case isn't
sufficient.

I've attached a sample patch against master (scott-kevill-150628.patch). I've
tested it against 0.9.14. The code is trivial and is based on the behaviour of
mdb_env_set_mapsize(env, 0). I didn't use git to create the patch, but it
should not be too difficult to get working. It is less than 10 new lines, but
for the record, I release it into the public domain. Feel free to massage it as
needed.