
Re: LMDB and multiple processes



Brian G. Merrell wrote:
On Wed, Jun 4, 2014 at 10:22 AM, Howard Chu <hyc@symas.com> wrote:

    Brian G. Merrell wrote:

        Hi all,

        First, I'm having trouble finding resources to answer a question like this
        myself, so please forgive me if I've missed something.


    http://symas.com/mdb/doc/


Thanks.  I did see and skim the API portion of the docs before asking, but I
was just having trouble knowing how the pieces fit together to solve a problem.

Skimming isn't going to cut it.

    Your reader process should be using read transactions.

OK, I interpret this as meaning that I need to pass the MDB_RDONLY flag to
mdb_txn_begin.  Is that correct?

Yes.

    In the actual LMDB API read transactions can be reused by their creating
    thread, so they are zero-cost after the first time. I don't know if any of
    the other language wrappers leverage this fact.

This helps a lot.  I will investigate what the case is with gomdb.


    Opening a DBI only needs to be done once per process. Opening per
    transaction would be stupid, like reopening a file handle on every request.


I suspected so.  The fact that mdb_dbi_open takes a transaction had me
confused a bit, because I thought I would need to pass in the new transaction
every time I got a transaction from mdb_txn_begin.

mdb_dbi_open takes a txn because it needs one if you're creating a DB for the first time. I.e., it must write metadata for the DB into the environment, and all writes to MDB must be inside a txn. But once that txn is committed, the DBI itself lives on until mdb_dbi_close. This is all already explained in the doc for mdb_dbi_open; if you hadn't skimmed you would have seen it already.

Most of this is only a concern when you're using named subDBs. The default unnamed DB always exists, so its DBI is always valid anyway.

I've refactored the reader to look like this:


env = NewEnv()
env.Open("/tmp/foo", 0, 0664)
txn = BeginTxn(nil, mdb.RDONLY) // parent txn is the nil arg
dbi = txn.DBIOpen(nil, 0)
txn.Abort()

You want mdb_txn_reset() here, not abort. Abort frees/destroys the txn handle so it cannot be reused.

while {
      txn = BeginTxn(nil, mdb.RDONLY) // parent txn is the nil arg
and here you want mdb_txn_renew(), to reuse the txn handle instead of creating a new one.

      for i = 0; i < n_entries; i++ {
          key = sprintf("Key-%d", i)
          val = txn.Get(dbi, key)
          print("%s: %s", key, val)
      }
      txn.Commit()
and you want mdb_txn_reset() here too, not commit. Commit also frees/destroys the txn handle.

      sleep(5)
}

You can abort or commit the txn during your process teardown phase to dispose of it.

env.DBIClose(dbi)


Now, I guess the big question is whether BeginTxn inside the loop is zero-cost.

Thanks for the tips so far Howard; it has been very helpful.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/