
Re: Speed when writing duplicate keys



Tom Kirchner wrote:
Hi,

I have a database with about 1 million integer keys. Now I want to append a lot of small values (3 bytes each)
to random keys very often (tens of millions of times). What is the fastest way to insert into the database?

Currently I am trying the following setup, but I wonder if that is optimal, because I see a decrease
in indexing speed after about 500000 random writes (here appends):
- I have split the whole key range into 16 databases (each within a different lmdb environment), so each range-db holds
   about 65000 keys
- I open each range-db with the flags: MDB_APPENDDUP | MDB_INTEGERKEY | MDB_INTEGERDUP | MDB_DUPSORT
- then when I want to append 3 bytes to the value of a key, I use the following call:
   res = mdb_put( txn[dbnumber], db[dbnumber], &k, &v, MDB_APPENDDUP );

You will get the fastest throughput by keeping the writes as sequential as possible, i.e., process all of the values for a single key before moving on to a different key.
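
As a rough sketch of that batching idea: collect the pending appends in memory, sort them by key, and only then write them out, so that every value for one key is handled before the next key is touched. The struct pending type and the group_by_key() helper are made-up names for illustration, not part of the LMDB API. Ordering by value within a key is an assumption that matches how MDB_DUPSORT with MDB_INTEGERDUP orders duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record: one pending append (not an LMDB type). */
struct pending {
    uint32_t key;    /* integer key */
    uint32_t value;  /* value, already widened to 4 bytes */
};

/* Order by key first, then by value within the same key. */
static int cmp_pending(const void *a, const void *b)
{
    const struct pending *x = a;
    const struct pending *y = b;

    if (x->key != y->key)
        return x->key < y->key ? -1 : 1;
    if (x->value != y->value)
        return x->value < y->value ? -1 : 1;
    return 0;
}

/* Sort a batch in place so all appends for one key are adjacent. */
void group_by_key(struct pending *batch, size_t n)
{
    assert(batch != NULL || n == 0);
    if (n < 2)
        return;  /* nothing to reorder */
    qsort(batch, n, sizeof *batch, cmp_pending);
}
```

After group_by_key(batch, n), a single loop over the batch can issue the mdb_put() calls in key order. How large a batch to buffer before sorting is a tuning choice that depends on available memory.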

As documented, INTEGERKEY/INTEGERDUP are meant for native integers of the host machine. There are no CPUs in existence that natively use 3-byte integers. You should just use 4 bytes; note that LMDB pads odd-sized records internally anyway, so using 3-byte values isn't going to save you any space.
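
One possible way to widen the 3-byte records to native 4-byte integers is sketched below. pack3() and unpack3() are hypothetical helper names, and the byte order used here (first byte most significant) is an arbitrary assumption; any consistent choice would do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: widen a 3-byte record to a native 32-bit integer.
 * The first byte becomes the most significant of the three. */
uint32_t pack3(const unsigned char b[3])
{
    return ((uint32_t)b[0] << 16) | ((uint32_t)b[1] << 8) | (uint32_t)b[2];
}

/* Inverse of pack3(): recover the original 3 bytes. */
void unpack3(uint32_t v, unsigned char out[3])
{
    assert(v <= 0xFFFFFFu);  /* top byte must be unused */
    out[0] = (unsigned char)((v >> 16) & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)(v & 0xFFu);
}
```

The resulting uint32_t can then be stored with mv_size = sizeof(uint32_t) and mv_data pointing at it, which is the native-integer layout that INTEGERDUP expects.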

If all your values are the same size you should also use DUPFIXED - that will save a significant amount of overhead.


Thank you for your help!
-Tom

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/