
Aw: Re: Speed when writing dublicate keys

To: openldap-technical@openldap.org
Subject: Aw: Re: Speed when writing dublicate keys
From: "Tom Kirchner" <tokirc@gmx.net>
Date: Fri, 13 Feb 2015 11:35:26 +0100
Importance: normal
In-reply-to: <54DDC215.30607@symas.com>
References: <trinity-b59e572b-da4a-4b66-94f1-6493ac7fc8e0-1423815724863@3capp-gmx-bs69>, <54DDC215.30607@symas.com>
Sensitivity: Normal

Tom Kirchner wrote:
> - I have split the whole key range into 16 databases (each within a different lmdb environment), so each range-db holds
> about 65000 keys

Is it even worth splitting the keys into different databases? Is 1 million keys even a large enough number that one would expect a significant performance improvement from splitting the key range? If the depth of a perfectly balanced B+tree is about 4.8 for 65,000 keys, it is about 6 for 1 million keys, which is not much deeper (if I calculated correctly).
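
For reference, the depth estimate can be reproduced with a short sketch. The figures 4.8 and 6 correspond to the base-10 logarithm of the key count, i.e. an assumed fan-out of 10 entries per page. That fan-out and the function name btree_depth are assumptions for illustration only; the real fan-out in LMDB depends on page size and key size, and a larger fan-out would make both trees shallower.

```python
import math


def btree_depth(num_keys: int, fanout: int) -> float:
    """Estimate the depth of a perfectly balanced B+tree as log_fanout(num_keys)."""
    return math.log(num_keys) / math.log(fanout)


if __name__ == "__main__":
    for n in (65_000, 1_000_000):
        # fanout=10 is the assumption that matches the figures quoted in the text;
        # fanout=100 is shown only to illustrate how a wider page flattens the tree.
        print(n, round(btree_depth(n, 10), 1), round(btree_depth(n, 100), 1))
```

Under either assumed fan-out, the difference in depth between the split and the unsplit key range stays small.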

Howard Chu wrote:
> You will get fastest throughput by doing as much sequential activity as possible. I.e., process all of the values for a single key before moving on to a different key.

I will try to do that, but that's difficult in my case. I read that the values of a duplicate key are appended to the same binary blob for that key. Is that correct? Might it be worth trying to index the data in batches: index 500,000 values (appending to duplicate keys) into a temporary db and then sequentially write the values from the temporary db to the final db?
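
The batching idea could be sketched roughly as follows. This is not LMDB code: it swaps the temporary db for an in-memory list, and write_values is a hypothetical callback standing in for whatever writes all values of one key to the final db. The point is only to show how sorting a batch by key turns scattered writes into one pass per key.

```python
from itertools import groupby
from operator import itemgetter


def flush_batch(batch, write_values):
    """Sort a batch of (key, value) pairs and hand over each key's values together."""
    batch.sort(key=itemgetter(0))  # stable sort, so per-key arrival order is kept
    for key, group in groupby(batch, key=itemgetter(0)):
        # write_values is a hypothetical callback, e.g. a loop of puts for one key
        write_values(key, [value for _, value in group])
    batch.clear()


def index_in_batches(pairs, write_values, batch_size=500_000):
    """Collect pairs into batches and flush each batch in key order."""
    batch = []
    for pair in pairs:
        batch.append(pair)
        if len(batch) >= batch_size:
            flush_batch(batch, write_values)
    if batch:  # flush the final, partially filled batch
        flush_batch(batch, write_values)
```

Whether this pays off depends on how many distinct keys fall into one batch; with few repeats per batch, each key is still visited once per batch rather than once overall.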

Howard Chu wrote:
> If all your values are the same size you should also use DUPFIXED - that will save a significant amount of overhead.

Thanks for the hint, I must have overlooked it in the documentation. Will try that!

I read the LMDB documentation at http://symas.com/mdb/doc/index.html, but I sometimes find it hard to get my head around the descriptions of the various flags one can use (e.g. for db open and put/get), especially MDB_INTEGERKEY and the like. Some additional explanations might be helpful, for example an answer to "How is the key size for fixed-size integer keys set?". Other than that, I find the documentation very thorough. A glossary might also be helpful.
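
On the key size question, the MDB_INTEGERKEY description reads as if there is no separate size setting: keys are native-byte-order integers of type unsigned int or size_t, all keys in a database must have the same size, and the size is simply the length of the key buffer passed in. A minimal sketch of preparing such keys, assuming that reading is right (the helper names encode_key and decode_key are hypothetical, and nothing here calls LMDB itself):

```python
import struct

# Native byte order and native sizes, matching the C types on this machine.
UINT_KEY = struct.Struct("@I")    # unsigned int
SIZE_T_KEY = struct.Struct("@N")  # size_t


def encode_key(n: int, fmt: struct.Struct = UINT_KEY) -> bytes:
    """Pack an integer into a fixed-size native key; its length is the key size."""
    return fmt.pack(n)


def decode_key(raw: bytes, fmt: struct.Struct = UINT_KEY) -> int:
    """Unpack a key produced by encode_key with the same format."""
    return fmt.unpack(raw)[0]


if __name__ == "__main__":
    key = encode_key(42)
    print(len(key), decode_key(key))
```

Mixing the two formats within one database would violate the same-size requirement, so one format should be picked per database.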

-Tom

References:
- Speed when writing dublicate keys
  - From: "Tom Kirchner" <tokirc@gmx.net>
- Re: Speed when writing dublicate keys
  - From: Howard Chu <hyc@symas.com>
