
Re: mdb fragmentation

To: Quanah Gibson-Mount <quanah@symas.com>, Geert Hendrickx <geert@hendrickx.be>
Subject: Re: mdb fragmentation
From: Klaus Malorny <Klaus.Malorny@knipp.de>
Date: Mon, 15 Jan 2018 11:33:15 +0100
Cc: karl.buchner@synacor.com, openldap-technical <openldap-technical@openldap.org>
Content-language: en-US
In-reply-to: <CA13EA4DE42F0DDC722631FA@[192.168.1.30]>
References: <20170824115332.GA24591@vera.ghen.be> <WM!f60c69a7b936abd4e28d4f8e6b7e88c99abd7edd8259fef9479778f7775493d33f50b3264410fb87f92e3f452a3e9731!@mailstronghold-1.zmailcloud.com> <1556375279.52119069.1503621017507.JavaMail.zimbra@symas.com> <WM!3e888f38d9b5e6fee4b7425121eb93c87cc6203e5182936ff9ce5450aef58f792c6f3cce7bcd1a15f32ea1d829c2459a!@mailstronghold-3.zmailcloud.com> <CA13EA4DE42F0DDC722631FA@[192.168.1.30]>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Thunderbird/59.0a1

On 03.01.18 00:06, Quanah Gibson-Mount wrote:

I wanted to follow up on this, based on an examination of Geert's database and other affected databases. Geert already has this answer, but it's useful for the general OpenLDAP community.
This fragmentation problem is not common. It depends entirely on the size of the entries in the database. The issue arises when entries in the LDAP DB are greater than the LMDB page size (usually 4 KB) and are then updated frequently. This most often occurs in one of two ways:
a) multi-valued attributes with a large number of values
b) a very large single-valued attribute (i.e., binary data)
For the first problem (a), there is code in the 2.5 release to address this problem, called multival. This feature puts multi-valued attributes with a (configurable) number of values into their own sub-database. For (b), there's not really a solution, but it's pretty rare.
So those who have entries that are < 4 KB will never see this problem. Note that this is the size of the binary entry on disk, not the size of the entry when exported to LDIF. The binary size is generally significantly smaller than the LDIF version.
--Quanah

Hi,

I did some research of my own on this issue in the meantime and learned more details about overflow/bigdata: A constant in the LMDB code requires that each tree page be able to store at least two tree nodes. So a node may not be larger than half of the page size (minus the page header size). As the node also contains the key data, the key contributes to the size of the node. With a maximum of 511 bytes for the key and a 4 KB page, only data roughly below 1500 bytes is guaranteed to be stored within the tree and not in overflow pages.
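
The arithmetic behind that estimate can be sketched as follows. This is only an illustration: the header sizes below (16-byte page header, 8-byte node header, 2-byte index slot) are my assumptions for a typical 64-bit build with 4 KB pages, and the function names are made up for this example; none of this is exposed by the LMDB API.

```python
# Assumed layout constants (64-bit build, 4 KB pages); not part of the LMDB API.
PAGE_SIZE = 4096     # usual page size
PAGE_HEADER = 16     # assumed page header size in bytes
NODE_HEADER = 8      # assumed per-node header size in bytes
INDEX_SLOT = 2       # assumed size of one entry in the page's node index
MIN_KEYS = 2         # each tree page must hold at least two nodes


def max_node_size(page_size=PAGE_SIZE):
    """Largest node that still lets MIN_KEYS nodes fit on one page."""
    share = (page_size - PAGE_HEADER) // MIN_KEYS
    share -= share % 2           # round down to an even size
    return share - INDEX_SLOT


def max_inline_data(key_size, page_size=PAGE_SIZE):
    """Largest value that stays inside the tree page for a given key size."""
    return max_node_size(page_size) - NODE_HEADER - key_size


print(max_node_size())         # 2038
print(max_inline_data(511))    # 1519
print(max_inline_data(16))     # 2014
```

Under these assumptions a value of up to 1519 bytes stays in the tree even with the longest possible key, which is consistent with the "roughly below 1500 bytes" figure; shorter keys leave correspondingly more room.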

With respect to overflow pages, note that a run of overflow pages also contains a single page header. Choosing exactly a multiple of the page size as the data size will thus definitely waste nearly a full page.
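
A small sketch of that effect, again assuming a 4 KB page and a 16-byte page header (my assumption, not a value obtained from the API), with one header at the start of a contiguous run of pages:

```python
PAGE_SIZE = 4096
PAGE_HEADER = 16   # assumed size of the single header at the start of the run


def overflow_pages(data_size, page_size=PAGE_SIZE):
    """Pages needed for one header followed by data_size bytes of data."""
    return (PAGE_HEADER + data_size + page_size - 1) // page_size


def wasted_bytes(data_size, page_size=PAGE_SIZE):
    """Unused bytes at the end of the run."""
    return overflow_pages(data_size, page_size) * page_size - PAGE_HEADER - data_size


for size in (4080, 4096, 8192):
    print(size, overflow_pages(size), wasted_bytes(size))
# 4080 1 0
# 4096 2 4080
# 8192 3 4080
```

So, under these assumptions, a value of exactly 4096 bytes occupies two pages and leaves 4080 bytes unused, while a value of 4080 bytes fits into a single page.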

Unfortunately, the various constants and calculated values cannot be retrieved via the regular API, so there is no safe way to deal with it from a user's perspective.

I have not yet investigated how LMDB stores released runs of pages and what strategies are used for allocation, specifically, whether only exactly matching sizes are taken or whether larger runs are broken up. In any case, I do not expect a fragmentation problem if all of the data requires only a few pages.

For the project where I am using LMDB, there is a certain likelihood that the data may be megabytes in size. I currently plan to revise the way the data is stored and to split it up into multiple chunks, each represented by an individual database entry. The chunks will be sized so that the number of overflow pages is always a power of two, e.g. 8, 16 and 32 pages, even if this creates unused space within the chunk. This will of course not stop the fragmentation, but should keep the problem at a much lower level.
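
One possible way to plan such chunks is sketched below. It is only an illustration of the idea, not a finished design: the 16-byte page header is an assumption, the run sizes are the ones from the example above, and `plan_chunks` is a hypothetical helper that fills 32-page chunks first and rounds the remainder up to the smallest allowed run.

```python
PAGE_SIZE = 4096
PAGE_HEADER = 16              # assumed header at the start of each overflow run
ALLOWED_RUNS = (8, 16, 32)    # power-of-two page counts, smallest first


def payload_capacity(pages):
    """Data bytes that fit into a run of `pages` overflow pages."""
    return pages * PAGE_SIZE - PAGE_HEADER


def plan_chunks(total_size):
    """Return a list of (pages, payload_bytes) tuples covering total_size bytes."""
    largest = ALLOWED_RUNS[-1]
    plan = []
    remaining = total_size
    # Fill as many maximum-size chunks as possible.
    while remaining > payload_capacity(largest):
        plan.append((largest, payload_capacity(largest)))
        remaining -= payload_capacity(largest)
    # Round the tail up to the smallest allowed run that can hold it.
    if remaining > 0:
        pages = next(p for p in ALLOWED_RUNS if payload_capacity(p) >= remaining)
        plan.append((pages, remaining))
    return plan


print(plan_chunks(300_000))
# [(32, 131056), (32, 131056), (16, 37888)]
```

Each tuple would become one database entry; the tail chunk may leave part of its run unused, which is the price for keeping all runs at power-of-two sizes.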


Regards,

Klaus

Follow-Ups:
- Re: mdb fragmentation
  - From: Howard Chu <hyc@symas.com>

References:
- Re: mdb fragmentation
  - From: Quanah Gibson-Mount <quanah@symas.com>
