Re: LMDB: Delete data after MDB_MAP_FULL
- To: email@example.com, firstname.lastname@example.org
- Subject: Re: LMDB: Delete data after MDB_MAP_FULL
- From: Howard Chu <email@example.com>
- Date: Mon, 26 Feb 2018 16:34:47 +0000
- In-reply-to: <trinity-e91f4241-54e0-49db-9e92-e6a0cbbe09ec-1519640898575@3c-app-gmx-bs73>
- References: <trinity-e91f4241-54e0-49db-9e92-e6a0cbbe09ec-1519640898575@3c-app-gmx-bs73>
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0 SeaMonkey/2.53a1
> In an app with quite a few installs, the LMDB data file grew to our maximum
> size of 1.5 GB. This was due to a programming error in the app. Because this
> affects a significant number of users, the general question is how to resolve
> the issue and delete the superfluous data. The obvious problem is that in LMDB
> you cannot delete data without growing the data file.
That's not entirely true; if you already have free pages available, it is
possible to delete data without using any new pages at all.
> The straightforward approach is to increase the max file size
> (mdb_env_set_mapsize) to, let's say, 2 GB and delete the superfluous data.
> However, because this is a mobile app, it is not guaranteed that additional
> disk space is available on all devices.
That's true regardless of whether you're on a mobile device.
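For reference, mdb_env_set_mapsize takes the new size in bytes, and the LMDB documentation recommends a multiple of the OS page size. The arithmetic can be sketched as follows, in Python purely for illustration. The sketch assumes the 1.5 GB and 2 GB figures from the question mean GiB, and round_up_to_page is a hypothetical helper name, not part of LMDB.

```python
import mmap


def round_up_to_page(size_bytes: int, page_size: int = mmap.PAGESIZE) -> int:
    """Round size_bytes up to the next multiple of page_size."""
    return -(-size_bytes // page_size) * page_size


GIB = 1024 ** 3

# Assumption: "1.5 GB" and "2 GB" in the question are read as GiB.
old_mapsize = round_up_to_page(3 * GIB // 2)
new_mapsize = round_up_to_page(2 * GIB)

# Worst-case extra disk space the larger map would allow the file to take.
headroom = new_mapsize - old_mapsize

print("value to pass to mdb_env_set_mapsize:", new_mapsize)
print("additional bytes the file may grow by:", headroom)
```

The map size is only an upper bound: the data file grows only as far as new pages are actually needed, so the headroom figure is the worst case rather than the expected growth.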
> 1) Would an additional 0.5 GB be relatively safe, i.e. enough space to
> delete around 1.2 GB of data (around 10M K/V pairs)? As far as I understand,
> the extra space is required to build up a new tree, creating new nodes on
> the fly. Because most of the K/V pairs will be deleted in that process, I
> would expect the tree to build a completely new set of branch pages but still
> use the existing leaf pages (data values). Is that correct?
Not quite. If you're deleting one key at a time, then a new leaf page still
needs to be created, even if it will eventually be emptied. However, once the
page is emptied it can be reused for whatever page is needed next.
> 2) Would it make any sense to delete the superfluous data in multiple
> transactions (let's say 10)? Smaller change sets should grow the file
> less, but it would only be worth the extra effort if the size savings are
> significant.
Yes, multiple smaller transactions mean the free pages will be recycled and
become reusable more quickly, so the overall file growth should be minimal.
> 3) Any additional thoughts or tricks that come to your mind in this scenario?
If you're deleting a lot of keys in sequence, it will be slightly faster to
delete them from the tail forward.
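The answers to 2) and 3) combine into a single loop: walk the keys from the end and commit after a fixed number of deletes. Below is a minimal sketch of that control flow in Python, using a plain sorted list as a stand-in for the database. The names delete_superfluous, is_superfluous and batch_size are hypothetical and not part of LMDB. With the C API the same shape would presumably use mdb_cursor_get with MDB_LAST and MDB_PREV, mdb_cursor_del, and an mdb_txn_commit followed by a fresh mdb_txn_begin at each batch boundary.

```python
def delete_superfluous(keys, is_superfluous, batch_size):
    """Delete matching keys from the sorted list `keys`, tail first.

    `keys` is an in-memory stand-in for an ordered store, not LMDB.
    Returns the number of batches, i.e. simulated write transactions.
    """
    commits = 0
    pending = 0
    i = len(keys) - 1              # start at the last key
    while i >= 0:
        if is_superfluous(keys[i]):
            del keys[i]            # only already-visited entries shift
            pending += 1
            if pending == batch_size:
                commits += 1       # a real write txn would commit here
                pending = 0
        i -= 1                     # step towards the head
    if pending:
        commits += 1               # commit the final partial batch
    return commits


# Hypothetical data: 1000 keys, of which everything from k0100 up is junk.
keys = [f"k{n:04d}" for n in range(1000)]
commits = delete_superfluous(keys, lambda k: int(k[1:]) >= 100, batch_size=100)
print(commits, len(keys), keys[-1])
```

In a real LMDB loop the cursor belongs to its write transaction, so it would need to be reopened and repositioned after each commit. The batch size is a tuning knob: smaller batches let freed pages become reusable sooner, at the cost of more commits.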
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/