[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#9099) LMDB: range delete



opensource@gmx-topmail.de wrote:
> Full_Name: Markus
> Version: 
> OS: 
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (91.64.106.220)
> 
> 
> We built a database with LMDB and are very pleased with it in general. The one
> area giving our users headaches regularly are bulk deletes. But some context
> first:
> 
> Like many others, we use a hierarchical key notation. Some simplified examples:
> "user/00001", "user/00002", "user/00003", ...; "usergroup/00001",
> "usergroup/00002", ....
> 
> Or, let's say data by timestamp: "sensordata/32489732", "sensordata/32489733",
> ...
> 
> Use case like "remove all users" or "remove all sensordata entries from time x
> to y" are usual. Those (transactional) operations call mdb_cursor_del() on each
> entry. This runs fine until we have LOTS of entries to delete. At a magnitude of
> 100,000s of entries, two things happen: 1) the time it takes to delete gets
> quite noticeable and 2) the 32 bit (MDB_VL32) will run into MDB_TXN_FULL (see
> also #8813).
> 
> Given the key structure, those operations operate on consecutive keys. Because
> of that, can we make this more efficient (and get rid MDB_TXN_FULL on the fly)?
> 
> A high number of individual deletes involves lots of page merging (and a lot of
> dirty pages). If we had a specialized "range delete" this could likely be
> avoided. The signature of the proposed functionality could look like this:

The number of pages dirtied will be the same, whether you batch or delete one by one.
As such, this won't solve the MDB_TXN_FULL problem. Also, as of commit 7edf504106c61639a89b9a4e5987242598196932
this problem should have already been removed for MDB_VL32.

> int mdb_cursor_del_range(MDB_cursor *mc, MDB_val *key_start, MDB_val *key_stop,
> unsigned int flags)
> 
> Given key_start and key_stop, this new function could decide to prune entire
> leaf pages, or even entire sub branches. This should result in a significant
> efficiency/performance gain. The challenge I'm seeing here is balancing the tree
> though.

Right, rebalancing would become quite a pain. A similar problem exists for
PUT_MULTIPLE which is why it just internally loops over one item at a time.

> What do you think? I'm not sure if OpenLdap would profit here, but I'm quite
> certain that LMDB as a general database would. I'm also open to contribute if
> you are willing to give some guidance.

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/