[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#9099) LMDB: range delete



Full_Name: Markus
Version: 
OS: 
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (91.64.106.220)


We built a database with LMDB and are very pleased with it in general. The one
area giving our users headaches regularly are bulk deletes. But some context
first:

Like many others, we use a hierarchical key notation. Some simplified examples:
"user/00001", "user/00002", "user/00003", ...; "usergroup/00001",
"usergroup/00002", ....

Or, let's say data by timestamp: "sensordata/32489732", "sensordata/32489733",
...

Use case like "remove all users" or "remove all sensordata entries from time x
to y" are usual. Those (transactional) operations call mdb_cursor_del() on each
entry. This runs fine until we have LOTS of entries to delete. At a magnitude of
100,000s of entries, two things happen: 1) the time it takes to delete gets
quite noticeable and 2) the 32 bit (MDB_VL32) will run into MDB_TXN_FULL (see
also #8813).

Given the key structure, those operations operate on consecutive keys. Because
of that, can we make this more efficient (and get rid MDB_TXN_FULL on the fly)?

A high number of individual deletes involves lots of page merging (and a lot of
dirty pages). If we had a specialized "range delete" this could likely be
avoided. The signature of the proposed functionality could look like this:

int mdb_cursor_del_range(MDB_cursor *mc, MDB_val *key_start, MDB_val *key_stop,
unsigned int flags)

Given key_start and key_stop, this new function could decide to prune entire
leaf pages, or even entire sub branches. This should result in a significant
efficiency/performance gain. The challenge I'm seeing here is balancing the tree
though.

What do you think? I'm not sure if OpenLdap would profit here, but I'm quite
certain that LMDB as a general database would. I'm also open to contribute if
you are willing to give some guidance.

Thanks for your consideration,
Markus