Re: Issues arising from creating powerdns backend based on LMDB

To: Howard Chu <hyc@symas.com>
Subject: Re: Issues arising from creating powerdns backend based on LMDB
From: Mark Zealey <spam@markandruth.co.uk>
Date: Thu, 22 Aug 2013 23:58:31 +0300
Cc: openldap-technical@openldap.org
In-reply-to: <52167683.1060807@symas.com>
References: <52165B85.8060802@markandruth.co.uk> <52165F7A.7040901@markandruth.co.uk> <52167683.1060807@symas.com>
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130510 Thunderbird/17.0.6

On 22/08/13 23:37, Howard Chu wrote:

>> 1) Can you update documentation to explain what happens when I do a
>> mdb_cursor_del()? I am assuming it advances the cursor to the next
>> record (this seems to be the behaviour). However there is some sort of
>> bug with this assumption. Basically I have a loop which jumps
>> (MDB_SET_RANGE) to a key and then wants to do a delete until key is like
>> something else. So I do while(..) { mdb_cursor_del(),
>> mdb_cursor_get(..., MDB_GET_CURRENT)}. This works fine mostly, but
>> roughly 1% of the time I get EINVAL returned when I try to
>> MDB_GET_CURRENT after a delete. This always seems to happen on the same
>> records - not sure about the memory structure but could it be something
>> to do with hitting a page boundary somehow invalidating the cursor?
>
> That's exactly what it does, yes.


Any idea about the EINVAL issue?

> None of the memory behavior you just described makes any sense to me.
> LMDB uses a shared memory map, exclusively. All of the memory growth
> you see in the process should be shared memory. If it's anywhere else
> then I'm pretty sure you have a memory leak. With all the valgrind
> sessions we've run I'm also pretty sure that *we* don't have a memory
> leak.
>
> As for the random I/O, it also seems a bit suspect. Are you doing a
> commit on every key, or batching multiple keys per commit?


I'm not doing *any* commits, just one big txn for all the data...

The below C works fine up until i=4m (ie 500mb of resident memory
shown in top), then has massive slowdown, shared memory (again, as seen
in top) increases, waits about 20-30 seconds and then disks get hammered
writing 10mb/sec (200txns) when they are capable of 100-200mb/sec
streaming writes... Does it do the same for you?


#include <stdio.h>
#include <stdlib.h>
#include <lmdb.h>

int main(int argc, char *argv[]) {
    int i = 0, rc;
    MDB_env *env;
    MDB_dbi dbi;
    MDB_val key, data;
    MDB_txn *txn;
    char buf[40] = "";   /* zero-filled so the unused tail bytes are defined */
    int count = 100000000;

    rc = mdb_env_create(&env);
    rc = mdb_env_set_mapsize(env, (size_t)1024*1024*1024*10);
    rc = mdb_env_open(env, "./testdb", 0, 0664);
    rc = mdb_txn_begin(env, NULL, 0, &txn);
    rc = mdb_open(txn, NULL, 0, &dbi);

    for (i = 0; i < count; i++) {
        /* Pseudo-random first field, so keys do not arrive in sorted order. */
        sprintf(buf, "blah foo %9ld%9d%9d",
                (long)(random() * (float)count / RAND_MAX) - i, i, i);
        if (i % 100000 == 0)
            printf("%s\n", buf);
        /* Key and value are both the whole 40-byte buffer. */
        key.mv_size = sizeof(buf); key.mv_data = buf;
        data.mv_size = sizeof(buf); data.mv_data = buf;
        rc = mdb_put(txn, dbi, &key, &data, 0);
    }
    /* Everything goes in with a single commit at the end. */
    rc = mdb_txn_commit(txn);
    mdb_close(env, dbi);
    mdb_env_close(env);
    return 0;
}

By the way, I've just generated our biggest database (~4.5gb) from
scratch using our standard perl script. Using kyoto (treedb) with
various tunings it did it in 18 min real time vs lmdb at 50 minutes
(both ssd-backed in a box with 24gb free memory).


Mark
