
Re: py-lmdb



Luke Kenneth Casson Leighton wrote:
We fell for the fantasy of parallel writes with BerkeleyDB, but after a
dozen-plus years of poking, profiling, and benchmarking it has all become
clear - all of that locking overhead and deadlock detection/recovery is just
a waste of resources.

  ... which is why tdb went to the other extreme, to show it could be done.

But even tdb only allows one write transaction at a time. I looked into writing a back-tdb for OpenLDAP back in 2009, before I started writing LMDB. I know pretty well how tdb works...
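For anyone following along with py-lmdb itself, here is a minimal sketch of what that single-writer model looks like from Python: readers run concurrently against an MVCC snapshot, while write transactions are serialized by the environment. The path and sizes below are placeholders, not anything from this thread.

    import lmdb

    # Open an environment; map_size is the maximum size of the data file in bytes.
    env = lmdb.open('/tmp/example-db', map_size=1 << 30)

    # Only one write transaction can be active per environment at a time;
    # a second writer simply waits until this one commits or aborts.
    with env.begin(write=True) as txn:
        txn.put(b'key1', b'value1')
        txn.put(b'key2', b'value2')

    # Read transactions never block the writer or each other; each one sees
    # the consistent snapshot that existed when it began.
    with env.begin() as txn:
        print(txn.get(b'key1'))

    env.close()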

https://twitter.com/hyc_symas/status/451763166985613312

quote:

"The new code is faster at indexing and searching, but not so much
faster it would blow you away, even using
LMDB. Turns out the slowness of Python looping trumps the speed of a
fast datastore :(. The difference
might be bigger on a big index; I'm going to run experiments on the
Enron dataset and see."

interesting.  so why are reads up at 5,000,000 per second under python
(in a python loop, obviously) but writes aren't?  something odd there.

Good question. I'd guess there's some memory allocation overhead involved in writes. The Whoosh guys have some more perf stats here:

https://bitbucket.org/mchaput/whoosh/wiki/Whoosh3

(Their test.Tokyo / All Keys result is highly suspect, though: the timing is the same for 100,000 keys as for 1M keys. Probably a bug in their test code.)
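To make the read-vs-write asymmetry concrete, here is a rough micro-benchmark sketch using py-lmdb; the operation count, key format, and value size are made up for illustration, and the absolute numbers will vary a lot by machine and build.

    import time
    import lmdb

    N = 100000  # operation count; placeholder, not taken from the Whoosh tests
    env = lmdb.open('/tmp/bench-db', map_size=1 << 30)

    # Write loop: one write transaction, one put per iteration. Each iteration
    # pays Python bytecode overhead plus allocation of a fresh key object.
    start = time.time()
    with env.begin(write=True) as txn:
        for i in range(N):
            txn.put(b'%010d' % i, b'x' * 16)
    print('writes/sec: %.0f' % (N / (time.time() - start)))

    # Read loop: each get returns the stored value out of the memory map; the
    # Python loop itself is still a large share of the per-operation cost.
    start = time.time()
    with env.begin() as txn:
        for i in range(N):
            txn.get(b'%010d' % i)
    print('reads/sec: %.0f' % (N / (time.time() - start)))

    env.close()

If I remember the binding right, py-lmdb can also hand back buffers instead of copied bytes on the read side (env.begin(buffers=True)), which trims per-get allocation; I don't know whether the Whoosh tests used that.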

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/