RE: dissection of search latency



Before adding an entry cache, I'd like to see numbers comparing
the backends with fully indexed searches.  Of course, that means
someone needs to finish implementing the indexing.  I'm hoping
to get to it later this month but only after the referral code
is finished off.  (The referral code is needed for 2.1.)  Others
are welcome to jump in.

Kurt
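
The comparison asked for above (indexed versus unindexed searches, timed over a large number of non-sequential queries, cold and then warm) could be set up roughly as follows. This is only a sketch in Python, not slapd code: `time_searches` and `make_backends` are hypothetical names, and the list scan and the dict lookup merely stand in for an unindexed and an indexed backend search.

```python
import random
import time


def make_backends(n):
    """Build two stand-in search functions over n fake entries.

    Both are assumptions for illustration: a linear scan plays the
    unindexed search, a dict lookup plays the indexed search.
    """
    entries = [("uid=%d" % i, i) for i in range(n)]
    index = dict(entries)

    def unindexed(key):
        # Walk every entry until the key matches.
        return next(value for name, value in entries if name == key)

    def indexed(key):
        # Single lookup through the index.
        return index[key]

    return unindexed, indexed


def time_searches(search, keys, rounds=2, seed=0):
    """Time search(key) over shuffled keys; the first round is the cold pass."""
    order = list(keys)
    random.Random(seed).shuffle(order)  # non-sequential access pattern
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        for key in order:
            search(key)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling `time_searches` once per backend with the same key list gives one elapsed time per round, so the cold and warm passes of each backend can be compared side by side.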

At 11:31 AM 2001-11-15, Jonghyuk Choi wrote:

>Yes, your assumptions are correct.
>Both of them use the same Sleepycat BDB 3.3 and use btree.
>The test system is Pentium III 1GHz (1CPU)  with 512MB of RAM,
>and there is no swapping during test.
>
>When back-bdb has entry caches, search performance will be of the same
>order as back-ldbm, and even better if the caches are made more efficient,
>because search operations do not have to be transaction protected.
>
>- Jong
>
>------------------------
>Jong Hyuk Choi
>jongchoi@us.ibm.com
>IBM Thomas J. Watson Research Center
>Enterprise Linux Group
>
>
>"Howard Chu" <hyc@highlandsun.com>@OpenLDAP.org on 2001-11-14 06:35:55 PM
>
>Sent by:  owner-openldap-devel@OpenLDAP.org
>
>
>To:   "Lawrence Greenfield" <leg+@andrew.cmu.edu>,
>      <openldap-devel@OpenLDAP.org>
>cc:
>Subject:  RE: dissection of search latency
>
>
>
>
>> -----Original Message-----
>> From: owner-openldap-devel@OpenLDAP.org
>> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Lawrence
>> Greenfield
>
>> A search with no indexing isn't a very interesting benchmark; this
>> isn't what we should be optimizing for.
>
>Unfortunately, back-bdb's indexing support is still incomplete, so doing it
>any other way wouldn't make a very interesting benchmark either. Besides,
>code paths that are slow here will still be relatively slow in the indexed
>case. But yes, it will also be necessary to analyze the timing for indexed
>searches using some large number of non-sequential queries.
>>
>> Berkeley db with txns is fairly slow at iterations, though 0.5 seconds
>> to iterate across 10000 entries seems pretty bad.  Is there any
>> locking going on besides the internal locks in Berkeley db?  (On the
>> plus side, the Berkeley db code will allow multiple iterators
>> simultaneously.)
>
>No, back-bdb does no locking of its own, everything is left up to Berkeley
>db. Makes me wonder about the approach though; I don't see back-bdb being
>viable as a general-purpose backend with such a high performance cost. Yes,
>some people will sleep better at night knowing they can recover from
>catastrophic disk failure, but those data-paranoid people also run redundant
>hardware already, and don't need to be coddled by overprotective software.
>
>> Also, Berkeley db does internal caching; no I/O goes on the second
>> time around anyway.
>
>True... However, it's probably still better to keep data in the slapd
>internal format than in the disk format, especially for frequently
>re-accessed entries.
>
>  -- Howard Chu
>  Chief Architect, Symas Corp.       Director, Highland Sun
>  http://www.symas.com               http://highlandsun.com/hyc
>  Symas: Premier OpenSource Development and Support
>
>---------------------------------------------------------------------------
>
>Fascinating. Assuming that both backends were built using the same version
>of Berkeley DB, I'd guess the slowdown in back-bdb is due to transaction
>management. Except, the search code is not transaction-protected, so it's
>hard to say what the real issue is. I believe in both cases Berkeley DB is
>used with the Btree access method, so again the difference in size and
>access times is rather odd.
>
>Out of curiosity, how much RAM was available on the system during these
>tests? I'll assume that swapping activity was zero at all times?
>
>When the back-bdb database was built, were all of the log files committed
>and then removed? I'm not sure it has any relevance to runtime performance,
>but obviously it consumes disk space. (I don't recall if ext2fs cares, but
>Berkeley Fast Filesystem performance always degraded if the usage went
>above 90%.)
>
>A couple of the times are actually slower in the warm case, and it may not
>just be a measurement anomaly since it appears for both backends. I wonder
>why that is.
>
>  -- Howard Chu
>  Chief Architect, Symas Corp.       Director, Highland Sun
>  http://www.symas.com               http://highlandsun.com/hyc
>  Symas: Premier OpenSource Development and Support
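
The entry cache discussed in this thread, which keeps frequently re-accessed entries in decoded form instead of re-reading the disk format, can be sketched as below. This is a minimal Python illustration and not the back-ldbm or back-bdb implementation: `EntryCache`, `decode_entry`, the `attr: value` record layout, and the `id -> bytes` store are all hypothetical stand-ins.

```python
from collections import OrderedDict


def decode_entry(raw):
    """Hypothetical decoder: turn an 'attr: value' record into a dict of lists."""
    entry = {}
    for line in raw.decode("utf-8").splitlines():
        attr, _, value = line.partition(": ")
        entry.setdefault(attr, []).append(value)
    return entry


class EntryCache:
    """LRU cache of decoded entries keyed by entry ID (illustrative only)."""

    def __init__(self, store, capacity=1000):
        self.store = store            # stands in for the database: id -> bytes
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, entry_id):
        if entry_id in self.entries:
            self.entries.move_to_end(entry_id)  # mark as most recently used
            self.hits += 1
            return self.entries[entry_id]
        self.misses += 1
        # A miss pays for the fetch and the decode; a hit skips both.
        entry = decode_entry(self.store[entry_id])
        self.entries[entry_id] = entry
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return entry
```

The `hits` and `misses` counters are there so that a benchmark run can report how much of the warm-case speedup actually comes from the cache rather than from Berkeley DB's own internal caching.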