
Re: (ITS#7667) performance degradation when using MDB_INTEGERKEY



romange@gmail.com wrote:
> Hi, I extracted a small dataset that shows the problem.
> You can download it from here:
> https://docs.google.com/file/d/0B6o29pwkWoERdnFSaUtMNDljemc/edit?usp=sharing
>
> I modified mdb_copy.c to demonstrate the difference. Copy it into the source
> directory from here:
> https://docs.google.com/file/d/0B6o29pwkWoERd3VuUm1DN0FpcUU/edit?usp=sharing
>
> Build it and run "time ./mdb_copy foo foo2".
> After this, change the flag at line 64 and run it again.
> On my computer the difference is 17s vs 1.7s for 3 million items.

This test doesn't prove the existence of a bug. You're running on a
Little-Endian machine, so data that is in sorted order as a string is
effectively in hashed order when used as an integer. Your inserts therefore
arrive in a worst-case order, causing the worst possible random-access
strides through memory. Assuming the two orders are equivalent is a pretty
common mistake for DB programmers. Microsoft has done the same thing in
ActiveDirectory; I mentioned it here a few years ago:
http://www.openldap.org/lists/openldap-devel/200711/msg00002.html

If you had run this test on a Big-Endian machine, like SPARC, the insert order
would be identical either way, and the INTEGERKEY result would have been faster.
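
To make the byte-order point concrete, here is a minimal sketch in C. It
assumes 4-byte keys generated in ascending byte-string (memcmp) order; the
reporter's actual key layout may differ, and all function names below are
made up for illustration and are not part of LMDB. It decodes the same bytes
both ways and counts how many inserts would land before the largest key
already seen, i.e. how many are not simple appends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode 4 bytes as a big-endian integer: numeric order == memcmp order. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the same 4 bytes the way a little-endian CPU reads an integer. */
uint32_t load_le32(const unsigned char *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Fill buf (4*n bytes) with n keys in ascending byte-string order.
 * Assumption: keys are simply 0..n-1 written most significant byte first. */
void make_sorted_keys(unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t v = (uint32_t)i;
        buf[4 * i + 0] = (unsigned char)(v >> 24);
        buf[4 * i + 1] = (unsigned char)(v >> 16);
        buf[4 * i + 2] = (unsigned char)(v >> 8);
        buf[4 * i + 3] = (unsigned char)v;
    }
}

/* Walk the keys in the order given and count those that compare below the
 * largest key seen so far under the chosen decoding. Such keys cannot be
 * appended at the end of a sorted structure. */
size_t count_non_appends(const unsigned char *buf, size_t n,
                         uint32_t (*load)(const unsigned char *))
{
    size_t out_of_order = 0;
    uint32_t max = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t k = load(buf + 4 * i);
        if (i > 0 && k < max)
            out_of_order++;
        else
            max = k;
    }
    return out_of_order;
}
```

With 1024 such keys, count_non_appends(buf, 1024, load_be32) returns 0, while
count_non_appends(buf, 1024, load_le32) returns 765, so about three quarters
of the inserts land somewhere inside the existing key range when the bytes
are read as a little-endian integer.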

Closing this ITS, no bug.
>
>
>
> On Tue, Aug 20, 2013 at 9:04 PM, Quanah Gibson-Mount <quanah@zimbra.com> wrote:
>
>> --On Sunday, August 18, 2013 11:46 AM +0000 romange@gmail.com wrote:
>>
>>> Full_Name: Roman Gershman
>>> Version:
>>> OS: linux 3.8.0-25-generic
>>> URL:
>>> Submission from: (NULL) (212.150.97.210)
>>>
>>
>> Please provide further information, specifically:
>>
>> The size of values
>> Insert order
>> Sample code if possible
>>
>> Thanks,
>> Quanah
>>
>>
>> --
>>
>> Quanah Gibson-Mount
>> Lead Engineer
>> Zimbra, Inc
>> --------------------
>> Zimbra ::  the leader in open source messaging and collaboration
>>
>
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/