Full_Name: Jong-Hyuk Choi
Version: HEAD
OS: Linux
URL: ftp://ftp.openldap.org/incoming/bdb-scalability-patch-jongchoi-050708.diff
Submission from: (NULL) (129.34.20.23)

This ITS issue provides a scalability patch for Berkeley DB which contains a buddy memory allocator for the Berkeley DB memory pool.

While investigating the scalability of OpenLDAP and developing scalability-enhancing techniques such as index clustering (ITS#3611), I found that those techniques become almost ineffective as soon as the working set grows larger than the DB cache size. Profiling identified that Berkeley DB's shared memory allocator, __db_shalloc(), consumes most of the CPU cycles finding a free memory chunk that can accommodate the requested size. __db_shalloc() uses a simple first-fit linear algorithm, which obviously causes serious performance degradation once there is a high degree of external fragmentation.

The new memory allocator contained in this patch, __db_smalloc(), implements a buddy memory allocator which provides O(1) behavior with respect to the number of objects in a region, while preserving the same semantics as the original __db_shalloc(): persistence of allocations, the ability to remap to arbitrary virtual addresses, and proper support for alignment and region sizes.

The new memory allocator can coexist with the original one. In the current patch, only the memory pool subsystem is set to use the new buddy allocator; the rest, including the logging and locking subsystems, still use the original linear allocator. They can also be configured to use the new one by setting a flag in the corresponding region information data structure.

The target of this patch is Berkeley DB version 4.2.52 without encryption support. Applicability to other versions of Berkeley DB has not been tested.

The following graph shows the performance of directory population (slapadd).
The x-axis is the number of directory entries added and the y-axis is the time taken to perform directory population, in seconds.

System Under Test
Server: IBM eServer x335 with 8 2.8GHz Xeons and 12GB of memory
Disk: 3 IBM ServeRAID disk arrays (RAID 5, 15K RPM disks); the database environment, log files, and LDIF file each sit on a separate disk array
OS: SLES9
DB Cache Size: 1.68GB
DIT: DirectoryMark generated; 1 organization, 10 organizationalUnit, and the rest organizationalPerson
Indexing: equality for objectClass; equality and substring for cn

(Similar performance results could be obtained with smaller-scale servers because slapadd is single-threaded and the SUT is a 32-bit platform.)

ftp://ftp.openldap.org/incoming/slapadd-perf-choi-050708.png (full scale)
ftp://ftp.openldap.org/incoming/slapadd-perf-zi-choi-050708.png (zoomed in)

The first bar in each group represents the performance of the current HEAD, and the second bar that of the current HEAD plus the index clustering patch (ITS#3611), both without the Berkeley DB scalability patch. The third and fourth bars represent the same configurations with the Berkeley DB scalability patch.

First compare the first and second bars in each group. As shown in the graph, index clustering becomes less effective beyond 1 million directory entries. The directory population time increases linearly up to 1 million entries but grows more than quadratically from that point on. While index clustering achieves close to a 70% speedup with 1 million entries, the speedup is at most 16% with 4 million entries.

With the Berkeley DB scalability patch, the directory population time grows almost linearly and the index clustering patch remains effective up to 8 million entries. The speedups achieved by the Berkeley DB scalability patch with 4 million entries are 6.49 and 9.12 with and without index clustering, respectively. The overall speedup achieved by applying both the Berkeley DB scalability patch and index clustering is 10.52.
I'm currently gathering more performance data with different configurations and with larger LDIF files. I will post further results as follow-ups to this ITS issue.

- Jong-Hyuk

------------------------
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P.O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com
(phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
moved from Incoming to Development
Hi Jong,

I took a look at the other effects this patch has, i.e., what happens when you put OpenLDAP in a mixed RW situation with a version of BDB compiled with your patches in place. The results are devastating.

Without the patch, using a 29% Write, 71% Read scenario:

630.304 operations/second
182.671 modifies/second
447.632 searches/second

With the patch, using a 29% Write, 71% Read scenario:

53.707 operations/second
15.554 modifies/second
38.153 searches/second

As you can see, the performance is more than 10 times worse with the patch in place than without it.

This was using the OpenLDAP 2.3.4 release on a 2-CPU SunFire v210 with 2 GB of RAM, running Solaris 8.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
Any profiling data or clues on where that degradation comes from? How many entries did you add for this experimental setup?

- Jong-Hyuk

quanah@symas.com wrote:
> I took a look at the other effects this patch has, i.e., what happens when
> you put OpenLDAP in a mixed RW situation with a version of BDB compiled
> with your patches in place. The results are devastating.
> [...]
--On Wednesday, July 13, 2005 7:15 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Any profiling data or clues on where that degradation comes from?

I haven't done so at this time. However, the only difference was using a version of BDB with your patch, and a version without it.

> How many entries did you add for this experimental setup?

None. This was done on an existing 10k database, seeing how it performed in a mixed read/write scenario, where 29% of the operations were MODs and 71% were equality searches.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
> --On Wednesday, July 13, 2005 7:15 AM -0400 Jong-Hyuk
> <jongchoi@watson.ibm.com> wrote:
>
>> Any profiling data or clues on where that degradation comes from?
>
> I haven't done so at this time. However, the only difference was
> using a version of BDB with your patch, and a version without it.

Can you confirm that you applied the BDB scalability patch without the index clustering patch?

>> How many entries did you add for this experimental setup?
>
> None. This was done on an existing 10k database, seeing how it
> performed in a mixed read/write scenario, where 29% of the operations
> were MODs and 71% were equality searches.
--On Wednesday, July 13, 2005 12:45 PM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Can you confirm that you applied the BDB scalability patch without the
> index clustering patch?

Since it is impossible to apply the index clustering patch to OpenLDAP at this time, I can quite happily confirm that I didn't apply it.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
Quanah,

Here are more performance evaluation results of the Berkeley DB scalability patch. My results revealed no sign of the performance degradation reported in your result, and I couldn't find any good reason why you would see such devastating results. Here they go:

====================================================================================
SUT: IBM eServer x445 with 8 cpus (only 4 cpus were used for this experiment)
BDB cache size: 1.6GB
OpenLDAP caches are set to minimum to observe the BDB cache effect only
(both entry cache and idl cache were set to 100 in size)
Indexing: equality indexing on uid
====================================================================================
Performance under no memory pressure
(DirectoryMark messaging scenario with a 512k entry DIT)
------------------------------------------------------------------------------------
With the Scalability Patch
population: 4m18.856s
warming up: 54.442s (search all entries locally)
messaging search (remote) from 8 clients for 5 mins: 11442 ops/sec
messaging search (remote) + 2 ldapadds (remote): search: 6293 ops/sec; add: 657.43 ops/sec
------------------------------------------------------------------------------------
Without the Scalability Patch
population: 4m18.317s
warming up: 52.361s (search all entries locally)
messaging search (remote) from 8 clients for 5 mins: 11632 ops/sec
messaging search (remote) + 2 ldapadds (remote): search: 6416 ops/sec; add: 655.16 ops/sec
====================================================================================
Performance under memory pressure
(DirectoryMark messaging scenario with a 4m entry DIT)
------------------------------------------------------------------------------------
With the Scalability Patch
population: 38m23.576s
warming up: 13m30.087s (search all entries locally)
messaging search (remote) from 8 clients for 5 mins: 5420 ops/sec
messaging search (remote) + 2 ldapadds (remote): search: 4421 ops/sec; add: 342.63 ops/sec
------------------------------------------------------------------------------------
Without the Scalability Patch
population: 77m14.390s
warming up: 85m55.154s (search all entries locally)
messaging search (remote) from 8 clients for 5 mins: 268 ops/sec
messaging search (remote) + 2 ldapadds (remote): search: 166 ops/sec; add: 18.97 ops/sec
====================================================================================

As you can see in the above results, virtually no performance degradation was observed under the no-memory-pressure condition. With memory pressure, the Berkeley DB scalability patch significantly improves the scalability of both pure searches and mixed operations (searches and adds).

- Jong-Hyuk

quanah@symas.com wrote:
> I took a look at the other effects this patch has, i.e., what happens when
> you put OpenLDAP in a mixed RW situation with a version of BDB compiled
> with your patches in place. The results are devastating.
> [...]
Jong,

I haven't had time to try this patch myself yet, but I was wondering if you've submitted this patch to Sleepycat and gotten any feedback from them? It would probably be enlightening to get their analysis.

jongchoi@watson.ibm.com wrote:
> Here are more performance evaluation results of the Berkeley DB
> scalability patch. My results revealed no sign of the performance
> degradation reported in your result, and I couldn't find any good
> reason why you would see such devastating results.

--
Howard Chu
Chief Architect, Symas Corp.  Director, Highland Sun
http://www.symas.com          http://highlandsun.com/hyc
Symas: Premier OpenSource Development and Support
--On Friday, July 15, 2005 1:26 PM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Here are more performance evaluation results of the Berkeley DB
> scalability patch. My results revealed no sign of the performance
> degradation reported in your result, and I couldn't find any good
> reason why you would see such devastating results.
>
> As you can see in the above results, virtually no performance degradation
> was observed under the no-memory-pressure condition. With memory pressure,
> the Berkeley DB scalability patch significantly improves the scalability
> of both pure searches and mixed operations (searches and adds).

Hi Jong,

I'm testing again. Looking back into the test messages, I see that slapd actually core dumped, which is why the results dropped so low. I've not had the stock 2.3.4 with BDB 4.2.52+patches core dump on me at any time previously, so this would be a new behavior.

I've reloaded the database from scratch, and am re-testing.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
--On Friday, July 15, 2005 6:41 PM -0700 Quanah Gibson-Mount <quanah@symas.com> wrote:

> I'm testing again. Looking back into the test messages, I see that
> slapd actually core dumped, which is why the results dropped so low.
> I've not had the stock 2.3.4 with BDB 4.2.52+patches core dump on me at
> any time previously, so this would be a new behavior.
>
> I've reloaded the database from scratch, and am re-testing.

Hi Jong,

The second test looks good.

Without patch:
630.304 ops/second
182.671 mod/second
447.632 searches/second

With patch:
632.156 ops/second
183.216 mod/second
448.941 searches/second

So a definite (though slight in my small database case) improvement. Yeah. :)

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
--On Friday, July 15, 2005 6:57 PM -0700 Quanah Gibson-Mount <quanah@symas.com> wrote:

> The second test looks good.
>
> Without patch:
> 630.304 ops/second
> 182.671 mod/second
> 447.632 searches/second
>
> With patch:
> 632.156 ops/second
> 183.216 mod/second
> 448.941 searches/second
>
> So a definite (though slight in my small database case) improvement.

Hi Jong,

I'm curious whether you have tested this patch on DBs with a large number of indices. I put together a 100K entry DB (I unfortunately do not have 12GB RAM machines at my disposal) with 21 indices in it:

index default eq
index objectClass
index uid
index departmentNumber
index cn eq,sub,approx
index sn eq,sub,approx
index l
index ou
index telephonenumber eq,sub
index userpassword
index givenName eq,sub,approx
index mail
index carLicense
index employeeType
index homephone eq,sub
index mobile eq,sub
index pager eq,sub
index o
index roomNumber
index preferredLanguage
index postalCode
index st

I then tested the difference in load times with and without your DB scalability patch. The effect I saw of having a large set of indices is that the DB scalability patch does not perform as well as the unpatched BDB.

Results w/o patch:
-------------------------------------
538.67u 7.76s 9:13.34 98.7%
537.08u 7.69s 9:12.05 98.6%
535.90u 8.17s 9:10.99 98.7%

Results w/ patch:
-------------------------------------
726.48u 13.03s 14:41.07 83.9%
727.89u 12.04s 14:37.65 84.3%
727.35u 12.06s 14:39.21 84.0%

One thing immediately evident is that CPU usage drops a good 14-15% compared to the unpatched run, with a corresponding 1/3 increase in total time to load the database.

Have you done any testing of your patch on large-scale DBs with a good number of indices?

Regards,
Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
Quanah, thanks for your input and experimental data. I believe that you need more data points to draw a meaningful comparison, though. In particular, I'd like to see 1) what the result would be for larger DITs, say ones with 1-4 million entries; and 2) the BDB cache sizes and actual RSS for each of these experiments. I'm afraid that you are looking at only one point in the whole scalability picture.

Actually, I have been busy doing baseline scalability evaluation so far. I was able to add up to 64 million entries within a reasonable amount of time (currently doing the same with 128 million).

- Jong-Hyuk

Quanah Gibson-Mount wrote:
> I'm curious whether you have tested this patch on DBs with a large number
> of indices. I put together a 100K entry DB (I unfortunately do not
> have 12GB RAM machines at my disposal) with 21 indices in it:
> [...]
> I then tested the difference in load times with and without your DB
> scalability patch. The effect I saw of having a large set of indices
> is that the DB scalability patch does not perform as well as the
> unpatched BDB.
> [...]
> One thing immediately evident is that CPU usage drops a good
> 14-15% compared to the unpatched run, with a corresponding 1/3 increase
> in total time to load the database.
>
> Have you done any testing of your patch on large-scale DBs with a
> good number of indices?
--On Thursday, July 28, 2005 9:02 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Quanah, thanks for your input and experimental data. I believe that you
> need more data points to draw a meaningful comparison, though.
> In particular, I'd like to see 1) what the result would be for larger
> DITs, say ones with 1-4 million entries; and 2) the BDB cache sizes and
> actual RSS for each of these experiments. I'm afraid that you are
> looking at only one point in the whole scalability picture.
>
> Actually, I have been busy doing baseline scalability evaluation so far.
> I was able to add up to 64 million entries within a reasonable amount of
> time (currently doing the same with 128 million).

Hi Jong,

I don't think you read my email very closely.

1) I noted I don't have systems with large amounts of RAM to test large databases on.

2) I didn't say this was a meaningful comparison. I said this is behavior I noticed when using a large set of indices.

I am quite aware I'm only looking at one small data point. But what is significant to me about that data point, and what you said in your previous emails about this patch, is that you are not using a significant number of indices. You are only using two (one objectClass eq, one cn eq,sub). What I saw in my 100k test when I went from 3 indices to 21 is that the scalability patch begins to suffer. Which is why I then asked, at the end of my email:

>> Have you done any testing of your patch on large-scale DBs with a
>> good number of indices?

That is the question I'm interested in having an answer to. It's great if the patch holds up if you do 500 billion entry databases. If it can't function when you have more than some X number of indices, where X is rather small, then its usefulness becomes suspect.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
> Hi Jong,
>
> I don't think you read my email very closely.
>
> 1) I noted I don't have systems with large amounts of RAM to test
> large databases on.

Although I'm testing the BDB scalability patch on a machine with 12GB of RAM, the BDB cache size is set to only 1.6GB; it's a 32-bit platform. What is evident between the lines of the performance results of the BDB scalability patch is that the patch makes it possible to load large databases on memory-constrained systems, which is typical of small to medium scale servers such as IBM BladeCenter and OpenPower. Hence, you should not consider the lack of a large amount of RAM an inhibitor to large database experiments.

> 2) I didn't say this was a meaningful comparison. I said this is
> behavior I noticed when using a large set of indices.

What I suspect is that the performance concern you reported relates more to the memory pressure caused by heavy indexing than to the heavy indexing itself. If you perform more experiments with varying degrees of indexing or varying sizes of DIT, it will become clear where the performance difference comes from.

> I am quite aware I'm only looking at one small data point. But what
> is significant to me about that data point, and what you said in your
> previous emails about this patch, is that you are not using a
> significant number of indices. You are only using two (one
> objectClass eq, one cn eq,sub). What I saw in my 100k test when I
> went from 3 indices to 21 is that the scalability patch begins to
> suffer. Which is why I then asked, at the end of my email:
>
>>> Have you done any testing of your patch on large-scale DBs with a
>>> good number of indices?
>
> That is the question I'm interested in having an answer to. It's great
> if the patch holds up if you do 500 billion entry databases. If it
> can't function when you have more than some X number of indices, where
> X is rather small, then its usefulness becomes suspect.

I don't agree with your last point. As I said earlier, with the data you gathered, you cannot tell whether the performance difference comes from indexing or not. Also, scalability means a lot more than just having high performance at every scaling point in the range. I would think it acceptable to spend 12 minutes instead of 9 adding 100K entries if that makes it possible to reduce the time for 4 million entries from 11 hours to just one.

- Jong-Hyuk
--On Thursday, July 28, 2005 2:56 PM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote: >> >> That is the question I'm interested in having an answer to. Its great >> if the patch holds up if you do 500 billion entry databases. If it >> can't function when you have more than some X number of indices, where >> X is rather small, then its usefulness becomes suspect. > > I don't agree with your last point. As I said earlier, with the data you > gathered, you cannot tell whether the performance difference comes from > indexing or not. Also, scalability means a lot more than just having high > performance for every scaling point in the range. I would think it > acceptable to spend 12 minutes instead of 9 in adding 100K entries if it > is able to reduce the time for 4 million entries from 11 hours to just > one. This isn't what I was saying. If it takes 15 minutes instead of 10 for 100K entries, and increases the time from 11 hours to 20 hours for 4 million entries if you use a lot of indices, then its usefulness becomes suspect. Which is why I'm curious if you have any data points using a large number of indices that indicates anything one way or the other. I can only assume from your responses that the answer is "no". --Quanah -- Quanah Gibson-Mount Product Engineer Symas Corporation Packaged, certified, and supported LDAP solutions powered by OpenLDAP: <http://www.symas.com>
> I would think it acceptable to spend 12 minutes instead of 9 in adding
> 100K entries if it is able to reduce the time for 4 million entries
> from 11 hours to just one.

I'm not saying that the BDB scalability patch has shown such behavior with small DITs, though. Without memory pressure, the linear allocator and the buddy allocator perform almost the same. Because of internal fragmentation, it is natural that the latter comes under memory pressure earlier than the former. After a small window, the former also comes under memory pressure and eventually degenerates because of external fragmentation.
> This isn't what I was saying. If it takes 15 minutes instead of 10
> for 100K entries, and increases the time from 11 hours to 20 hours for
> 4 million entries if you

This is usually not the case in many engineering problems, including scalability.

> use a lot of indices, then its usefulness becomes suspect. Which is
> why I'm curious if you have any data points using a large number of
> indices that indicates

Well, if indexing turns out to be a real suspect, it will have to be handled as a separate, orthogonal issue.

> anything one way or the other. I can only assume from your responses
> that the answer is "no".

I don't have the data points with heavy indexing as of now.

- Jong-Hyuk
--On Thursday, July 28, 2005 3:26 PM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Well, if indexing turns out to be a real suspect, it will have to be
> handled as a separate, orthogonal issue.

I would say that indexing is definitely a serious problem with your patch. I set up a 2 million entry LDIF file, and used a 1.68 GB cache. With only 3 indices, everything looked great:

Time w/o patch:
1609.570u 36.590s 29:30.58 92.9% 0+0k 0+0io 460966pf+0w

Time w/ patch:
749.850u 33.920s 16:03.15 81.3% 0+0k 0+0io 291633pf+0w

So a nearly 2:1 performance GAIN. However, if you give it just a medium number of indices (21), the performance is atrocious. It was so bad, I had to halt the slapadd.
Time w/o patch (to get to 35%):

[#############                           ] ( 34%) 01h15m 687344 obj
[#############                           ] ( 34%) 01h16m 687952 obj
[#############                           ] ( 34%) 01h16m 688256 obj
[#############                           ] ( 34%) 01h17m 688864 obj
[#############                           ] ( 34%) 01h17m 689168 obj
[#############                           ] ( 34%) 01h17m 689472 obj
[#############                           ] ( 34%) 01h17m 689776 obj
[#############                           ] ( 34%) 01h17m 690080 obj
[#############                           ] ( 34%) 01h17m 690384 obj
[#############                           ] ( 34%) 01h17m 690688 obj
[#############                           ] ( 34%) 01h17m 690992 obj
[#############                           ] ( 34%) 01h17m 691296 obj
[#############                           ] ( 34%) 01h18m 692208 obj
[#############                           ] ( 34%) 01h18m 694640 obj
[#############                           ] ( 34%) 01h18m 695856 obj
[#############                           ] ( 34%) 01h18m 696464 obj
[#############                           ] ( 34%) 01h18m 697072 obj
[#############                           ] ( 34%) 01h19m 697680 obj
[#############                           ] ( 34%) 01h19m 697984 obj
[#############                           ] ( 34%) 01h19m 698896 obj
[#############                           ] ( 34%) 01h19m 699200 obj
[#############                           ] ( 34%) 01h19m 700416 obj
[#############                           ] ( 34%) 01h19m 701024 obj
[##############                          ] ( 35%) 01h19m 703152 obj
[##############                          ] ( 35%) 01h19m 703760 obj
[##############                          ] ( 35%) 01h20m 704368 obj

Time w/ patch (to get to 35%):

[#############                           ] ( 34%) 20h08m 691296 obj
[#############                           ] ( 34%) 20h11m 691600 obj
[#############                           ] ( 34%) 20h28m 693424 obj
[#############                           ] ( 34%) 20h40m 694640 obj
[#############                           ] ( 34%) 20h48m 695552 obj
[#############                           ] ( 34%) 21h08m 697680 obj
[#############                           ] ( 34%) 21h42m 701328 obj
[##############                          ] ( 35%) 21h59m 703152 obj

As you can see here, the performance LOSS is nearly 22:1 (22 hours to get where the unpatched version of BDB gets in 1).

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
Here is a result with various degrees of indexing for 2 million entries:

ftp://ftp.openldap.org/incoming/ScalabilityWithIndexing.png

As shown in the graph, with 13 indices (6 equality; 2 equality & substring; 5 equality, substring & approximate), it took almost a day (just one hour short) without the Berkeley DB scalability patch, while it took only six hours with the patch. The time would have decreased further if indexing were performed separately by running slapindex after the initial slapadd. Although you tested with 21 indices, the number of substring/approximate indices is the same (seven) in both cases. The effect of the additional equality indices was found to be negligible in my performance evaluation.

As a result, I would say my results contradict yours. With my performance evaluation configuration, I could not find any tendency that might give me a hint as to why you experienced such a problem. One thing I noticed from your last result is that you did not run the experiment to completion. I think each run needs to be carried to completion to draw a meaningful result.

- Jong-Hyuk

------------------------
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com
(phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
changed notes
changed notes changed state Open to Feedback
--On Wednesday, August 10, 2005 11:55 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Here is a result with various degrees of indexing for 2 million entries:
>
> ftp://ftp.openldap.org/incoming/ScalabilityWithIndexing.png
> [...]
> As a result, I would say my results contradict yours. With my performance
> evaluation configuration, I could not find any tendency that might give
> me a hint as to why you experienced such a problem. One thing I noticed
> from your last result is that you did not run the experiment to
> completion. I think each run needs to be carried to completion to draw
> a meaningful result.

Hi Jong,

I've run the test to the finish for the unpatched version, and have the patched version running. The unpatched version will beat the patched version by several days, at the current rate things are going.

For the unpatched load, the results were:

cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 06d08h 2000002 obj
277927.430u 2905.190s 152:22:40.95 51.1% 0+0k 0+0io 684572pf+0w

Right now, the *patched* load is at:

cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[##########################              ] ( 66%) 06d13h 1331216 obj

and doing a few percent every day.
I think you are not doing a broad enough test to really understand the validity of your patch, and I suggest testing with more than 13 indices, and letting it run to completion.

My indices are:

# Indices to maintain
index default eq
index objectClass
index uid
index departmentNumber
index cn eq,sub,approx
index sn eq,sub,approx
index l
index ou
index telephonenumber eq,sub
index userpassword
index givenName eq,sub,approx
index mail
index carLicense
index employeeType
index homephone eq,sub
index mobile eq,sub
index pager eq,sub
index o
index roomNumber
index preferredLanguage
index postalCode
index st

and DB_CONFIG is:

# $Id: DB_CONFIG,v 1.1 2004/04/02 19:27:11 quanah Exp $
set_cachesize 1 713031680 0
set_lg_regionmax 262144
set_lg_bsize 2097152
set_lg_dir /var/log/bdb

Regards,
Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
Quanah,

Do you also have a result without the "-q" option?

- Jong-Hyuk

Quanah Gibson-Mount wrote:

> --On Wednesday, August 10, 2005 11:55 AM -0400 Jong-Hyuk
> <jongchoi@watson.ibm.com> wrote:
>
>> Here is a result with various degrees of indexing for 2 million
>> entries :
>>
>> ftp://ftp.openldap.org/incoming/ScalabilityWithIndexing.png
>>
>> As shown in the graph, for 13 indexing (6 equality, 2 equality &
>> substring, 5 equality, substring & approximate), it took almost a day
>> (only less one hour) without the Berkeley DB scalability patch while it
>> took only six hours with the patch. It would have been decreased further
>> if a separate indexing were performed by using slapindex after the
>> initial slapadd. Although you tested with 21 indexing, the number of
>> substring/approximate indexing are the same at seven in both cases. The
>> effect of additional equality indexing was found negligible through my
>> performance evaluation.
>>
>> As a result, I would say my results contradict yours. With my
>> performance evaluation configuration, I could not find any tendency
>> that might give me a hint why you had experienced such a problem. One
>> thing I noticed from your last result is that you did not run the
>> experiment to completion. I think it is required to run each run to
>> completion to draw a meaningful result.
>
> Hi Jong,
>
> I've run the test to the finish for the unpatched version, and have
> the patched version running. The unpatched version will beat the
> patched version by several days, at the current rate things are going.
>
> For the unpatched load, the results were:
>
> cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
> [########################################] (100%) 06d08h 2000002 obj
> 277927.430u 2905.190s 152:22:40.95 51.1% 0+0k 0+0io 684572pf+0w
>
> Right now, the *patched* load is at:
>
> cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
> [##########################              ] ( 66%) 06d13h 1331216 obj
>
> and doing a few percent every day.
>
> I think you are not doing a broad enough test to really understand the
> validity of your patch, and I suggest testing with more than 13
> indices, and letting it run to completion.
>
> My indices are:
>
> # Indices to maintain
> index default eq
> index objectClass
> index uid
> index departmentNumber
> index cn eq,sub,approx
> index sn eq,sub,approx
> index l
> index ou
> index telephonenumber eq,sub
> index userpassword
> index givenName eq,sub,approx
> index mail
> index carLicense
> index employeeType
> index homephone eq,sub
> index mobile eq,sub
> index pager eq,sub
> index o
> index roomNumber
> index preferredLanguage
> index postalCode
> index st
>
> and DB_CONFIG is:
>
> # $Id: DB_CONFIG,v 1.1 2004/04/02 19:27:11 quanah Exp $
> set_cachesize 1 713031680 0
> set_lg_regionmax 262144
> set_lg_bsize 2097152
> set_lg_dir /var/log/bdb
>
> Regards,
> Quanah
>
> --
> Quanah Gibson-Mount
> Product Engineer
> Symas Corporation
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
> <http://www.symas.com>
--On Sunday, August 28, 2005 1:30 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Quanah,
> Do you also have a result without "-q" option ?

Hi Jong,

I do not have a result without the -q option for this particular test. However, as I've noted previously, I see the same results you do for your patch whenever I keep the number of indices low, and that is with the -q option. I can't see any reason why -q would affect BDB's internal memory allocation, and since I'm only getting differences from your results by increasing the indexing, I seriously doubt -q is affecting it.

Given that the usefulness of -q has been well proven in a large number of environments (10k dbs, 50 million entry dbs, all with varying numbers of indices), I don't have much incentive to not use it, either. If I can continue to have access to the borrowed Linux box I'm using for these tests, I may set up a run without -q, although I really doubt the end result will change, and it may take several weeks to finish each run, because I'll be losing the known performance boost -q offers.

BTW, current progress with your patch:

root:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[###################################     ] ( 89%) 01w04d 1788432 obj

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
--On Sunday, August 28, 2005 12:54 PM -0700 Quanah Gibson-Mount <quanah@symas.com> wrote:

> BTW, current progress with your patch:
>
> root:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
> [###################################     ] ( 89%) 01w04d 1788432 obj

Final load time with the BDB scalability patch:

cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 02w05h 2000002 obj
6485.120u 5834.390s 341:15:40.01 1.0% 0+0k 0+0io 981489pf+0w

So more than twice as long as the 152:22:40.95 time it took to load the database without the patch.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
> I think you are not doing a broad enough test to really understand the
> validity of your patch, and I suggest testing with more than 13 indices,
> and letting it run to completion.

I tested further by adding six more equality indices to the 13-index case. The result reiterates my previous result. As I told you before, the effect of adding equality indices is minor. slapadd with six more equality indices took 415 minutes to complete, only 50 minutes more than the 365 minutes taken with 13 indices.

Here is a brief comparison of the index settings:

                          Quanah's    Jong-Hyuk's
--------------------------------------------------------------------
equality only                14           12
eq, substring                 4            2
eq, substring, approx         3            5
--------------------------------------------------------------------
Total                        21           19

I would say that the degree of indexing in the two setups is comparable. (Your setting has two more indices, but that can be compensated for by the two additional approximate indices in my setting.)

> My indices are:
>
> # Indices to maintain
> index default eq
> index objectClass
> index uid
> index departmentNumber
> index cn eq,sub,approx
> index sn eq,sub,approx
> index l
> index ou
> index telephonenumber eq,sub
> index userpassword
> index givenName eq,sub,approx
> index mail
> index carLicense
> index employeeType
> index homephone eq,sub
> index mobile eq,sub
> index pager eq,sub
> index o
> index roomNumber
> index preferredLanguage
> index postalCode
> index st
>
> and DB_CONFIG is:
>
> # $Id: DB_CONFIG,v 1.1 2004/04/02 19:27:11 quanah Exp $
> set_cachesize 1 713031680 0
> set_lg_regionmax 262144
> set_lg_bsize 2097152
> set_lg_dir /var/log/bdb
>
> Regards,
> Quanah
>
> --
> Quanah Gibson-Mount
> Product Engineer
> Symas Corporation
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
> <http://www.symas.com>
> Final load time with the BDB scalability patch:
>
> cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
> [########################################] (100%) 02w05h 2000002 obj
> 6485.120u 5834.390s 341:15:40.01 1.0% 0+0k 0+0io 981489pf+0w
>
> So more than twice as long as the 152:22:40.95 time it took to load the
> database without the patch.

I recommend that you confirm your system setup first. Assuming you are experimenting on a reasonably up-to-date platform, more than six days for adding 2 million entries seems too long even without the BDB scalability patch. Comparing your numbers with my results, your configuration was already 6.6x slower than mine before the BDB scalability patch was applied. A fair comparison can follow once the baseline performance is brought in sync.

- Jong-Hyuk

------------------------
Jong Hyuk Choi
IBM Thomas J. Watson Research Center - Enterprise Linux Group
P. O. Box 218, Yorktown Heights, NY 10598
email: jongchoi@us.ibm.com (phone) 914-945-3979 (fax) 914-945-4425 TL: 862-3979
--On Thursday, September 01, 2005 10:03 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> I recommend you to confirm your system setup first. Assuming that you are
> experimenting on a reasonably up-to-date platform, more than 6 days for
> adding 2 million entries seems too much even without the BDB scalability
> patch. When I compare your numbers with my results, your configuration is
> 6.6x slower than mine originally (before applying BDB scalability patch).
> Fair comparison could follow after first bringing the baseline
> performance in sync.

Please define "system setup". Operating system parameter changes? slapd.conf changes? I'm not really clear what you mean here... OpenLDAP and BDB are configured in a standard method, and there's nothing that jumps out at me as needing changing. So anything you want to suggest here I'll look at.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
--On Thursday, July 28, 2005 2:56 PM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

>> Hi Jong,
>>
>> I don't think you read my email very closely.
>>
>> 1) I noted I don't have systems with large amounts of RAM to test
>> large databases on.
>
> Although I'm testing the BDB scalability patch on a machine having 12GB
> RAM, the BDB cache size is set to only 1.6GB. It's a 32-bit platform.
> What is obvious between the lines of the performance results of the BDB
> scalability patch is that the patch makes it possible to add large
> databases in memory constrained systems which is typical in small to
> medium scale servers such as IBM BladeCenter and OpenPower. Hence, you
> should not consider not having a large amount of RAM as an inhibitor to
> large database experiments.

I have to say that you are very wrong here. You must consider not having a sufficient amount of overhead RAM as an inhibitor to large database environments. I've moved my testing of the 2 million entry database with a 1.6GB DB cache from a system with 2GB of memory to a system with 3GB of memory, and there is an extreme difference in performance.

So while your patch might make things work better in a limited memory environment (I have yet to see any evidence of this, but I continue to experiment), there are still requirements for some amount of overhead memory availability that must be met.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
--On Thursday, September 01, 2005 10:03 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

>> Final load time with the BDB scalability patch:
>>
>> cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
>> [########################################] (100%) 02w05h 2000002 obj
>> 6485.120u 5834.390s 341:15:40.01 1.0% 0+0k 0+0io 981489pf+0w
>>
>> So more than twice as long as the 152:22:40.95 time it took to load the
>> database without the patch.
>
> I recommend you to confirm your system setup first. Assuming that you are
> experimenting on a reasonably up-to-date platform, more than 6 days for
> adding 2 million entries seems too much even without the BDB scalability
> patch. When I compare your numbers with my results, your configuration is
> 6.6x slower than mine originally (before applying BDB scalability patch).
> Fair comparison could follow after first bringing the baseline
> performance in sync.

After moving to a new server and running the 13 indices case, I do see some improvement in load time from your patch.

With the patch:

/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 03d04h 2000002 obj
5123.990u 2264.730s 76:14:54.16 2.6% 0+0k 0+0io 423456pf+0w

Without the patch:

/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 03d15h 2000002 obj
205538.150u 1335.460s 87:48:16.93 65.4% 0+0k 0+0io 438073pf+0w

So a total savings of approximately 11 hours 30 minutes. I will go redo the 21 indices case.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
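[Editorial aside, not part of the original thread: the savings and CPU figures in the message above can be checked directly from the quoted time(1) output. A small sketch — the `parse_wall` helper is invented here for illustration:]

```python
# Derive wall-clock savings and CPU utilization from the time(1) lines
# quoted in the 13-indices comparison above.

def parse_wall(s):
    """Parse time(1)'s HH:MM:SS.ss wall-clock field into seconds."""
    h, m, sec = s.split(":")
    return int(h) * 3600 + int(m) * 60 + float(sec)

base_wall = parse_wall("87:48:16.93")     # unpatched, 13 indices
patched_wall = parse_wall("76:14:54.16")  # patched, 13 indices

saved_hours = (base_wall - patched_wall) / 3600   # roughly 11.5 hours
speedup = base_wall / patched_wall                # roughly 1.15x

# CPU utilization = (user + sys) / wall, as time(1) itself reports:
# the unpatched load spends ~65% of its wall time on CPU, while the
# patched load's CPU share is only ~2.6% -- it mostly waits on I/O.
base_cpu = (205538.150 + 1335.460) / base_wall
patched_cpu = (5123.990 + 2264.730) / patched_wall
```

[Whether the patched run's low CPU share reflects the memory-headroom effect argued later in the thread is a separate question; the arithmetic here only restates the quoted numbers.]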
--On Thursday, September 01, 2005 10:03 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

>> Final load time with the BDB scalability patch:
>>
>> cgi3-linux:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
>> [########################################] (100%) 02w05h 2000002 obj
>> 6485.120u 5834.390s 341:15:40.01 1.0% 0+0k 0+0io 981489pf+0w
>>
>> So more than twice as long as the 152:22:40.95 time it took to load the
>> database without the patch.
>
> I recommend you to confirm your system setup first. Assuming that you are
> experimenting on a reasonably up-to-date platform, more than 6 days for
> adding 2 million entries seems too much even without the BDB scalability
> patch. When I compare your numbers with my results, your configuration is
> 6.6x slower than mine originally (before applying BDB scalability patch).
> Fair comparison could follow after first bringing the baseline
> performance in sync.

After moving to a system with 3GB of memory instead of 2GB, I actually see some performance gain from your patch:

tribes:~> cat 13indices-time-base
sundial1:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 03d15h 2000002 obj
205538.150u 1335.460s 87:48:16.93 65.4% 0+0k 0+0io 438073pf+0w

tribes:~> cat 13indices-time-patched
sundial1:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 03d04h 2000002 obj
5123.990u 2264.730s 76:14:54.16 2.6% 0+0k 0+0io 423456pf+0w

Or a savings of 11.5 hours in the 13 indices case.

tribes:~> cat 21indices-time-base
sundial1:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 04d04h 2000002 obj
232486.230u 1511.880s 100:56:51.30 64.3% 0+0k 0+0io 437784pf+0w

tribes:~> cat 21indices-time-patched
sundial1:/tmp/ldif# time slapadd -q -l 2M-peons.ldif
[########################################] (100%) 03d17h 2000002 obj
5709.420u 2511.500s 89:19:34.61 2.5% 0+0k 0+0io 429203pf+0w

Or a savings of 12 hours in the 21 indices case. So in both cases, there is a moderate decrease in the amount of time it took to load the database.

Based on the differences in how your patch performs on the 2GB and 3GB systems, it appears that the more memory BDB has available to it *past* the DB_CACHE, the better it runs. If there is too little space there, the algorithm is substantially worse than the current BDB memory allocator; if there is enough space, it is faster. I'm betting that the reason it is so fast on your systems is that you have a very large amount of space between the DB_CACHE and the upper memory limit. So on systems with a very small memory pool available, this patch has the possibility of making things worse, rather than better.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
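[Editorial aside, not part of the original thread: the algorithmic contrast being debated is first-fit linear scan (the original __db_shalloc()) versus a buddy allocator. The toy sketch below illustrates the buddy scheme only; the class name, orders, and region size are invented and are not from the actual __db_smalloc() patch. A first-fit allocator scans one free list whose length grows with fragmentation, while the buddy scheme keeps one free list per power-of-two size class, so allocation touches at most MAX_ORDER - MIN_ORDER lists regardless of how many objects are live:]

```python
# Toy buddy allocator over a fixed region, managing offsets only.
MIN_ORDER, MAX_ORDER = 5, 12          # 32-byte min block, 4 KiB region

class BuddyAllocator:
    def __init__(self):
        # one free list per power-of-two order; region starts as one block
        self.free = {o: [] for o in range(MIN_ORDER, MAX_ORDER + 1)}
        self.free[MAX_ORDER].append(0)

    @staticmethod
    def _order(size):
        o = MIN_ORDER
        while (1 << o) < size:
            o += 1
        return o

    def alloc(self, size):
        want = self._order(size)
        o = want                       # smallest non-empty order >= want
        while o <= MAX_ORDER and not self.free[o]:
            o += 1
        if o > MAX_ORDER:
            return None                # out of memory
        off = self.free[o].pop()
        while o > want:                # split, pushing upper halves back
            o -= 1
            self.free[o].append(off + (1 << o))
        return off

    def dealloc(self, off, size):
        o = self._order(size)
        while o < MAX_ORDER:
            buddy = off ^ (1 << o)     # sibling block at this order
            if buddy not in self.free[o]:
                break
            self.free[o].remove(buddy) # coalesce with the free buddy
            off &= ~(1 << o)           # merged block starts at lower half
            o += 1
        self.free[o].append(off)
```

[Note that `dealloc` still does a linear `remove` on one size-class list in this toy version; real implementations track buddy state in side metadata (e.g. a bitmap) so coalescing is constant-time as well.]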
Moving this to ITS. Comments inline.

Quanah Gibson-Mount wrote:

> --On Tuesday, October 04, 2005 9:45 AM -0500 Eric Irrgang
> <erici@motown.cc.utexas.edu> wrote:
>
>> I've been following this thread with great interest but I was wondering
>> if you could provide more clear numbers on the sizes of the data you
>> are working with. I'm trying to figure out where I fit into these
>> scaling benchmarks. Could you provide some information on the actual
>> sizes of your db files at the large number-of-entries, many indexes,
>> db cache size, etc?
>>
>> For instance, my id2entry file is 6GB and my indexes are an additional
>> 3GB for 2 million entries and 47 index files maintaining about twice as
>> many attribute/search-type indexes. To get consistently good
>> performance I have to keep a db cache of at least 6 Gigabytes right now
>> with OL 2.2.26 and I would be very interested in improving write
>> performance or being able to use less cache, but I want to get a better
>> idea of what I should be expecting.
>
> Hi Eric,
>
> This ITS mainly has to do with loading data into back-bdb, as opposed
> to how it performs long term. In my tests, I used a 1.7GB DB Cache,
> indexing 21 attributes out of 28 attributes.

As I have shown in this message
(http://www.openldap.org/its/index.cgi/Development?id=3851;selectid=3851#followup6),
the patch has a significant impact on the normal search / update workload as well as on directory population.

> My /db dir shows:
>
> 58M  carLicense.bdb
> 648M cn.bdb
> 56M  departmentNumber.bdb
> 444M dn2id.bdb
> 7.7M employeeType.bdb
> 335M givenName.bdb
> 444M homePhone.bdb
> 3.5G id2entry.bdb
> 24M  l.bdb
> 57M  mail.bdb
> 445M mobile.bdb
> 664K o.bdb
> 2.6M objectClass.bdb
> 664K ou.bdb
> 444M pager.bdb
> 42M  postalCode.bdb
> 20M  preferredLanguage.bdb
> 27M  roomNumber.bdb
> 419M sn.bdb
> 20M  st.bdb
> 443M telephoneNumber.bdb
> 56M  uid.bdb
> 664K userPassword.bdb
>
> All of my testing indicates that your DB_CACHE while running (not
> slapadding) must be at least as big as your id2entry bdb file,
> regardless of this patch.

Again, numbers please.

> --Quanah
>
> --
> Quanah Gibson-Mount
> Product Engineer
> Symas Corporation
> Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
> <http://www.symas.com>
--On Wednesday, October 05, 2005 11:07 AM -0400 Jong-Hyuk <jongchoi@watson.ibm.com> wrote:

> Moving this to ITS. Comments inline.
>
>> This ITS mainly has to do with loading data into back-bdb, as opposed
>> to how it performs long term. In my tests, I used a 1.7GB DB Cache,
>> indexing 21 attributes out of 28 attributes.
>
> As I have shown its search / search-update performance in this message
> (http://www.openldap.org/its/index.cgi/Development?id=3851;selectid=3851#
> followup6), it has a significant impact on the normal search / update
> workload as well as the directory population.

I suggest reading followup #1, where I show 10 times worse performance with your patch. Again, I think the fact that your test systems have large amounts of RAM beyond what you set in the DB_CONFIG file is influencing your results, which are therefore not reflective of real-world, resource-starved systems.

>> My /db dir shows:
>>
>> 58M  carLicense.bdb
>> 648M cn.bdb
>> 56M  departmentNumber.bdb
>> 444M dn2id.bdb
>> 7.7M employeeType.bdb
>> 335M givenName.bdb
>> 444M homePhone.bdb
>> 3.5G id2entry.bdb
>> 24M  l.bdb
>> 57M  mail.bdb
>> 445M mobile.bdb
>> 664K o.bdb
>> 2.6M objectClass.bdb
>> 664K ou.bdb
>> 444M pager.bdb
>> 42M  postalCode.bdb
>> 20M  preferredLanguage.bdb
>> 27M  roomNumber.bdb
>> 419M sn.bdb
>> 20M  st.bdb
>> 443M telephoneNumber.bdb
>> 56M  uid.bdb
>> 664K userPassword.bdb
>>
>> All of my testing indicates that your DB_CACHE while running (not
>> slapadding) must be at least as big as your id2entry bdb file,
>> regardless of this patch.
>
> Again, numbers please.

I recently posted a discussion around this on the -devel list, and Eric's own response agrees with my observations.

--Quanah

--
Quanah Gibson-Mount
Product Engineer
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>
changed notes changed state Feedback to Closed
moved from Development to Archive.Development
Needs review.
Needs to be coordinated with upstream (Sleepycat).
Obsoleted; a new allocator ships in BDB 4.6.