
Re: bdb bad performance again (really need help)



Hi Matt,

Thanx a lot for replying. We increased the cache size up to a value big enough to keep all data in memory. This resulted in much better performance. However, I still have an odd feeling about it, as I don't really understand why performance decreased so much (several orders of magnitude) when the cache size was small. This is due to my miserable knowledge of Berkeley DB. And why did performance decrease in an irreversible manner? Why doesn't even a restart of the LDAP server bring performance back to its proper value? Do you know what exactly happens to the bdb backend when the cache size is too small? You see, I feel uncomfortable if someone tells me "put the contents of your backend in memory and everything will be fine", because

1) I don't understand why I should do it ... Look, there must be an error somewhere if performance degrades so much so suddenly and does not recover. Why does db_recover "recover" performance?
2) I cannot increase the cache size as fast as the amount of data in our LDAP server grows ...


Any ideas?

Thanx a lot and best regards, Vadim Tarassov.

Matt Richard wrote:

Hi,

I was also getting very bad performance, after at least a day of great performance. The culprit in my case was the amount of cache memory reserved for the bdb database. When I ran db_stat -m, I compared "Requested pages found in the cache" against "Requested pages not found in the cache" and found that most pages were not being found in the cache, so bdb was constantly reading from disk instead of memory.
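In case it is useful, a check along these lines was enough to see the hit rate (the environment path and the grep pattern here are only examples; point -h at your own slapd database directory):

   # Print the cache hit/miss counters for the environment in /var/lib/ldap
   # (the path is just an example -- use your own bdb database directory).
   db_stat -m -h /var/lib/ldap | egrep 'Requested pages (not )?found in the cache'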

Here's what I did to fix it: I pushed the cache size up high enough that my database now resides completely in memory, so once the whole database is cached, bdb never needs to read from disk. Here's a partial output from db_stat on this server:

   62MB 513KB 752B Total cache size.
   1       Number of caches.
   62MB 520KB      Pool individual cache size.
   0       Requested pages mapped into the process' address space.
   114M    Requested pages found in the cache (100%).
   1704    Requested pages not found in the cache.

As you can see, everything's been cached, and performance has been fantastic ever since. You may need to adjust the memory settings for your environment. This is for our database which supports about 3000 users, and the slapcat file created from the database is almost 7 megabytes. Also, this server has over a gigabyte of physical memory, so I can afford to load the whole database into memory.
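If you need to work out a number for your own data, a rough sketch (the directory is only an example, and the headroom is my own habit rather than a rule) is to look at how big the database files actually are and then make the cache in DB_CONFIG comfortably larger than that:

   # Total size of the bdb backend files (example path -- adjust to your setup).
   du -ch /var/lib/ldap/*.bdb

   # Then, in DB_CONFIG, set the cache somewhat larger than that total,
   # e.g. roughly 100 MB in a single cache:
   # set_cachesize   0       104857600       1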

Here's my DB_CONFIG if you are interested. I picked these values after reading the following page: http://sapiens.wustl.edu/~sysmain/info/openldap/openldap_configure_bdb.html .

   # Set the database in memory cache size.
   #
   set_cachesize   0       52428800        0

   #
   # Set database flags.
   #
   set_flags       DB_TXN_NOSYNC

   #
   # Set log values.
   #
   set_lg_regionmax        1048576
   set_lg_max              10485760
   set_lg_bsize            2097152

   #
   # Set temporary file creation directory.
   #
   set_tmp_dir             /tmp

Once you make these changes, you will want to export the database, delete it, and import it again. Or maybe a db_recover would do the job too, I don't know.
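As a rough sketch of the export/re-import route (the directory, the file names, and the exact set of files to remove are only examples, and slapd should be stopped first):

   # Dump the database to LDIF while slapd is stopped.
   slapcat -l backup.ldif

   # Remove the old database files, environment files and transaction logs
   # (example path; this is destructive, so keep the LDIF somewhere safe).
   rm /var/lib/ldap/*.bdb /var/lib/ldap/log.* /var/lib/ldap/__db.*

   # Re-import with the new DB_CONFIG already in place, then restart slapd.
   slapadd -l backup.ldif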

I hope you can find this helpful!

-Matt

---
Matt Richard
Access and Security Coordinator
Franklin & Marshall College
matt.richard@fandm.edu


Hello everybody,

It is essentially the same problem previously posted by other users (see for example

http://www.openldap.org/lists/openldap-software/200309/msg00389.html).

We could not find a solution to this problem in the list, and hence decided to inform the list about this problem again.

We observe very bad performance of the LDAP server (2.1.8, 2.1.22, 2.1.23) under load test conditions. Performance degrades very quickly as the number of concurrent LDAP clients increases (just 8 of them kill the LDAP server), and it remains very bad even after the load is removed. Restarting the LDAP server does not bring anything positive. However, using db_recover we can restore performance (similar to what has been reported by Peter Johnson).
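(For completeness, the recovery itself is nothing special; with the database directory given as an example path, it is just a plain run like the following:)

   # slapd is stopped first, then recovery is run against its environment
   # (the -h path is an example -- use your own bdb database directory).
   db_recover -v -h /var/lib/ldap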

Howard Chu suggested:

1) Read the Berkeley DB tuning documents (tbd)

2) Take the results of the db_stat -c and db_stat -m commands and compare them.

We did that, and we are ready to do it again if anyone on the list is ready to help us solve this problem. However, we have not observed anything particularly odd (from our dumb user's point of view) when looking at the db_stat output before and after the load test. db_stat did not show anything after db_recover had been applied.
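If it helps anyone willing to look at the numbers, the kind of before/after capture we can provide looks roughly like this (the environment path is an example):

   # Snapshot lock and cache statistics before the load test ...
   db_stat -c -h /var/lib/ldap > dbstat-c.before
   db_stat -m -h /var/lib/ldap > dbstat-m.before

   # ... then repeat the same snapshots after the test and compare.
   db_stat -c -h /var/lib/ldap > dbstat-c.after
   db_stat -m -h /var/lib/ldap > dbstat-m.after
   diff dbstat-c.before dbstat-c.after
   diff dbstat-m.before dbstat-m.after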

The suggestion to modify the DB_CONFIG file produced the following feedback from Peter:
=======================================================================
I set up the DB_CONFIG file and it helps. The fast query scripts still start
to slow down, but not as soon as before. Also, the scripts with the one-second
delays ran without problems for 14,000 queries, where before they started
having problems at about 7,000 queries.


There must still be something wrong with the software since I still have to
use db_recover to get response time back up to normal after a slowdown.
========================================================================


We have not tried to make the corresponding changes in the file ourselves, as we trust Mr. Johnson's observations and his conclusion.

Is there anybody who can suggest something that would help us figure out what went wrong?

Thanx a lot in advance, Vadim Tarassov.