
Re: High load times with mdb





--On Tuesday, June 25, 2013 03:10:17 PM -0700 Howard Chu <hyc@symas.com> wrote:

Bill MacAllister wrote:


--On Tuesday, June 25, 2013 12:58:54 PM -0700 Quanah Gibson-Mount <quanah@zimbra.com> wrote:

--On Tuesday, June 25, 2013 12:38 PM -0700 Bill MacAllister <whm@stanford.edu> wrote:

The load starts out at a rate of about 2 M/s.  In the past I remember
that dropping to something like 900 k/s and staying there.  Now the
load starts in the same place, but after 30 seconds it alternates
between stalling outright and a rate under 100 k/s.  It dips as low as
under 10 k/s and sometimes goes as high as 700 k/s.  (My undergraduate
degree was in watching water boil.)

What is the partition type? ext4?

What options are set for the partition in fstab?

This is what I am currently using.  The UUIDs are obviously shortened
for readability.

UUID=blah1 /       ext4    defaults,acl,noatime,errors=remount-ro  0  1
UUID=blah2 /var/cache/openafs  ext4  defaults,noatime        0       2
UUID=blah3 /var/lib/ldap       ext4  defaults,noatime        0       2
UUID=blah4 none                swap  sw                      0       0

I also tried ext3 with the same results.  This is on a raid-1.  I have
also tried splitting the two disks and putting the OS on one and the
LDAP database on the other.  None of this moved the problem.

It really has the feel of resource exhaustion.  The load is now
stalled in that the progress display is not updating.  top does not
show slapd as doing anything.

Probably bad default FS settings that changed from your previous OS revision.

Also, you should watch vmstat while it runs to get a better idea of
how much time the system is spending in I/O wait.

I have just re-mkfs'ed the new, slow system to make it look like the
old, fast system.  Just to make sure nothing else changed I have
started a load on the older system.  Things look fine.

Now, comparing vmstat output, the new system is clearly badness incarnate.

Fast
====
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b swpd   free    buff  cache   si   so    bi    bo   in   cs us sy id  wa
0  0    0 6358656 303468 9301620   0    0     0     0   45   45  0  0 100  0
0  0    0 6358780 303468 9301620   0    0     0     0   47   41  0  0 100  0
0  0    0 6358780 303468 9301620   0    0     0     0   47   41  0  0 100  0
0  0    0 6358780 303468 9301620   0    0     0     0   93   43  0  1 99   0
0  0    0 6358532 303468 9301620   0    0     0     0  141   71  0  1 99   0
1  0    0 6358488 303468 9301620   0    0     0    14  116   48  0  1 99   0

Slow
====
procs  -----------memory----------  ---swap-- -----io---- -system-- ----cpu----
r  b  swpd    free    buff  cache    si   so    bi    bo   in   cs us sy id wa
1  4    0  13318088  36128 2759600    0    0     0  2134  379   83  0  0 88 12
0  4    0  13318308  36128 2759600    0    0     0  1044  277   70  0  0 88 12
0  4    0  13318508  36132 2759600    0    0     0   765  267   69  0  0 88 12
0  2    0  13318240  36152 2759604    0    0     0   818  593  104  0  0 88 12
0  2    0  13318332  36168 2759604    0    0     0  2611 1489  138  0  0 89 11

Lots of waiting, lots of blocking.  What's the deal with all that
free memory on the slow system?
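
For anyone who wants to compare runs like these with numbers rather than
by eye, here is a minimal Python sketch.  It assumes the standard vmstat
column layout shown above; the function name summarize_vmstat and the
SLOW constant are hypothetical, made up for illustration.  It averages
the blocked-process (b), blocks-out (bo), and I/O-wait (wa) columns over
the sample rows.

```python
def summarize_vmstat(text):
    """Average the b, bo, and wa columns of captured vmstat output."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name row starts with "r"; the dashed banner row does not.
        if fields[0] == "r":
            header = fields
            continue
        # Keep only all-numeric rows that match the header width.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    if not rows:
        raise ValueError("no vmstat sample rows found")
    return {col: sum(r[col] for r in rows) / len(rows) for col in ("b", "bo", "wa")}


# Sample rows copied from the "Slow" capture above.
SLOW = """\
procs  -----------memory----------  ---swap-- -----io---- -system-- ----cpu----
r  b  swpd    free    buff  cache    si   so    bi    bo   in   cs us sy id wa
1  4    0  13318088  36128 2759600    0    0     0  2134  379   83  0  0 88 12
0  4    0  13318308  36128 2759600    0    0     0  1044  277   70  0  0 88 12
0  4    0  13318508  36132 2759600    0    0     0   765  267   69  0  0 88 12
0  2    0  13318240  36152 2759604    0    0     0   818  593  104  0  0 88 12
0  2    0  13318332  36168 2759604    0    0     0  2611 1489  138  0  0 89 11
"""

if __name__ == "__main__":
    print(summarize_vmstat(SLOW))
```

On the slow samples above this averages out to 3.2 blocked processes,
about 1474 blocks out per sample, and 11.8% I/O wait.  The fast system
shows zero for both blocked processes and I/O wait.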

I will iterate on mkfs for a bit, but I thought I would send this off
in case something jumps out.

Bill

--

Bill MacAllister
Infrastructure Delivery Group, Stanford University