
(ITS#6470) slapd dying under load



Full_Name: Bryan Maupin
Version: 2.4.21
OS: Red Hat Enterprise Linux 5.4
URL: http://omega.uta.edu/~bmaupin/bryan-maupin-100210.txt
Submission from: (NULL) (129.107.38.77)


About 2 months ago we upgraded from OpenLDAP 2.3 to OpenLDAP 2.4.19.  All seemed
well for about 4 weeks, and then slapd started dying on our replica servers.  The
crashes frequently coincide with a CPU load spike, but not always.  Available
memory doesn't seem to be an issue.  These problems didn't show up during
testing, and they don't seem to have affected the master, which has been running
without stopping for the last 2 months.  So the problems only seem to manifest
themselves under production load.  In our /var/log/messages we get messages like:

Jan 15 16:58:03  kernel: slapd[22663] general protection rip:2ac667af65cc rsp:4659d970 error:0
Jan 16 15:42:34  kernel: slapd[10272] general protection rip:2b71f6c6d9c4 rsp:45441980 error:0
Jan 16 17:20:01  kernel: slapd[7538] general protection rip:2aec51954ae0 rsp:449518d0 error:0
Jan 19 13:38:46  kernel: slapd[2821] general protection rip:2aeac3070ae0 rsp:4918f8d0 error:0
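
For anyone triaging messages like these, here is a minimal sketch (not part of the original report) that extracts the PID and faulting instruction pointer from each line and groups the crashes by the low 12 bits of rip. Shared libraries are mapped on page boundaries, so that page offset stays the same even when the load address differs between runs. Matching offsets may hint at the same code location, but they are not proof. The regular expression is an assumption based only on the four sample lines above, and the names `GP_RE`, `SAMPLE`, and `group_by_page_offset` are hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the four sample lines in this report (an assumption,
# not a documented kernel log format). \s+ also spans line breaks, so entries
# that were wrapped by a mail client still match.
GP_RE = re.compile(
    r"kernel:\s+(?P<prog>\S+)\[(?P<pid>\d+)\]\s+general protection\s+"
    r"rip:(?P<rip>[0-9a-f]+)\s+rsp:(?P<rsp>[0-9a-f]+)\s+error:(?P<err>\d+)"
)

# The log excerpt from this report, copied verbatim.
SAMPLE = """\
Jan 15 16:58:03  kernel: slapd[22663] general protection rip:2ac667af65cc rsp:4659d970 error:0
Jan 16 15:42:34  kernel: slapd[10272] general protection rip:2b71f6c6d9c4 rsp:45441980 error:0
Jan 16 17:20:01  kernel: slapd[7538] general protection rip:2aec51954ae0 rsp:449518d0 error:0
Jan 19 13:38:46  kernel: slapd[2821] general protection rip:2aeac3070ae0 rsp:4918f8d0 error:0
"""


def group_by_page_offset(text):
    """Map each rip page offset (rip & 0xFFF) to the PIDs that faulted there."""
    groups = defaultdict(list)
    for match in GP_RE.finditer(text):
        rip = int(match.group("rip"), 16)
        groups[rip & 0xFFF].append(int(match.group("pid")))
    return dict(groups)


if __name__ == "__main__":
    for offset, pids in sorted(group_by_page_offset(SAMPLE).items()):
        print(f"page offset 0x{offset:03x}: pids {pids}")
```

Run against the excerpt above, the last two crashes (PIDs 7538 and 2821) land in the same group. The backtrace remains the authoritative source for where slapd actually faulted.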

Quanah Gibson-Mount told me to upgrade OpenLDAP and tcmalloc, so I did, but it
didn't fix the problem.

We're running OpenLDAP 2.4.21 on RHEL 5.4, with Heimdal 1.2.1-3, OpenSSL 0.9.8k,
Cyrus-SASL 2.1.23, BDB 4.7.25 (with patches), and Google tcmalloc (minimal) 1.5.

Backtrace is available here:
http://omega.uta.edu/~bmaupin/bryan-maupin-100210.txt

Thanks.