RE: OpenLDAP high CPU usage when performing mass changes
So, in decreasing order of preference (which is also decreasing
order of accuracy and ease of setup):
If your kernel is recent enough and supports perf events, you
could try using perf to find out accurately where the CPU time is spent.
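The exact perf command from this message is not preserved in the archive. As a rough sketch only, assuming perf is installed and you are allowed to attach to the slapd process, something along these lines could be used. The helper name profile_slapd and its defaults are made up for illustration; the perf options themselves (record -g -p -o, report -i --stdio) are standard ones.

```shell
# Sketch: sample a running slapd with perf for a fixed duration, then
# print the top of the report. profile_slapd is a hypothetical helper name.
profile_slapd() {
    pid="$1"               # PID of the slapd process to profile
    seconds="${2:-30}"     # how long to sample (default: 30 seconds)
    out="${3:-perf.data}"  # file where perf stores the samples
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found; try oprofile or stack sampling instead" >&2
        return 1
    fi
    # -g records call graphs, so the report also shows the callers
    # of the hot functions. "sleep" only bounds the sampling window.
    perf record -g -p "$pid" -o "$out" -- sleep "$seconds" &&
        perf report -i "$out" --stdio | head -n 50
}

# Example (usually needs root or relaxed perf permissions):
#   profile_slapd "$(pidof slapd)" 30
```

Symbol names in the report are only useful if slapd and the BDB libraries are not stripped, or if their debug symbols are installed.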
If it does not, you could try oprofile, even though that's
more complex to set up than perf.
When you don't have those tools ready to use, you can fall back on
the poor man's profiling tool, i.e. sample the backtrace of slapd in a loop.
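The original loop is not preserved in the archived message. A minimal sketch of the idea, assuming pstack is available, could look like the following. The function names sample_stacks and top_frames are hypothetical, and the frame parsing assumes the usual "#N  0xADDR in func (...)" or "#N  func (...)" lines that pstack and gdb print.

```shell
# sample_stacks (hypothetical name): run a stack-dump command repeatedly
# against one PID and print every sample to stdout.
sample_stacks() {
    pid="$1"               # PID of slapd
    count="${2:-20}"       # number of samples to take
    delay="${3:-1}"        # seconds to wait between samples
    dumper="${4:-pstack}"  # stack-dump command (pstack by default)
    i=0
    while [ "$i" -lt "$count" ]; do
        "$dumper" "$pid" || return 1
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then sleep "$delay"; fi
    done
}

# top_frames (hypothetical name): count how often each function name
# appears in the frame lines read from stdin, most frequent first.
top_frames() {
    awk '/^#[0-9]+/ {
             if ($2 ~ /^0x/ && $3 == "in") print $4   # "#0  0x... in func ()"
             else print $2                             # "#1  func (args) at f.c:1"
         }' | sort | uniq -c | sort -rn | head -n "${1:-20}"
}

# Example:
#   sample_stacks "$(pidof slapd)" 20 1 | top_frames
```

Functions that show up in most samples are where the threads spend their time. The counts include frames of idle threads that are merely waiting, so a high count for a wait or sleep function is expected and not a problem by itself.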
If you do not have pstack, you can achieve the same in a
more heavyweight manner with gdb.
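The gdb invocation from the original message is also missing from the archive. As an illustration only, one common way to get a backtrace of every thread from gdb in batch mode is shown here; dump_threads_gdb is a hypothetical helper name, while -p, -batch and -ex are standard gdb options.

```shell
# dump_threads_gdb (hypothetical name): attach gdb to a PID, print a
# backtrace of all threads, and detach again.
dump_threads_gdb() {
    pid="$1"   # PID of slapd
    # -batch makes gdb exit after running the -ex commands. The process
    # is stopped only while gdb is attached and resumes when it detaches.
    gdb -p "$pid" -batch -ex "thread apply all bt" 2>/dev/null
}

# Example: take one sample, and repeat a few times to see which
# frames keep showing up.
#   dump_threads_gdb "$(pidof slapd)"
```

Attaching pauses slapd briefly on every sample, so on a loaded production server it is safer to take a handful of samples than to loop tightly.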
If you do not have gdb, you may try
If you don't have those, you should ask for some Linux
sysadmin help ;-)
We are using OpenLDAP 2.4.26 with BDB 4.8 and have replication set up
in mirror mode for our main LDAP database. There are a couple of other replicas
that have a subset of the data that the main cluster has, but we are seeing the
following behavior on all of them.
When performing mass updates via LDAP, let's say on the order of 30,000
entries being added to existing entries, we've noticed that the CPU use of the
slapd instances goes through the roof (between 65% and 95% continuously)
and seems to stay there until slapd is restarted.
The problem is that this system has to be highly available, even for
writing, and when these updates "shock" the system, responsiveness degrades
badly while the processes are churning like that. I don't think they are trying
to catch up with the data changes, because if I let them run for a while after
the updates are done (about 1 hour) and then restart the instances, they go
back to their normal state.
So far the only way I've been able to mitigate the issue is to repoint
our LDAP proxy instances to a machine that is having less trouble, restart the
instances that are chugging along, then repoint the proxies back to the one just
restarted, and restart the others. Not exactly a quick operation.
I've played with the cache settings for both OpenLDAP and BDB and have
reduced the frequency of this issue, but I can't seem to get rid of
it completely, and it shows up quite often after large data
manipulations. I'm at a loss as to how to debug this since nothing is crashing.
Any suggestions on how to find out what's causing this would be very helpful.
The logs are not showing any warnings or messages that seem out of
the ordinary, and I have played with the log settings, but nothing seems to
explain why we are seeing CPU usage go so high.
Thanks in advance
I fly because it releases my mind from the tyranny of
petty things . . .
-- Antoine de Saint-Exupéry
ITS Application Administrator (IDM)