
load problem in mirror mode



Hello list,

I've set up two servers in mirror mode, running OpenLDAP 2.4.10 from Debian unstable (2.4.10-3). These two servers are accessed only by a proxy OpenLDAP server, version 2.3.30-5 from Debian stable, using the back_ldap backend. With this setup the proxy uses one backend server at a time and falls back to the other node if the current backend becomes unavailable. So far so good; on to the problem.

From time to time, one of the nodes starts showing high CPU load. This happens at random: sometimes it's the current backend that gets the high load, sometimes it's the "stand by" server.

Strangely, the high load isn't like what I've seen in other setups, where slapd constantly consumes 99% of the CPU. Here top shows CPU usage that varies, which makes it look like a temporary peak, but it never goes away. A simple restart of the daemon makes things all right again.

Another strange thing is that if the load happens on the "stand by" server, queries are affected and start taking 3-4 seconds or more to return results, even though the load on the current backend is fine.

I do know that the problem isn't with the proxy server, since there's no obvious variation in its resource usage and the problem persists if I issue the queries directly to the backend servers.
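
A minimal sketch of how such direct queries could be timed against each node, for anyone wanting to compare numbers. It assumes the OpenLDAP ldapsearch client is installed and that anonymous reads of the suffix entry are allowed; the helper names (time_calls, summarize, ldapsearch_query, compare_nodes) and any URIs or suffix passed in are hypothetical placeholders, not part of the setup described here.

```python
import statistics
import subprocess
import time


def time_calls(run_query, repeats=20):
    """Call run_query() `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, maximum) of a non-empty list of durations."""
    return statistics.median(durations), max(durations)


def ldapsearch_query(uri, base):
    """Build a callable that runs one anonymous base-scope search with ldapsearch.

    Assumption: simple anonymous bind (-x) is permitted on the node.
    """
    cmd = ["ldapsearch", "-x", "-LLL", "-H", uri,
           "-b", base, "-s", "base", "(objectClass=*)"]

    def run():
        # Output is discarded; only the elapsed wall-clock time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)

    return run


def compare_nodes(uris, base, repeats=10):
    """Return {uri: (median, maximum)} for the given backend URIs."""
    return {
        uri: summarize(time_calls(ldapsearch_query(uri, base), repeats))
        for uri in uris
    }
```

Calling compare_nodes with the URIs of both mirror nodes and the database suffix gives a median and a worst-case latency per node, which makes it easier to tell whether the 3-4 second delays come from one node or from both.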

Both the mirror nodes and the proxy run on Xen virtual machines, and all the machines have their clocks in sync (through ntpd on dom0).

When things are fine, the usual load on the machines never exceeds 0.20.

The proxy server has 2 vcpus and 1GB of RAM, while the backend machines have 3 vcpus and 2GB of RAM. They aren't all on the same physical server, and so far there isn't any kind of resource overbooking (neither memory nor CPU) on the host systems they are running on.

The database occupies a bit less than 700MB on disk, and both backends have a defined cachesize of 1750MB, which is more than enough. I have also never seen the system use any swap so far.

In the logs, with the log level set to stats + sync, I cannot see any irregularity.

I've searched the ITS database but could not find any report that matches these criteria.

If I've neglected any information, please advise.

Any help would be much appreciated.

Regards,

Hugo Monteiro.

--
ci.fct.unl.pt:~# cat .signature

Hugo Monteiro
Email	 : hugo.monteiro@fct.unl.pt
Telefone : +351 212948300 Ext.15307

Centro de Informática
Faculdade de Ciências e Tecnologia da
		   Universidade Nova de Lisboa
Quinta da Torre   2829-516 Caparica   Portugal
Telefone: +351 212948596   Fax: +351 212948548
www.ci.fct.unl.pt	      apoio@fct.unl.pt

ci.fct.unl.pt:~# _