Re: No replication after power failure

Stelios Grigoriadis wrote:
On Wed, 2007-10-03 at 16:08 +0200, Pierangelo Masarati wrote:
Stelios Grigoriadis wrote:
I am not sure this would be considered a bug, but it is a problem for
us. If the master goes down, the replicas have no way of detecting it.
When the master comes back up, all replica servers have to be
restarted. Is there a way to avoid this?

Using the KEEPALIVE option (socket or TCP) is not really viable, since
the default timeout is 2 hours, which is too long.

Another option would be to have some kind of timeout in the epoll call
and check whether the master is responding, but isn't that timeout
already used for the runqueue?

Have you come across this? I was surprised to see that no one has had
any issues with it. Am I missing something?
This was recently discussed (ITS#5133), and the only alternative to SO_KEEPALIVE would be to have some background thread poll the producer on the syncrepl descriptors on a regular basis performing some no-op (like searching the rootDSE requesting 1.1). Aaron Richton noted that support for SO_KEEPALIVE was added in OpenLDAP 2.3.28.
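
For readers following the keepalive discussion: the two-hour figure quoted above is the usual system-wide default idle time, and on Linux it can be lowered per socket. The following is a minimal sketch under that assumption. The helper name enable_keepalive is hypothetical, the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are Linux-specific, and this is not how slapd itself configures its sockets.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of OpenLDAP): enable TCP keepalive on fd
 * and shorten its timers. TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are
 * Linux-specific socket options. Returns 0 on success, -1 on error. */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probes)
{
    int on = 1;

    assert(fd >= 0);

    /* Turn keepalive probing on for this socket. */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* Idle time before the first probe is sent. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0)
        return -1;
    /* Delay between successive probes. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0)
        return -1;
    /* Number of unanswered probes before the connection is dropped. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

With enable_keepalive(fd, 60, 10, 3), a peer that has silently disappeared would be noticed after roughly 60 + 3 * 10 = 90 seconds instead of two hours.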


Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Email:   pierangelo.masarati@sys-net.it

I have solved the problem by inserting a periodic check in the runqueue (called do_mastercheck). The interval is determined by a slapd.conf parameter (mastercheckint) in the syncrepl section. The parameter is optional; if it is not specified, the check is not inserted in the runqueue. I have tested the code and it seems to work.

The do_mastercheck function just does a dummy search against the master.
I'm supplying a patch (only syncrepl.c is affected) so you can hopefully
improve and incorporate the solution in the code.
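
The scheduling idea described above (an optional interval, and a check that runs only when the interval is set and has elapsed) can be illustrated roughly as follows. This is a standalone sketch and not the patch itself: every name in it (master_check, master_check_init, master_check_tick, probe_from_flag) is hypothetical, it does not use slapd's runqueue API, and the probe callback merely stands in for the dummy search against the master.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical types and names for illustration only; this is not the
 * patch from the thread and not slapd's runqueue API. */
typedef int (*probe_fn)(void *ctx);   /* returns nonzero if master answered */

struct master_check {
    int      interval;   /* seconds between checks; 0 means disabled */
    time_t   next_run;   /* earliest time the next check may run */
    probe_fn probe;      /* stand-in for the dummy search */
    void    *ctx;        /* opaque argument passed to probe */
    int      master_up;  /* result of the most recent check */
};

void master_check_init(struct master_check *mc, int interval, time_t now,
                       probe_fn probe, void *ctx)
{
    assert(mc != NULL);
    mc->interval  = interval;
    mc->next_run  = now + interval;
    mc->probe     = probe;
    mc->ctx       = ctx;
    mc->master_up = 1;   /* assume reachable until a check says otherwise */
}

/* Run the check if it is enabled and due. Returns 1 if the probe ran,
 * 0 if the check is disabled or not yet due. */
int master_check_tick(struct master_check *mc, time_t now)
{
    assert(mc != NULL);
    if (mc->interval <= 0 || mc->probe == NULL || now < mc->next_run)
        return 0;
    mc->master_up = mc->probe(mc->ctx) ? 1 : 0;
    mc->next_run  = now + mc->interval;
    return 1;
}

/* Trivial probe for demonstration: reads a flag instead of contacting
 * a server. */
int probe_from_flag(void *ctx)
{
    return *(int *)ctx;
}
```

A caller would invoke master_check_tick from its main loop and, when master_up drops to 0, tear down and re-establish the replication connection rather than restarting the replica.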


Thanks. But please could you read the following:


Kind Regards,

Gavin Henry.
OpenLDAP Engineering Team.

E ghenry@OpenLDAP.org

Community developed LDAP software.