
Re: Disaster recovery question wrt replication

To: "Steven G. Harms" <stharms@cisco.com>
Subject: Re: Disaster recovery question wrt replication
From: "matthew sporleder" <msporleder@gmail.com>
Date: Thu, 14 Dec 2006 11:15:02 -0500
Cc: openldap-software@openldap.org, Aaron Richton <richton@nbcs.rutgers.edu>
Content-disposition: inline
In-reply-to: <20061214051517.GB26709@cisco.com>
References: <2CCFC74C714C4340B3F699F017F3419702B50383@xmb-sjc-212.amer.cisco.com> <Pine.SOC.4.64.0612131925230.24779@toolbox.rutgers.edu> <20061214051517.GB26709@cisco.com>

On 12/14/06, Steven G. Harms <stharms@cisco.com> wrote:

Replies inline:

On Wed, Dec 13, 2006 at 07:36:22PM -0500, Aaron Richton wrote:
>Use of slurpd certainly isn't a modern architecture, especially with
>concern for rapid DR, for reasons including your question at hand. Look
>into syncrepl (make sure to upgrade to 2.3.30 first) and mirrormode.

Due to reasons beyond my control, I'm currently at OpenLDAP version slapd
2.2.13 (Aug 18 2005 22:22:34).  As such, syncrepl, based on your
versioning guideline below, isn't an option.

>I would argue that the way to deal with HA/DR with OpenLDAP is to install
>enough slaves that you feel the odds of complete failure are tolerable,
>....

I have built replicas in different data centers etc.  So I'm going for a
'multiple replica' theory of backup.  The DR need not be instant, nor does it
need to be perfectly synchronized.

In light of these constraints, would it be appropriate to use slurpd for
replication?

Assuming slurpd for replication, what are the answers to these questions:

1.  Can I assign multiple replicas?

2.  How can I clean out the replog back-queue to a pristine start?

3.  I suppose, more generally, I'm asking: How do I start replication all
over?  What files can / should I delete?  Which should I not under any
circumstance touch?

Should I delete the slurpd.replog.lock file and/or the slurpd.replog? Should I
just cat /dev/null into them?  How can I start over?  And further, how do I
rotate the replog as it starts to eat up more of the filesystem?

For the record I'm using a BDB back end.

While I am also planning a switch to syncrepl, I can only make so
many changes at a time, so I'm stuck in this boat with you for a
while.  I found at the NYC BSD CON that OpenLDAP HA is a very common
question, so here's a rough outline of how I told people
to use slurpd for HA (after saying how much easier syncrepl was, of
course).

Run at least two instances of slapd on your master server.  This
allows you to build new replicas without much downtime of existing
replicas or the master.  You can start backlogging a new replica from
a point-in-time slapcat of another downed replica.

Use clustering (like Veritas on a SAN; BDB doesn't like NFS, if I
remember correctly) to fail over the --entire-- database to
another physical server if the first one craps out.

If you can't use a cluster, or your SAN fails, or whatever, you can
also promote one of your replicas to be a master by moving over the
config files and, if needed, adding the IP from your previous master.
When you fail over the slurpd configs, remove the old master, since
you will have to rebuild it from scratch.  You can move the old
replication logs over and run them in one-shot mode on the new master
before you start it up.  That should help keep you in sync.  You can
also manually edit the replog to change the server IP, if needed.
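
A rough sketch of that last replog edit, in Python.  It assumes the
usual slurpd replog layout, where each change record begins with one or
more "replica: host[:port]" lines; the function name, the addresses, and
the sample record are made up for illustration, and IPv6 addresses are
not handled.  Work on a copy of the replog while slurpd is stopped.

```python
import re

# Matches a replog header line such as "replica: 10.0.0.5:389".
# Assumption: the host contains no ':' (so no IPv6 literals).
REPLICA_LINE = re.compile(r"^(replica:\s*)([^:\s]+)(:\d+)?(\s*)$")


def retarget_replog(text: str, old_host: str, new_host: str) -> str:
    """Point 'replica:' lines that name old_host at new_host instead.

    Every other line (time:, dn:, the LDIF change itself) is copied
    through unchanged, and any port on the replica line is kept.
    """
    out = []
    for line in text.splitlines(keepends=True):
        m = REPLICA_LINE.match(line)
        if m and m.group(2) == old_host:
            prefix, port, tail = m.group(1), m.group(3) or "", m.group(4)
            line = f"{prefix}{new_host}{port}{tail}"
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical single-record replog, for demonstration only.
    sample = (
        "replica: 10.0.0.5:389\n"
        "time: 1166112902\n"
        "dn: uid=a,dc=example,dc=com\n"
        "changetype: modify\n"
        "replace: mail\n"
        "mail: a@example.com\n"
        "-\n"
        "\n"
    )
    print(retarget_replog(sample, "10.0.0.5", "10.0.0.6"), end="")
```

Once the edited copy looks right, it can be replayed in one-shot mode
(something like "slurpd -r <edited replog> -o"; check slurpd(8) for your
version before relying on the flags).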

Now, rebuild your old master from scratch.  Shut down and slapcat a
replica, using well-timed slurpd config changes to keep everything
in sync, and then follow the above procedure to promote the rebuilt
server back to master.

References:
- Disaster recovery question wrt replication
  - From: "Steven Harms (stharms)" <stharms@cisco.com>
- Re: Disaster recovery question wrt replication
  - From: Aaron Richton <richton@nbcs.rutgers.edu>
- Re: Disaster recovery question wrt replication
  - From: "Steven G. Harms" <stharms@cisco.com>
