
RE: syncbackup



> >  * At least backup
> >
> >     This is a way to propagate an operation to several backup servers
> >     at the same time and wait for at least one successful result.
> >
> >        +------+
> >        |Master| <--- SUCCESS ------------+
> >        +------+                          |
> >           | |                            |
> >           | +------------------+         |
> >           |                    |         |
> >       propagation          propagation   |
> >         XXXXX                  |         |
> >        XhangupX               \ /        |
> >         XXXXX               +------+     |
> >        +------+             |Backup| +---+
> >        |Backup|             +------+
> >        +------+
> >
> >    So even if one backup server hangs up or is delayed for some reason,
> >    such as network trouble, the master server can return success to
> >    the client as soon as possible.
> 
> Of course you must queue the updates to the other servers. And if one backup
> responds for one update, but a different backup handles the next update, then
> you have lost consistency if there are still updates in the queue.

Yes. So each update is sent to all the backup servers within the
topology. The master maintains history information for all the backup
servers, recording how far each backup server has acknowledged the
updates.

Therefore, at least one backup server keeps the same directory as the
master server. The other servers will be synchronized later.
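The queueing and per-backup history described above can be sketched
roughly as follows. This is an illustrative Python sketch with
hypothetical names (SyncBackupMaster, propagate, ack), not the actual
slapd implementation: each backup has its own ordered queue so a hung
backup never blocks the others, and the master records per backup how
far the update stream has been acknowledged.

```python
import queue
import threading

class SyncBackupMaster:
    """Sketch of "at least one backup" propagation (hypothetical names).

    Each backup has its own outbound queue, so a slow or hung backup
    never delays the others. The master records, per backup, the
    highest update sequence number that backup has acknowledged.
    """

    def __init__(self, backup_ids):
        self.next_seq = 0
        # history: highest update sequence number acked by each backup
        self.acked = {b: -1 for b in backup_ids}
        # per-backup queue keeps updates strictly ordered per server
        self.queues = {b: queue.Queue() for b in backup_ids}

    def propagate(self, update):
        """Send one update to every backup; return after the first ack."""
        seq = self.next_seq
        self.next_seq += 1
        first_ack = threading.Event()
        for q in self.queues.values():
            q.put((seq, update, first_ack))
        # Success goes back to the client as soon as ANY backup acks;
        # the others catch up later from their queues.
        first_ack.wait()
        return seq

    def ack(self, backup_id, seq, first_ack):
        """Called when a backup confirms it applied update `seq`."""
        self.acked[backup_id] = max(self.acked[backup_id], seq)
        first_ack.set()

    def latency(self, backup_id):
        """How many updates this backup still has to apply."""
        return (self.next_seq - 1) - self.acked[backup_id]
```

Because only one acknowledgement is awaited, a hung backup simply
accumulates latency in its queue instead of stalling the client.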

> >  * Boot-time synchronization using syncrepl()
> >
> >     When a backup server takes over the master service, the backup
> >     server fetches the changes from the other servers by using a
> >     one-time syncrepl.
> >
> >        +------+
> >        |XXXXXX|
> >        +------+
> >                                  New Master
> >                   syncrepl  +------+
> >        +------+    +----->  |Backup|
> >        |Backup| ---+        +------+
> >        +------+
> >
> >     The new server should be up-to-date within the available servers.
> 
> I don't understand the above statements.

Because of the above strategy for sending the changes, the new master
may not be identical to the old master server. However, there is at
least one backup server somewhere in the topology that is identical to
the old master server (as long as that server is available at failover
time). So the new master looks for that server by using LDAP Sync
against all the servers in the topology.

The following diagram shows that backup1 is identical to the old
master server, but backup2 and backup3 are not. "latency = 0" on
backup1 means that it has the same directory as the old master server.
"latency = 5" on backup2 means that backup2 is 5 updates behind the
old master: 5 updates have not been sent to backup2 yet.

             Old Master
        +------+
        |XXXXXX|
        +------+
                                  New Master
      latency = 0  syncrepl   +-------+
        +-------+    +----->  |Backup3|
        |Backup1| ---+        +-------+
        +-------+        +-->    latency = 2
                         |
      latency = 5        |
        +-------+        |
        |Backup2| -------+
        +-------+     syncrepl


In order to bring the directory of the new master up to date, it must
find the latest directory in the topology. It might be better for
backup3 to search the contextCSN of backup1 and backup2 first. But
currently it runs syncrepl against all the servers, because the entry
which includes the contextCSN attribute can't be found by a search
filter, so I must specify the location of the subentry directly. I'm
not sure whether that is a good idea.
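If the contextCSN values could be read from each reachable server, the
comparison itself would be simple, since CSNs begin with an ordered
timestamp. A minimal sketch (hypothetical helper name, single-master
CSNs assumed so that plain string comparison is valid):

```python
def pick_latest_server(context_csns):
    """Given each reachable server's contextCSN, return the name of
    the most up-to-date one. contextCSN values start with a
    generalized timestamp, so under a single old master the
    lexicographically largest value marks the newest directory.
    (Illustrative helper only.)
    """
    return max(context_csns, key=context_csns.get)
```

The server this picks corresponds to "latency = 0" in the diagram
above; the new master would then need to syncrepl only from that one
server instead of from all of them.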

If the topology could recognize which server has latency = 0, backup1
could become the new master directly without using LDAP Sync. However,
that may require some negotiation protocol to decide which server
becomes the new master, and it may make failback difficult.

> >  These are the other extensions to construct syncbackup:
> >
> >       * startSYNC extended operation
> >
> >          This is an LDAPv3 extended operation to start syncbackup.
> > 	 It sends a syncCookie and syncid, and then the master server
> > 	 attaches it to the appropriate list and propagates
> > 	 LDAP operations to the backup server.
> >
> >
> >             +------+
> >             |Master| ---- propagation +
> >             +------+                  |
> >               / \                     |
> >                |                      |
> >       startsync operation             |
> >                |                      |     Backup server
> >      +----------------------------------------+
> >      |         |                  +------+    |
> >      |    +---------+             | Core |    |
> >      |    |startSYNC|             +------+    |
> >      |    +---------+                         |
> >      |                                        |
> >      +----------------------------------------+
> >
> >          If the backup server returns an error, the master sends the
> >          SYNC_ERROR response to the startSYNC operation and detaches
> >          the backup server.
> >
> >          If the master server exits, the master sends the SYNC_DONE
> > 	 response to the startSYNC operation.
> 
> I don't understand the purpose of this operation. A regular Sync request
> should suffice.

Sync request is interesting and cool, but it is part of the search
operation. The master can't check whether the backup server handled
the operation successfully or not if it uses Sync request.

> >       * NOOP extended operation
> >
> >          This is an LDAPv3 extended operation to check whether the
> > 	 server works or not. If the server gets the operation, it just
> > 	 returns LDAP_SUCCESS.
> 
> There is already a NOOP control, so I don't see the need for an exop.

Yes, I know. At first I tried to use the NOOP control or the search
protocol, but I was worried about the overhead. In addition, I wonder
whether the semantics of the search operation are appropriate for this
sort of health check.
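Whatever operation carries it, the health check itself can stay very
cheap. Purely as an illustration (this is a plain TCP probe in Python,
not the proposed NOOP exop and not an LDAP-level check):

```python
import socket

def server_alive(host, port=389, timeout=2.0):
    """Minimal liveness probe: can we open a TCP connection to the
    server at all? A real NOOP extended operation (or the NOOP
    control) would additionally confirm that the LDAP layer responds,
    but even this cheap check already detects hung-up servers.
    (Illustrative sketch only, not the proposed exop.)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

An exop-based check would replace the bare connect with sending the
NOOP request and waiting for LDAP_SUCCESS within the same timeout.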

> > Even though the implementation still has some bugs, I can use this
> > replication engine within my OpenLDAP environment. I have to think
> > twice about the following topics:
> >
> >     * considerations about rfc3389.
> >     * multi-master environment.
> 
> I think this approach can be turned into a form of multimaster, with some
> performance loss. If you enforce a strict ordering among all of the servers,
> so that updates received by any of them are applied to all of them in the
> same sequence, then you can ensure consistency. The only advantage over
> single-master is that you don't have to worry about how to promote slaves to
> masters in the event of any failures. Otherwise, the update response time is
> slower than the single-master case, and the overall bandwidth requirement is
> nearly the same.

Yes. I've never tried to use a multimaster environment, but the RFC
requires that the replication protocol support a multimaster
environment.
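The strict-ordering idea quoted above can be sketched as a single
global sequencer: updates accepted by any master are stamped with one
agreed sequence number, and every replica applies the log in that
order. This is only a toy sketch with hypothetical names; a real
multimaster deployment would need a distributed agreement protocol to
assign the sequence numbers, which is where the performance loss
mentioned above comes from.

```python
class TotalOrderSequencer:
    """Toy sketch of strict total ordering across masters.

    Updates accepted by ANY master are stamped with a global sequence
    number; every replica applies them in that single agreed order,
    which keeps all copies consistent. (Hypothetical; assigning these
    numbers in a real system requires distributed agreement.)
    """

    def __init__(self):
        self.seq = 0
        self.log = []  # the agreed global order of updates

    def submit(self, origin, update):
        self.seq += 1
        self.log.append((self.seq, origin, update))
        return self.seq

def apply_log(log):
    """Every replica replays the log sorted by sequence number, so
    all replicas converge to the same directory state regardless of
    the order in which log entries were delivered."""
    state = {}
    for _seq, _origin, (dn, attrs) in sorted(log):
        state[dn] = attrs
    return state
```

The point of the sketch is the consistency property: two replicas that
receive the log entries in different delivery orders still compute the
same final state, because both sort by the agreed sequence numbers.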

Thanks

-- 
Masato Taruishi <taru@valinux.co.jp>