

Dear OpenLDAP guys.

First of all, I'd like to apologize for my English. I would like your
opinions and/or suggestions about my recent work on OpenLDAP, which we
call syncbackup: it provides high availability of the master server
while keeping the database consistent (an experimental implementation
is almost done).

In order to achieve this kind of HA, syncbackup propagates modifying
LDAP operations, such as ldap_modify, ldap_add, ldap_delete and
ldap_rename, to other LDAP servers before committing the changes
locally.

                          Master                Backup
 Client -- modify(0) -->    |                     |
   |                        | --- modify(0) --->  |
   |                        |                     |
   |                        | <--- SUCCESS -----  |
   |                        |
   | <---- SUCCESS  -----   |

Only if the master slapd server gets success results from the backup
servers, as well as from its local commit, does it return success to
the client. So, if the master returns success, at least two servers
hold the latest changes.

If the master server goes down, a backup server can take over as the
new master, because all changes accepted by the original master have
already been propagated to it.
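The commit rule above can be sketched roughly as follows. This is a
minimal illustration, not the actual slapd code; `handle_modify`,
`backup.apply()` and `db.commit()` are hypothetical names.

```python
def handle_modify(op, backups, db):
    """Propagate a modifying operation to every backup before the
    local commit; return success to the client only when all backups
    and the local database hold the change."""
    # modify(0) is sent to each backup first
    acks = [backup.apply(op) for backup in backups]
    if not all(acks):
        return "ERROR"        # a backup failed: don't commit locally
    if not db.commit(op):     # commit the change on the master
        return "ERROR"
    return "SUCCESS"          # at least two servers now have the data
```

With this rule, a positive answer to the client implies the change
survives the loss of the master.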

This is a simple example and has many problems.

First, in this scheme, if the backup server goes down, the master
server can't get a response from it. So if either the master or the
backup goes down, the client can't get a result.

Second, when the original master server is restarted, how does it
catch up with the latest directory contents?

To solve these problems, the following ideas were born:

      *) At least backup
      *) Boot-time synchronization using syncrepl()

 * At least backup

    This is a way to propagate an operation to several backup servers
    at the same time and wait for at least one success result.

       |Master| <--- SUCCESS ------------+
       +------+                          |
          | |                            |
          | +------------------+         |
          |                    |         |
      propagation          propagation   |
        XXXXX                  |         |
       XhangupX               \ /        |
        XXXXX               +------+     |
       +------+             |Backup| +---+
       |Backup|             +------+

   So even if one backup server hangs or is delayed for some reason,
   such as network trouble, the master server can return success to
   the client as soon as possible.
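The "wait for at least one success" idea can be sketched with standard
thread pools. This is an illustration of the idea only, not the slapd
implementation; `backup.apply()` is a hypothetical call.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def propagate_at_least_one(op, backups, timeout=5.0):
    """Send the operation to all backups in parallel and return True as
    soon as one backup reports success, so a hung or slow backup does
    not delay the answer to the client."""
    pool = ThreadPoolExecutor(max_workers=max(len(backups), 1))
    pending = {pool.submit(b.apply, op) for b in backups}
    ok = False
    while pending and not ok:
        done, pending = wait(pending, timeout=timeout,
                             return_when=FIRST_COMPLETED)
        if not done:
            break                       # every remaining backup timed out
        ok = any(f.result() for f in done)
    pool.shutdown(wait=False)           # don't block on hung backups
    return ok
```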

 * Boot-time synchronization using syncrepl()

    When a backup server takes over the master service, it fetches the
    outstanding changes from the other servers by using a one-time
    syncrepl() session.
                                 New Master
                  syncrepl  +------+     
       +------+    +----->  |Backup| 
       |Backup| ---+        +------+

    The new master should then be the most up-to-date among the
    available servers.
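For this catch-up step, the new master could run a one-shot
refreshOnly syncrepl consumer against a surviving server. A rough
slapd.conf sketch follows; the provider URL, searchbase and
credentials are placeholders, and the exact directives depend on the
slapd version:

```
# pull outstanding changes once from a surviving backup
syncrepl rid=001
         provider=ldap://backup.example.com
         type=refreshOnly
         searchbase="o=University of Michigan,c=US"
         scope=sub
         bindmethod=simple
         binddn="cn=replica,o=University of Michigan,c=US"
         credentials=secret
```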

 These are the other extensions used to construct syncbackup:

      * startSYNC extended operation

         This is an LDAPv3 extended operation to start syncbackup.
         The backup sends a syncCookie and a syncid, and the master
         server then attaches the backup to the appropriate list and
         propagates LDAP operations to it.

            |Master| ---- propagation +
            +------+                  |
              / \                     |
               |                      |
      startsync operation             |
               |                      |     Backup server
     |         |                  +------+    |
     |    +---------+             | Core |    |
     |    |startSYNC|             +------+    |
     |    +---------+                         |
     |                                        |

         If the backup server returns an error, the master sends the
         SYNC_ERROR response to the startSYNC operation and detaches
         that backup server.

         If the master server exits, it sends the SYNC_DONE response
         to the startSYNC operation.
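The attach/detach life cycle described above can be modelled like
this; `SyncMaster` and its methods are illustrative names only, not
the actual slapd symbols.

```python
SYNC_ERROR, SYNC_DONE = "SYNC_ERROR", "SYNC_DONE"

class SyncMaster:
    """Toy model of startSYNC: a backup attaches with its syncCookie
    and syncid, receives propagated operations, and is detached with
    SYNC_ERROR on failure or SYNC_DONE when the master exits."""

    def __init__(self):
        self.backups = {}                     # syncid -> backup

    def start_sync(self, syncid, cookie, backup):
        backup.cookie = cookie                # where the backup is
        self.backups[syncid] = backup         # attach to the list

    def propagate(self, op):
        responses = {}
        for syncid, backup in list(self.backups.items()):
            if not backup.apply(op):
                del self.backups[syncid]      # detach the failed backup
                responses[syncid] = SYNC_ERROR
        return responses

    def shutdown(self):
        self.backups.clear()                  # master exits
        return SYNC_DONE
```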

      * NOOP extended operation

         This is an LDAPv3 extended operation to check whether the
         server is working or not. If the server receives the
         operation, it just returns a success result.

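A liveness probe built on such a NOOP operation might look like the
following sketch; `server.noop()` stands in for sending the actual
extended operation.

```python
import time

def alive(server, retries=3, interval=1.0):
    """Consider the server alive as soon as one NOOP probe succeeds,
    and dead after `retries` consecutive failures."""
    for _ in range(retries):
        if server.noop():          # hypothetical: send the NOOP exop
            return True
        time.sleep(interval)       # wait before probing again
    return False
```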
      * contextCSN LDAP control

         This is an LDAPv3 control. When the master propagates an
         operation, it attaches the appropriate contextCSN value to
         the operation. If the backup server receives the contextCSN
         control, it uses that value rather than generating one with
         lutil_csnstr().
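The point of the control is that every server stores the same CSN for
the same change. A rough sketch follows; the CSN layout below only
approximates what lutil_csnstr() produces, and `choose_csn` is a
hypothetical helper.

```python
import time

def local_csnstr(op_count=0, server_id=0):
    """Build a change sequence number in roughly the old slapd style:
    <UTC time>#<op count>#<server id>#<mod count> (approximation)."""
    t = time.strftime("%Y%m%d%H%M%SZ", time.gmtime())
    return "%s#%06x#%02x#%06x" % (t, op_count, server_id, 0)

def choose_csn(control_value, op_count=0):
    """On the backup: prefer the master's contextCSN so entryCSN values
    match on every server; otherwise fall back to a local CSN."""
    return control_value if control_value else local_csnstr(op_count)
```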

 * Execution example from log message

    These are from the log of my experimental implementation:

    The following log shows that the operation is propagated to the two
    backup servers "hoge" and "hige", and that both servers become
    up-to-date:
       conn=6 op=1 ADD dn="o=University of Michigan,c=US"
       conn=6 op=1 BACKUP sync=1 backup=hige
       conn=6 op=1 BACKUP sync=1 backup=hoge
       conn=6 op=1 RESULT tag=105 err=0 text=

    The next log shows that operation conn=8, op=142 is propagated to
    backup "hoge" first, at (1), and that a success is returned to the
    client at (2). The backup "hige" is delayed and becomes up-to-date
    at (4).

       conn=8 op=142 ADD dn="cn=Test141,ou=People,o=University of M..
       conn=8 op=74 BACKUP sync=0 backup=hige
       conn=8 op=75 BACKUP sync=0 backup=hige
  (1)  conn=8 op=142 BACKUP sync=1 backup=hoge
       conn=8 op=76 BACKUP sync=0 backup=hige
       conn=8 op=143 ADD dn="cn=Test142,ou=People,o=University of M..
  (2)  conn=8 op=142 RESULT tag=105 err=0 text=

  (3)         < Suspending "hoge" .... >

       conn=8 op=77 BACKUP sync=0 backup=hige
       conn=8 op=78 BACKUP sync=0 backup=hige
       conn=8 op=79 BACKUP sync=0 backup=hige


       conn=8 op=141 BACKUP sync=0 backup=hige
  (4)  conn=8 op=142 BACKUP sync=1 backup=hige
  (5)  conn=8 op=143 BACKUP sync=1 backup=hige
  (6)  conn=8 op=143 RESULT tag=105 err=0 text=

    Because "hoge" is suspended at (3), the master server can't get
    any more results from "hoge", so it waits for "hige" to become
    up-to-date at (4). After "hige" is up-to-date, the master sends
    operation "143" to "hige" and gets the result at (5). After (5),
    because the master now has one success result, it sends RESULT
    err=0 to the client at (6).

Even though the implementation still has some bugs, I can already use
this replication engine in my OpenLDAP environment. I still have to
think about the following topics:

    * considerations about rfc3389.
    * multi-master environment.
    * any others?

I welcome any opinions, suggestions or comments. If there is no
serious problem, I'd like to merge this feature into OpenLDAP itself.
We will release this little big patch under the same terms as the
OpenLDAP Public License.

Thanks for your excellent efforts and also for reading this mail.

Masato Taruishi <taru@valinux.co.jp>