Re: Syncrepl, syncprov, and contextCSN.

(This is the continuation of a direct email conversation with Rein Tollevik that instead belongs on this list)


On Oct 22, 2009, at 6:01 AM, Rein Tollevik wrote:

Tim Stewart wrote:

I ran across one of your messages on openldap-devel entitled "contextCSN interaction between syncrepl and syncprov" at this location:

This hasn't been implemented yet, but a configuration like the one you have should still work if properly configured.

I was wondering if the fix you are discussing has made it into the OpenLDAP source anywhere. I think I may be experiencing the same issue. Here is my configuration:
            Site 1                          Site 2
            ------                          ------
           Master for                      Master for
   ou=site1,dc=example,dc=org      ou=site2,dc=example,dc=org
           Slave for                       Slave for
       dc=example,dc=org               dc=example,dc=org
      (including ou=site2)            (including ou=site1)
               \                               /
                \                             /
                  ----------     ------------
                             \ /
                           Main Site
                           Master for
                           Slave for
                           Slave for

Assuming all the databases are glued together under dc=example,dc=org, this is a configuration similar to the one we use, and which is tested in test058. There are comments in that test script that describe the configuration. To succeed, you must ensure that:

Syncrepl on site1 and site2 is configured on dc=example,dc=org. The rootdn used on this backend must differ from the rootdn on the locally mastered backend; the other backends should use the same rootdn as the top dc=example,dc=org backend. I.e., on site1, all backends *except* ou=site1 should have the same rootdn.

On the master, syncrepl must be configured on each of the ou=site1 and ou=site2 backends, *not* the superior dc=example,dc=org. ACL rules should be used to prevent the user that syncrepl on site1 authenticates as from reading the ou=site1 backend that site1 itself is master for, and similarly for site2.

Hmm, I'm not sure I understand this point fully. Could you try stating it another way so my feeble brain can have another go?

Finally, it is *vital* that differing serverID directive values are used on each of the servers! Nor should any of them have a serverID of 0. Look at the contextCSN attribute (should be in dc=example,dc=org). You should have three values with differing serverID fields (the second-to-last field of each value).
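
[Editor's sketch, not part of the original exchange.] The contextCSN check described above can be automated roughly as follows. This assumes the usual four-field CSN layout (timestamp#count#serverID#mod, with the serverID in hexadecimal); the function names and the sample values in the usage note are hypothetical, and the values would normally come from reading contextCSN on dc=example,dc=org.

```python
def parse_csn(csn):
    """Split a CSN string into (timestamp, count, sid, mod).

    Assumes the four '#'-separated fields used by OpenLDAP 2.4 CSNs;
    the numeric fields are hexadecimal.
    """
    parts = csn.strip().split("#")
    if len(parts) != 4:
        raise ValueError(f"unexpected CSN format: {csn!r}")
    timestamp, count, sid, mod = parts
    return timestamp, int(count, 16), int(sid, 16), int(mod, 16)


def check_context_csns(values, expected_servers):
    """Return a list of problems found in a set of contextCSN values."""
    sids = [parse_csn(v)[2] for v in values]
    problems = []
    if 0 in sids:
        problems.append("a contextCSN carries serverID 0")
    if len(set(sids)) != len(sids):
        problems.append("duplicate serverID among contextCSN values")
    if len(set(sids)) != expected_servers:
        problems.append(
            f"expected {expected_servers} distinct serverIDs, found {len(set(sids))}"
        )
    return problems
```

For the three-server layout in this thread, calling check_context_csns(values, 3) on something like "20091022100100.000000Z#000000#001#000000" plus two more values with serverIDs 002 and 003 should return an empty list; a value with serverID 000, or fewer than three distinct serverIDs, is reported as a problem.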

Other than the ACLs above, the configuration you have described is identical to what I have configured and is now working (see two comments below for my mistake). I am about to extend my tests by adding sites 3, 4, and 5. I'll let you know how it goes.

Oh, and a final final note :-) Only the main site can be master for more than one backend, site1 and site2 are restricted to only be master for ou=site1 and ou=site2 respectively. This is the main problem which the message you referred to outlines a fix for.

You have confirmed what I thought was the case from reading your emails and the bug report. I emailed you initially because I didn't feel that my particular setup should trigger what you described and I was afraid my understanding was incomplete.

During a large initial import, ou=site1 is properly replicated over to Site 2 but ou=site2 is *not* properly replicated to Site 1 (or the other way around, depending on the order items are added).

Provided your configuration is as outlined above, *and* the initial replication is not interrupted but runs to completion, it should succeed. Oh, if you slapadd LDIF data, it must already include the operational attributes of the entries (entryUUID etc.), or it must be imported only on the server which is the master for the data. site1 and site2 need the top-level dc=example,dc=org entry in their database before importing the backend they are master for, so that they have somewhere to store their contextCSN attribute. You should also ensure that a contextCSN with the server's local serverID has been stored before starting to replicate from them (any modification to their locally mastered backend should fix that).
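
[Editor's sketch, not part of the original exchange.] Before running slapadd on more than one server, an LDIF file can be scanned for entries that lack operational attributes. The minimal check below assumes LF line endings and entries separated by blank lines. The function name is hypothetical, and requiring entryCSN alongside entryUUID is an assumption (the advice above only names "entryUUID etc.").

```python
# Attributes to require on every entry; entryCSN is an assumed addition.
REQUIRED = ("entryuuid", "entrycsn")


def entries_missing_operational_attrs(ldif_text, required=REQUIRED):
    """Return the DNs of LDIF entries that lack any of the required attributes."""
    missing = []
    for block in ldif_text.strip().split("\n\n"):
        attrs = set()
        dn = None
        for line in block.splitlines():
            # Skip folded continuation lines, comments, and lines without a name.
            if line.startswith((" ", "#")) or ":" not in line:
                continue
            name, value = line.split(":", 1)
            name = name.strip().lower()
            if name == "dn" and dn is None:
                # For base64 DNs ("dn:: ...") this keeps the encoded form.
                dn = value.lstrip(":").strip()
            attrs.add(name)
        # Blocks without a dn (e.g. a leading "version: 1") are not entries.
        if dn is not None and not all(r in attrs for r in required):
            missing.append(dn)
    return missing
```

An empty result suggests the file was produced by something like slapcat and carries the operational attributes; any DN returned is an entry that should only be loaded on the server that masters it.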

I was wrong. I was impatient and did not allow the daemons enough time to sync. I have learned to be patient and the replication has worked every time since. I am using ldapadd to populate the tree online and the data is about 3MB.

Thank you for the assistance.


Tim Stewart
Stoo Research