
Re: Partial replication

On Thu, Apr 01, 2010 at 09:53:07PM +0200, Zdenek Styblik wrote:

>>> you want to replicate. So, let's say you use cn=mirrorA,dc=domain,dc=tld
>>> for replication, then allow this cn=mirrorA to read only
>>> o=support,dc=example,dc=com and o=location_A,dc=example,dc=com, but nowhere
>>> else.
>> I have used that technique for a fairly complex design with a central
>> office and many small satellites. It works OK *provided* you never change
>> the list of entries that can be seen by the replicas. The syncrepl
>> system has no way to evaluate the effect of an ACL change (and probably
>> no way to know that one has happened).
> Could you please elaborate more on this one?

My design requirements were similar to Joe's: I had a large central
server holding the master data for a lot of customers. Each customer
needed a local replica of their own data plus some subset of the
service-provider data. In my case the subset was not even complete
subtrees: the customers were allowed to see certain attributes of
certain entries only. I had to protect against the possibility that
someone might modify the config on a customer server to obtain data
that they should not have.

As there was already a comprehensive default-deny access-control
policy in place, I just factored in the replica servers as principals
with the right to see all data that should be replicated to that site
and nothing else. That meant that every replica server could have an
identical syncrepl clause which just copies everything it can see from
the entire DIT.
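A minimal sketch of that arrangement in slapd.conf syntax (the DNs, hostname, and credentials here are illustrative, not from the original setup). The ACLs live on the provider; the identical syncrepl clause goes in each consumer's database section:

```
# Provider: default-deny policy. Each replica identity may read only
# the subtrees destined for its site; everything else falls through.
access to dn.subtree="o=support,dc=example,dc=com"
    by dn.exact="cn=mirrorA,dc=example,dc=com" read
    by * none

access to dn.subtree="o=location_A,dc=example,dc=com"
    by dn.exact="cn=mirrorA,dc=example,dc=com" read
    by * none

# Consumer: the same clause on every replica -- copy whatever the
# bind identity is allowed to see, starting from the DIT root.
syncrepl rid=001
    provider=ldap://master.example.com
    type=refreshAndPersist
    searchbase="dc=example,dc=com"
    filter="(objectClass=*)"
    scope=sub
    bindmethod=simple
    binddn="cn=mirrorA,dc=example,dc=com"
    credentials=secret
    retry="60 +"
```

Because the subsetting is done entirely by the ACLs, adding a new site means adding one bind identity and its ACL clauses on the provider, with no per-site syncrepl configuration.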

The downside is that if any access permissions change then the
replicas may not reflect the correct new subset of data.

> Because I'd say if you refuse 
> access later to some DN then it must be like DN has been deleted. Same goes 
> for adding. I mean, syncrepl won't see data. And it checks, well it should 
> check, for changes in some regular intervals, right?

The problem is that syncrepl does not check every entry exhaustively.
That would be very inefficient (though I would like a way to force it
periodically). The master server maintains something like a timestamp
on the whole DIT, and when the replica server connects the two servers
just compare timestamps and transfer the things that have changed in
the interval between them. (This is a gross simplification of the
actual protocol, but close enough for this discussion).
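As a toy illustration of that simplification (not the real protocol), here is a Python sketch where each entry carries a change timestamp and the consumer presents a cookie holding the highest timestamp it has already seen:

```python
# Toy model of timestamp-driven replication: the provider returns only
# entries whose change stamp is newer than the consumer's cookie.

def sync(provider_entries, cookie):
    """provider_entries: dict mapping DN -> (timestamp, data).
    cookie: highest timestamp the consumer has already seen."""
    changed = {dn: (ts, data)
               for dn, (ts, data) in provider_entries.items()
               if ts > cookie}
    # The new cookie is the highest timestamp now present on the provider.
    new_cookie = max((ts for ts, _ in provider_entries.values()),
                     default=cookie)
    return changed, new_cookie

entries = {
    "cn=alice": (100, "old"),
    "cn=bob":   (205, "updated"),
}

# Consumer last synced at timestamp 150: only bob is transferred.
changed, cookie = sync(entries, 150)
```

Only entries with a timestamp newer than the cookie ever cross the wire, which is exactly why anything that changes *without* touching a timestamp is invisible to the consumer.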

Now imagine that I change an ACL which affects the visibility of some
entries. The entries themselves have not changed, so the timestamps do
not change and the replication process will not know that the replica
data should change.

Worse still, I might change the membership of a group that is
referenced in an ACL. The replication process would transfer the group
but would not know that some other entries have changed visibility.
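Continuing the same toy model, a sketch of why the ACL change slips through: visibility is applied as a filter on top of the timestamp comparison, but an ACL change moves no timestamps, so the affected entries are never re-examined (all names here are hypothetical):

```python
# Toy model: the provider filters by timestamp first, then by visibility.
# An ACL change alters `visible` but not the entry timestamps, so the
# consumer's cookie-based query never picks the affected entries up.

def sync(entries, cookie, visible):
    """entries: DN -> (timestamp, data); visible: set of DNs the
    replica identity may read under the current ACLs."""
    return {dn: data for dn, (ts, data) in entries.items()
            if ts > cookie and dn in visible}

entries = {"cn=alice": (100, "secret"), "cn=bob": (120, "public")}

# Initial full sync from cookie 0, while the ACL allows both entries.
first = sync(entries, 0, {"cn=alice", "cn=bob"})

# ACL change: alice is no longer visible. The entries themselves are
# untouched, so the consumer (cookie now 120) sees no changes at all --
# and quietly keeps its stale copy of alice.
after_acl_change = sync(entries, 120, {"cn=bob"})
```

The only reliable fix in this model is to discard the cookie and resynchronise from scratch, which is the operational pitfall the designer has to plan for.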

> I have no need for nor experience with this, yet it's somewhat interesting.

It is a powerful technique, but the designer *and operators* of such a
system must be aware of the pitfalls.

> ACLs of any kind in OpenLDAP are kinda ... PITA, no offense to anybody!!! :) 
> It just needs a lot of work to maintain and stuff (please please, no 
> bashing).

ACLs of any kind in any system (LDAP, file system, RDBMS etc) can be
hard to get right and harder to modify correctly at a later date. It
all depends on the policy that you are trying to implement. You should
think of ACLs as programs and expect to need programmer-level skill to
work on them. You may find this paper helpful:


Of all the LDAP servers that I have worked with, I find OpenLDAP's
ACLs are the easiest for implementing non-trivial policies.

|                 From Andrew Findlay, Skills 1st Ltd                 |
| Consultant in large-scale systems, networks, and directory services |
|     http://www.skills-1st.co.uk/                +44 1628 782565     |