Re: Partial replication
On 04/06/10 14:55, Andrew Findlay wrote:
On Thu, Apr 01, 2010 at 09:53:07PM +0200, Zdenek Styblik wrote:
you want to replicate. So, let's say you use cn=mirrorA,dc=domain,dc=tld
for replication, then allow this cn=mirrorA to read only
o=support,dc=example,dc=com and o=location_A,dc=example,dc=com, but nowhere
I have used that technique for a fairly complex design with a central
office and many small satellites. It works OK *provided* you never change
the list of entries that can be seen by the replicas. The syncrepl
system has no way to evaluate the effect of an ACL change (and probably
no way to know that one has happened).
Could you please elaborate more on this one?
My design requirements were similar to Joe's: I had a large central
server holding the master data for a lot of customers. Each customer
needed a local replica of their own data plus some subset of the
service-provider data. In my case the subset was not even complete
subtrees: the customers were allowed to see certain attributes of
certain entries only. I had to protect against the possibility that
someone might modify the config on a customer server to obtain data
that they should not have.
As there was already a comprehensive default-deny access-control
policy in place, I just factored in the replica servers as principals
with the right to see all data that should be replicated to that site
and nothing else. That meant that every replica server could have an
identical syncrepl clause which just copies everything it can see from
the entire DIT.
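
To make the idea concrete, here is a rough Python sketch of a default-deny policy with the replica servers as principals. It is only a model, not OpenLDAP ACL syntax: the RULES table, the function names, the sample entries and the attribute names are all made up for illustration. Only cn=mirrorA and the two o=... subtrees come from earlier in this thread.

```python
# Hypothetical rule table: principal -> list of (subtree suffix, readable attributes).
# "*" means every attribute. Anything not listed is denied.
RULES = {
    "cn=mirrorA,dc=domain,dc=tld": [
        ("o=location_A,dc=example,dc=com", "*"),
        ("o=support,dc=example,dc=com", {"cn", "mail"}),
    ],
}


def readable(principal, dn, attrs):
    """Return the attributes `principal` may read from the entry, or None if denied."""
    for suffix, allowed in RULES.get(principal, []):
        if dn == suffix or dn.endswith("," + suffix):
            if allowed == "*":
                return dict(attrs)
            return {k: v for k, v in attrs.items() if k in allowed}
    return None  # default deny


def replicate(principal, dit):
    """What an identical 'copy everything I can see' request yields for one principal."""
    result = {}
    for dn, attrs in dit.items():
        view = readable(principal, dn, attrs)
        if view is not None:
            result[dn] = view
    return result


# Made-up sample data.
DIT = {
    "cn=helpdesk,o=support,dc=example,dc=com": {
        "cn": "helpdesk", "mail": "help@example.com", "userPassword": "secret",
    },
    "cn=printer1,o=location_A,dc=example,dc=com": {"cn": "printer1"},
    "cn=printer9,o=location_B,dc=example,dc=com": {"cn": "printer9"},
}

copy_a = replicate("cn=mirrorA,dc=domain,dc=tld", DIT)
copy_unknown = replicate("cn=mirrorZ,dc=domain,dc=tld", DIT)
```

Both replicas run the same request. The difference in what they receive comes entirely from the rules on the provider, so changing the config on a customer server gains nothing.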
If I understood your reply correctly: I did not suggest anything other
than putting the ACLs on the provider side, not the consumer.
The downside is that if any access permissions change then the
replicas may not reflect the correct new subset of data.
Because I'd say that if you later refuse access to some DN, then it must
look as if that DN has been deleted. The same goes for adding. I mean,
syncrepl won't see the data. And it checks, or at least it should check,
for changes at regular intervals, right?
The problem is that syncrepl does not check every entry exhaustively.
That would be very inefficient (though I would like a way to force it
periodically). The master server maintains something like a timestamp
on the whole DIT, and when the replica server connects they just have
to compare timestamps and transfer things that have changed in the
interval between the two. (This is a gross simplification of the
actual protocol, but close enough for the discussion).
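
As a rough illustration of that simplified picture, here is a toy model in Python. It is not the syncrepl protocol: the Provider and Replica classes and the single integer stamp are assumptions made for the sketch, and deletions are not modelled at all.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Holds entries plus one counter that stands in for the DIT-wide stamp."""
    entries: dict = field(default_factory=dict)   # dn -> (stamp, attrs)
    stamp: int = 0

    def write(self, dn, attrs):
        self.stamp += 1
        self.entries[dn] = (self.stamp, dict(attrs))

    def changed_since(self, since):
        return {dn: attrs for dn, (s, attrs) in self.entries.items() if s > since}


@dataclass
class Replica:
    entries: dict = field(default_factory=dict)
    stamp: int = 0

    def sync(self, provider):
        """Fetch only what changed since the last sync; return how many entries moved."""
        changed = provider.changed_since(self.stamp)
        self.entries.update(changed)
        self.stamp = provider.stamp
        return len(changed)


provider = Provider()
provider.write("cn=a,o=support,dc=example,dc=com", {"cn": "a"})
provider.write("cn=b,o=support,dc=example,dc=com", {"cn": "b"})

replica = Replica()
first = replica.sync(provider)    # both entries are new to the replica
provider.write("cn=a,o=support,dc=example,dc=com", {"cn": "a", "mail": "a@example.com"})
second = replica.sync(provider)   # only the modified entry is sent
third = replica.sync(provider)    # nothing changed, nothing is sent
print(first, second, third)       # prints: 2 1 0
```

The point is that an entry whose own stamp has not moved is never looked at again.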
Now imagine that I change an ACL which affects the visibility of some
entries. The entries themselves have not changed, so the timestamps do
not change and the replication process will not know that the replica
data should change.
Worse still, I might change the membership of a group that is
referenced in an ACL. The replication process would transfer the group
but would not know that some other entries have changed visibility.
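
The group case can be shown with a small self-contained Python sketch. Again this is a toy model under the same "compare timestamps" simplification, not real syncrepl or real ACL evaluation. The group DN, the printer entry and the visible() rule are invented for the example.

```python
clock = 0
entries = {}   # provider data: dn -> (modified_stamp, attrs)

GROUP = "cn=site_a_readers,ou=groups,dc=example,dc=com"   # hypothetical group
REPLICA = "cn=mirrorA,dc=domain,dc=tld"
SUBTREE = "o=location_A,dc=example,dc=com"


def write(dn, attrs):
    """Modify an entry on the provider and bump its stamp."""
    global clock
    clock += 1
    entries[dn] = (clock, dict(attrs))


def visible(principal, dn):
    """Toy ACL: the group entry is readable; the subtree only by group members."""
    if dn == GROUP:
        return True
    members = entries[GROUP][1]["member"]
    return dn.endswith("," + SUBTREE) and principal in members


def sync(principal, replica, since):
    """Copy entries modified after `since` that the principal may read."""
    for dn, (modified, attrs) in entries.items():
        if modified > since and visible(principal, dn):
            replica[dn] = dict(attrs)
    return clock


write(GROUP, {"member": [REPLICA]})
write("cn=printer1," + SUBTREE, {"cn": "printer1"})

replica = {}
stamp = sync(REPLICA, replica, 0)       # replica receives the group and printer1

write(GROUP, {"member": []})            # revoke access by editing the group only
stamp = sync(REPLICA, replica, stamp)   # only the group entry is transferred

# Entries still held by the replica that it is no longer allowed to read.
stale = [dn for dn in replica if not visible(REPLICA, dn)]
print(stale)   # prints: ['cn=printer1,o=location_A,dc=example,dc=com']
```

The replica ends up with the new, empty group but keeps printer1, because printer1's own stamp never changed. Editing the ACL directly instead of the group would look the same, except that no entry at all would be transferred.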
To make it short: I take your word for it :) In other words, it's
probably done as well as it could be for the time being.
I've written down my assumptions and... that's probably all.
[some blabbering deleted/replaced here]
I have no need for nor experience with this, yet it's somewhat interesting.
It is a powerful technique, but the designer *and operators* of such a
system must be aware of the pitfalls.
ACLs of any kind in OpenLDAP are kinda ... PITA, no offense to anybody!!! :)
It just needs a lot of work to maintain and stuff (please please, no
ACLs of any kind in any system (LDAP, file system, RDBMS etc) can be
hard to get right and harder to modify correctly at a later date. It
all depends on the policy that you are trying to implement. You should
think of ACLs as programs and expect to need programmer-level skill to
work on them. You may find this paper helpful:
Of all the LDAP servers that I have worked with, I find OpenLDAP's
ACLs are the easiest for implementing non-trivial policies.
Well, right now all [2 :) ] replicas are 1:1 and they should maintain
the very same ACLs as the provider. I know this can be managed e.g. via
a batch script (or dynamic ACLs); still, that's what I meant.
But I agree, it's not much better anywhere else.
I guess the real fun will begin when/if I decide to replicate only the
data that is really needed by, and of some use to, a [certain] consumer.