RE: A question of replication robustness.
> > In an ideal world, S1 would somehow contact M at restart time to check
> > for updates (if it knows about M, say through an updateref or the
> > like...). (it would also be nice if the slave knew enough to get a full
> > copy of the database from a cold start. But that might require a lot
> > more intelligence.)
> Failed updates are stored in the hostname:port.rej file. `man slurpd` to
> find out how to force it to then perform those updates manually from the
> reject file.
Yes, I'm aware of that; that's what I meant about writing some sort of
monitor that replays them one at a time. But it seems somewhat suboptimal
for everyone to have to reimplement this functionality, and it seems like a
rather nasty failure model to end up with tree inconsistency after a node
outage. Obviously some failures aren't recoverable without human
intervention, but the case of a dead node should be.
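For reference, the manual replay I'm objecting to looks roughly like this
(the path is illustrative - the .rej file normally sits in slurpd's
replication directory, named after the slave's host and port):

```
# Stop the running slurpd first, then replay the rejection log in
# one-shot mode: -o tells slurpd to process the log once and exit,
# and -r names the rejection file to process.
slurpd -r /usr/local/var/openldap/replica/slave.example.com:389.rej -o
```

That has to be repeated for every slave that accumulated rejects, which is
exactly the kind of per-failure babysitting I'd like to avoid.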
This does, however, confirm my understanding that slurpd does not support
any sort of robust replication. My point was more along the lines of "this
is a really bad way of doing things, if I understand it correctly," not
"I don't know how to handle this."
I don't know about you, but I don't relish having to go in and manually
replay each rejected change set whenever an LDAP server fails - especially
when there isn't much good reason for it. DNS, for example, handles this far
more gracefully AND with relatively low latency if you use NOTIFY events
(and have your zone properly configured). I don't see a good reason in the
RFCs not to use a distribution method that's more pull-based. Was there a
good technical argument, or is it due to the historical basis on the UMich
LDAP code?
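To illustrate, the DNS behaviour I have in mind takes only a couple of
lines of BIND configuration on the master (zone name and slave address
below are hypothetical):

```
// named.conf on the master: send NOTIFY to the slave on every zone
// change; the slave then pulls the update via AXFR/IXFR, and if the
// master is down at the time, the slave simply retries on its own
// refresh/retry schedule - no manual replay needed.
zone "example.com" {
    type master;
    file "db.example.com";
    notify yes;
    also-notify { 192.0.2.2; };   // hypothetical slave address
};
```

Because the transfer is pull-based, a slave that was down during an update
catches up automatically the next time it polls or receives a NOTIFY.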
From my experience with replication, I've found that it's one of the places
where things are most likely to go awry - has any thought been put into
improving it? (I don't mean this critically; I'm genuinely curious. This
thread is probably quickly diverging towards the dev list, though...)