
Re: Some interesting thoughts



Howard Chu wrote:


That seems simple enough to do, except for the "fast" part: we definitely have work to do on speeding up indexing, and Jong has a number of items addressing that. Of course, unambiguously identifying a replica to update requires more than just a hostname, but that's a minor detail. Also, I have to wonder why such a feature is needed in a properly operating replication environment.

It isn't. It's designed for initializing a new, empty replica. The rate at which changes can normally be propagated is roughly the same as the rate at which an LDAP client can add entries (say 50/second or so). The rate at which entries can be handled via the fast initialization path is an order of magnitude higher (say 1000/second or so). When you have a couple million entries, it makes a big difference to the operator experience if you can initialize a new replica in a few hours rather than days and days.

For syncrepl, a freshly installed consumer will naturally pull down a complete copy of the database as its first action, so "initialization" should not be an issue. And if the replication mechanism is doing its job correctly, then "re-initialization" is not an issue.

Right. Sounds like you don't need this feature if you already have a working replica initialization process.

I suppose that could save some time. This is something I've been wondering about (in the context of "centipede") as well, coming up with a portable representation of an index that can be easily distributed. This would allow sparse replicas to fully evaluate search filters locally, then request specific matching entries from the remote masters. (The issue is most apparent in our translucent overlay, but probably has wider application.) If you represent an index as a set of key/value pairs, it's pretty obvious that the values must be entryUUIDs, but what would the keys be? Just AVAs? I guess the problem is that a portable representation would be just as voluminous as the original entry data, thus defeating some of the point of a "sparse" replica.
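
To make the key/value idea concrete, here is a minimal sketch, assuming equality indexing only and AVAs as keys. The names (build_index, eq_candidates), the sample entries, and the lowercase normalization are all made up for illustration; real matching rules are per-attribute, and this is not how slapd stores its indexes.

```python
from collections import defaultdict


def normalize(attr, value):
    # Assumption: case-insensitive, whitespace-trimmed matching for every
    # attribute. Real LDAP matching rules depend on the attribute type.
    return (attr.lower(), value.strip().lower())


def build_index(entries, indexed_attrs):
    """Map each AVA (attribute, normalized value) to a set of entryUUIDs."""
    index = defaultdict(set)
    for entry in entries:
        for attr in indexed_attrs:
            for value in entry.get(attr, []):
                index[normalize(attr, value)].add(entry["entryUUID"])
    return index


def eq_candidates(index, attr, value):
    """entryUUIDs matching an equality assertion (attr=value)."""
    return index.get(normalize(attr, value), set())


# Hypothetical entries, as the master would see them.
entries = [
    {"entryUUID": "uuid-1", "cn": ["Alice Smith"], "ou": ["Engineering"]},
    {"entryUUID": "uuid-2", "cn": ["Bob Jones"], "ou": ["Engineering"]},
    {"entryUUID": "uuid-3", "cn": ["Alice Smith"], "ou": ["Sales"]},
]

index = build_index(entries, ["cn", "ou"])

# Evaluate (&(ou=Engineering)(cn=Alice Smith)) locally by intersecting sets;
# only the resulting entryUUIDs would need to be fetched from the master.
in_engineering = eq_candidates(index, "ou", "engineering")
matches = in_engineering & eq_candidates(index, "cn", "Alice Smith")
```

Note that every key carries the attribute value itself, so an index over all attributes holds about as much data as the entries do, which is the volume problem mentioned above.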

Also watch out for big/little endian issues in the BDB metadata.
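
As a generic illustration of why byte order matters when binary files move between machines, here is a short sketch using Python's struct module. It says nothing about the actual BDB file layout; the 32-bit field is just an assumed example.

```python
import struct
import sys

value = 1

# The same 32-bit unsigned integer in the two byte orders.
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# What this host would write if it dumped the integer in native order.
native = struct.pack("=I", value)

# A big-endian reader interpreting bytes written by a little-endian host
# gets a very different number.
misread = struct.unpack(">I", little)[0]

print(sys.byteorder, little.hex(), big.hex(), misread)
```

Writing such fields in one fixed byte order, or recording the byte order alongside the data, avoids the misread.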