
Re: Multiple syncrepl problems



Aaron Richton wrote:

Darren Gamble <darren.gamble@sjrb.ca> wrote:


thought we'd give that a try. We brought up a fresh 2.2.20 consumer
against the 2.2.19 provider, but no entries get replicated at all. The


and Quanah wrote:


My own opinion is I would wait for OpenLDAP 2.3 to use syncRepl. Note that
a number of bugs I discovered in syncRepl in 2.3 were fixed in OL 2.2.20 as
well by Howard Chu, so you may want to look at upgrading.


To clarify a previous comment: the majority of syncrepl fixes in 2.2.20 are consumer-side. There is only one provider-related fix in this release, and it only ensures proper cleanup of a broken persistent connection, so it won't have much visible impact on normal provider operations.


We use syncRepl in production, but I'll admit to quite a bit of teething. (This is well documented in the OpenLDAP ITS.) To blame any one bug would be misleadingly simplistic, but with that caveat, running the fix from slapd/sl_malloc.c rev 1.23 on the provider cleared our show-stopper. You're more fortunate than I was in that 2.2.20 includes that fix off the shelf.

I note that you kept your provider 2.2.19. I'd try everything as 2.2.20,
and go from there. It might also be prudent to reload your slave databases
once you've got 2.2.20 squared away, just in case any of the previous
syncRepl bugs somehow resulted in inaccurate data.

The 2.3 syncRepl improvements look interesting. Then again, my build of
2.3.0alpha failed "make test," but it looks like that might already be
cleared in HEAD. Hopefully we'll have a few more 2.3 alpha cycles that
keep syncRepl in a good working state. (Although personally, I'm now so
happy with 2.2.20, 2.3 will be a cautious move. I wouldn't have said that
a month ago.)

I think your and Quanah's usage and bug reports have been invaluable in getting things stabilized here, but I personally would not be comfortable using 2.2 syncrepl in production. I fixed enough bugs to get the self-tests passing cleanly, but the consumer code is still a mess. I found a couple of inconsistencies that are likely the result of misapplied patches, leading to inexplicable differences between the 2.2 and HEAD code (i.e., beyond the expected differences due to feature changes). As much of this consumer code is still present in 2.3, I expect it will be a few more revisions before syncrepl is really usable. I know for a fact that certain Modify/ModDN operations will not propagate correctly in 2.2. These are fixed in 2.3, and the problem only arises when you're doing selective replication with filters and other constraints in effect.

I've been away from the code for a couple of weeks, so I don't know how well things are running at the moment. I see a few new ITSs already, which is a bit bothersome, since it was all working when I left. Hopefully it won't take long to get it all settled.

--
 -- Howard Chu
 Chief Architect, Symas Corp.       Director, Highland Sun
 http://www.symas.com               http://highlandsun.com/hyc
 Symas: Premier OpenSource Development and Support