Scaling Syncrepl



In my environment I have a need to synchronize from a single master to 125 globally distributed read-only consumers.

I've attempted this in two ways and run into problems with both.

First, I attempted a multi-tier replication strategy in which the master syncs to a set of regional consumers, each of which in turn acts as a producer for around 20 slaves.  It seems that a server should be able to act as both a producer and a consumer, but in my experience with 2.4.25 this causes a repeatable segfault within a day's time.  (test_filter() is passed a NULL filter in syncrepl.c)  I think this would probably be the best solution if I could resolve the segfault issue.
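To put rough numbers on the tiered layout, here is a small sizing sketch. It assumes all 125 consumers are leaf slaves and that they are split as evenly as possible; the function name plan_tiers and the even split are only for illustration and are not part of my actual setup.

```python
import math


def plan_tiers(total_consumers, per_regional):
    """Return the number of slaves behind each regional server.

    Assumption: every one of `total_consumers` is a leaf slave, and
    `per_regional` is the most slaves one regional server should feed.
    """
    # Smallest number of regional servers that keeps each under the cap.
    regional_count = math.ceil(total_consumers / per_regional)
    # Spread the slaves evenly; the first `extra` servers take one more.
    base, extra = divmod(total_consumers, regional_count)
    return [base + 1] * extra + [base] * (regional_count - extra)


sizes = plan_tiers(125, 20)
print(len(sizes), sizes)
```

With these assumptions the master only has to serve a handful of direct consumers instead of 125.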

The other option I've tried is pointing all 125 slaves at a single master.  This works if I bring the slaves up gradually, but if they all attempt to connect at once (for example, after a master restart), the initial sync process seems to monopolize a thread per replica, which causes any other searches to fail for more than 30 seconds.  Bumping the threads up to over 125 seems to solve the issue on a test machine, but I'm hesitant to do this on the production master, which is used heavily for a variety of other purposes.
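As an illustration of what "bringing the slaves up gradually" could look like if it were scripted, here is a minimal sketch that spreads consumer start times evenly over a window. The function name stagger_delays and the 600 second window are assumptions for the example; this is not something slapd does on its own, and each slave would need a wrapper that sleeps for its delay before starting.

```python
def stagger_delays(consumer_count, window_seconds):
    """Return one start delay (in seconds) per consumer.

    Delays are spaced evenly from 0 up to, but not including,
    `window_seconds`, so no two consumers start at the same moment.
    """
    if consumer_count <= 0:
        return []
    step = window_seconds / consumer_count
    return [index * step for index in range(consumer_count)]


# Hypothetical usage: 125 slaves spread across a 10 minute window.
delays = stagger_delays(125, 600)
print(delays[:3], delays[-1])
```

This only helps when I control the restart order; it does not address slaves that all reconnect on their own after the master comes back.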

Can anyone offer advice on how I could go about resolving these issues or other methods for successfully replicating to this many slaves?

Thanks,
Duncan.