
Re: (ITS#9015) Replication goes haywire querying promoted master



On Tue, Apr 23, 2019 at 11:28:40PM +0000, quanah@symas.com wrote:
> --On Tuesday, April 23, 2019 8:56 PM +0200 Ondřej Kuzník
> <ondra@mistotebe.net> wrote:
>> Going by what I think I remember of the consumer code, it did this:
>> - on set up, it finds out there's no cookie to go by so it goes into
>>   refresh on the main DB
>> - main DB responds with success but no/empty cookie
>> - consumer starts over but again finds itself with no cookie, so it goes
>>   to step one
>>
>> But the consumer is actually up to date at that point as the search
>> suggested, so it should just go ahead and do the refreshAndPersist on
>> accesslog as it planned to originally. And as operations hit the main
>> DB, they will replicate accordingly, even if that were to happen after
>> the original search and before this one[0].
> 
> The consumer did not replicate the database as a part of the initial
> search, so I don't think it qualifies as "up to date".  I.e., its primary
> DB remains empty.

Ah, that wasn't mentioned; my assumption was that the source DB was
actually empty.

This comes from the ITS#8281 fix in commit
cd8ff37629012c1676ef79de164a159da9b2ae89.

There are two scenarios that need different treatment:
1. We start with no servers and no data and want to set up MMR from there
2. We start with a live server and want to move over to MMR

The bootstrap scenario is the one I outlined in my response above;
consumers should handle it correctly if they proceed as I described
(treat the successful refresh as complete and move on to the
refreshAndPersist on accesslog). However, the provider should make it
possible for the consumer to tell that this is what's happening, so it
should send an empty, cookie-less response only when the DB is empty.
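
To make the bootstrap case concrete, here is a minimal Python sketch of
the consumer decision as I understand it from the description above. It
is a toy model, not the slapd syncrepl code: `Action`, `run_consumer`
and `trust_empty_success` are hypothetical names invented for
illustration, and a provider cookie of `None` stands for "success with
no cookie".

```python
from enum import Enum
from typing import List, Optional


class Action(Enum):
    REFRESH_MAIN_DB = "refresh on main DB"
    PERSIST_ON_ACCESSLOG = "refreshAndPersist on accesslog"


def run_consumer(provider_cookie: Optional[str],
                 trust_empty_success: bool,
                 max_rounds: int = 5) -> List[Action]:
    """Return the sequence of actions a toy consumer takes.

    provider_cookie: cookie the provider returns after a refresh;
        None models a successful response with no cookie (empty DB).
    trust_empty_success: if True, a successful cookie-less refresh is
        treated as "nothing to fetch" and the consumer moves on.
    max_rounds: safety bound so the looping case terminates here.
    """
    cookie: Optional[str] = None   # consumer starts with no cookie
    refreshed = False              # has a refresh already succeeded?
    trace: List[Action] = []

    for _ in range(max_rounds):
        must_refresh = cookie is None and not (trust_empty_success and refreshed)
        if must_refresh:
            # No cookie to go by: fall back to a refresh on the main DB.
            trace.append(Action.REFRESH_MAIN_DB)
            cookie = provider_cookie
            refreshed = True
            continue
        # Either we hold a cookie, or we accept the empty success.
        trace.append(Action.PERSIST_ON_ACCESSLOG)
        break
    return trace
```

With `trust_empty_success=False` and an empty provider, the trace is
one refresh after another until `max_rounds` runs out, which is the
loop reported here. With it set to `True`, the consumer does a single
refresh and then moves to the accesslog. That is only safe if the
provider sends the cookie-less response solely for an empty DB.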

The conversion to MMR should be dealt with differently. AFAIK, the
correct order of setup is:
- prepare original master:
  - serverID == 0
  - add syncprov
  - change serverID accordingly
- connect other servers:
  - set serverIDs
  - add olcSyncrepl everywhere (including original master)
- wait for things to settle, you're done

I agree that we need to make sure that not following the above order
still results in either a helpful error or a setup that Just Works(tm).

Regards,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP