Re: (ITS#8493) Under heavy modrdn load, masters desync
--On Saturday, September 03, 2016 4:51 PM +0000 firstname.lastname@example.org wrote:
> --On Saturday, September 03, 2016 6:15 AM +0000 email@example.com wrote:
>> Full_Name: Quanah Gibson-Mount
>> Version: 2.4.44+ITS8432
>> OS: Linux 2.6
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (188.8.131.52)
>> Trying to reproduce another ITS, I discovered a new bug. When doing
>> MODRDN ops on one master, the other master keeps going out of sync.
>> Sep 3 01:12:17 zre-ldap002 slapd: syncrepl_message_to_op: rid=100 be_modrdn uid=user.924,ou=people,dc=zre-ldap002,dc=eng,dc=zimbra,dc=com (32)
>> Sep 3 01:12:17 zre-ldap002 slapd: do_syncrep2: rid=100 delta-sync lost sync on (reqStart=20160903051215.747829Z,cn=accesslog), switching to REFRESH
> Note that this master also has a replica. The replica never rejected a
> single one of these MODRDNs coming from this master. Which means that either:
> a) The data on the master spontaneously corrupted at some point, or
> b) The master wrote the MODRDNs to the accesslog, which the replica
> picked up, but the master did not itself apply the MODRDN changes to its
> own database.
> In the end, of the 50,000 MODRDNs it was processing, it threw an error 32
> for 441 of them.
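For anyone wanting to reproduce that count from syslog, here is a minimal sketch. It assumes each log entry sits on a single line in the same shape as the syncrepl_message_to_op line quoted above; the function name count_failed_modrdns and the regex are illustrative only and are not part of OpenLDAP.

```python
import re

# Matches the be_modrdn result line in the shape quoted in this report, e.g.
#   ... slapd: syncrepl_message_to_op: rid=100 be_modrdn uid=...,dc=com (32)
# The pattern is an assumption based on that one sample line.
MODRDN_RESULT = re.compile(
    r"syncrepl_message_to_op: rid=(?P<rid>\d+) be_modrdn (?P<dn>.+) \((?P<code>\d+)\)\s*$"
)


def count_failed_modrdns(lines, code=32):
    """Return the DNs (one per log line) whose replicated MODRDN ended with `code`."""
    failed = []
    for line in lines:
        match = MODRDN_RESULT.search(line)
        if match and int(match.group("code")) == code:
            failed.append(match.group("dn"))
    return failed
```

Feeding it the lines of the consumer's log and taking len() of the result gives the number of rejected MODRDNs; the returned DNs can then be checked by hand against the provider.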
After the master that was not accepting direct writes resynced with the
master accepting writes, it still had 403 of 50,000 entries wrong. So did its
replica. So the master isn't writing the changes to the accesslog, which
makes it option c: the master rejects a valid op, never syncs correctly, and
in the end two thirds of my servers have invalid databases.
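A count of wrong entries like the one above can be obtained by comparing the DN lists of two servers. The sketch below assumes the DNs have already been extracted from each server (for example from an export of each database) into plain lists of strings. The helper names normalize_dn and dn_diff are hypothetical, and the normalization is deliberately simplistic (case and surrounding whitespace only), not full LDAP DN matching.

```python
def normalize_dn(dn):
    # Simplification: real DN comparison depends on per-attribute matching
    # rules; here we only fold case and trim surrounding whitespace.
    return dn.strip().lower()


def dn_diff(provider_dns, consumer_dns):
    """Compare two DN lists.

    Returns (missing, stale):
      missing - DNs present on the provider but absent on the consumer
      stale   - DNs present on the consumer but absent on the provider
    """
    provider = {normalize_dn(dn) for dn in provider_dns}
    consumer = {normalize_dn(dn) for dn in consumer_dns}
    return sorted(provider - consumer), sorted(consumer - provider)
```

Under these assumptions, a MODRDN that was never applied shows up as one entry in each list: the new DN as missing and the old DN as stale.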
I see zero indication that using a sessionlog works around
<http://www.openldap.org/its/index.cgi/?findid=8125> at all. I still end
up with missed entries even with everything *in* the sessionlog.