[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#4423) syncrepl'd MODs cause DELs



> No, leave it in, and leave the debug statements in. Maybe add the
> DEBUG_SYNC flag to your production runs for a while till the next time
> it occurs there.

DEBUG_SYNC is on across the cluster, slave5 is set up -d -1.
As was the case on my posts to -devel, TWO sync cycles are key.

/* the MOD is OK. */
Apr 29 16:43:03 master.rutgers.edu slapd[29488]: [ID 249368 local4.debug] conn=96107 op=8 MOD dn="uid=XXX,ou=People,dc=eden,dc=rutgers,dc=edu"
Apr 29 16:43:03 master.rutgers.edu slapd[29488]: [ID 396994 local4.debug] conn=96107 op=8 MOD attr=sambaSID sambaPrimaryGroupSID displayName sambaPwdCanChange sambaPwdMustChange sambaLMPassword sambaNTPassword sambaPwdLastSet sambaAcctFlags objectclass
Apr 29 16:43:03 master.rutgers.edu slapd[29488]: [ID 588225 local4.debug] conn=96107 op=8 RESULT tag=103 err=0 text=

/* deleted and inserted. life is happy. identical logs from all slaves. */
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 151447 local4.debug] syncrepl_del_nonpresent: be_delete uid=XXX,ou=People,dc=eden,dc=rutgers,dc=edu (0)
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 801593 local4.debug] syncrepl_entry: LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 384857 local4.debug] syncrepl_entry: inserted UUID 19e9f072-98b4-1029-9a56-ac51709f7bdd
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 181434 local4.debug] syncrepl_entry: be_search (0)
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 260518 local4.debug] syncrepl_entry: uid=XXX,ou=People,dc=eden,dc=rutgers,dc=edu
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 181434 local4.debug] syncrepl_entry: be_add (0)
Apr 29 16:43:35 slave3.rutgers.edu slapd[5860]: [ID 496601 local4.debug] do_syncrep2: LDAP_RES_SEARCH_RESULT

/* on to the next op. a different ADD is OK. */
Apr 29 16:48:05 master.rutgers.edu slapd[29488]: [ID 585459 local4.debug] conn=96288 op=2 ADD dn="uid=YYY,ou=People,dc=eden,dc=rutgers,dc=edu"
Apr 29 16:48:06 master.rutgers.edu slapd[29488]: [ID 588225 local4.debug] conn=96288 op=2 RESULT tag=105 err=0 text=

/* time to replicate! identical logs from all slaves again. */
Apr 29 16:48:35 master.rutgers.edu slapd[29488]: [ID 545199 local4.debug] Entry uid=XXX,ou=People,dc=eden,dc=rutgers,dc=edu CSN 20060429204303Z#000000#00#000000 matches ctx 20060429204303Z#000000#00#000000
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 968320 local4.debug] do_syncrep2: LDAP_RES_INTERMEDIATE - SYNC_ID_SET
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 151447 local4.debug] syncrepl_del_nonpresent: be_delete uid=XXX,ou=People,dc=eden,dc=rutgers,dc=edu (0)
/* and we're screwed. */
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 801593 local4.debug] syncrepl_entry: LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD)
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 384857 local4.debug] syncrepl_entry: inserted UUID 33d368c4-6c0d-102a-8f8d-7734538fd682
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 181434 local4.debug] syncrepl_entry: be_search (0)
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 260518 local4.debug] syncrepl_entry: uid=YYY,ou=People,dc=eden,dc=rutgers,dc=edu
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 181434 local4.debug] syncrepl_entry: be_add (0)
Apr 29 16:48:35 slave3.rutgers.edu slapd[5860]: [ID 496601 local4.debug] do_syncrep2: LDAP_RES_SEARCH_RESULT

/* now no slaves have the entry. to wit: */
Apr 29 17:08:58 slave3.rutgers.edu slapd[5860]: [ID 469902 local4.debug] conn=248891 op=1515 SRCH base="dc=rutgers,dc=edu" scope=2 deref=0 filter="(&(|(objectClass=posixAccount))(|(uid=XXX)))"
Apr 29 17:08:58 slave3.rutgers.edu slapd[5860]: [ID 744844 local4.debug] conn=248891 op=1515 SRCH attr=uid cn loginShell uidNumber gidNumber shadowLastChange shadowExpire
Apr 29 17:08:58 slave3.rutgers.edu slapd[5860]: [ID 167594 local4.debug] conn=248891 op=1515 SEARCH RESULT tag=101 err=0 nentries=0 text=


-d -1 from slave5 during uid=YYY replication:

[slave5 asks]
  0000:  30 81 9a 02 01 03 63 3f  04 19 64 63 3d 65 64 65   0.....c?..dc=ede
  0010:  6e 2c 64 63 3d 72 75 74  67 65 72 73 2c 64 63 3d   n,dc=rutgers,dc=
  0020:  65 64 75 0a 01 02 0a 01  00 02 01 00 02 01 00 01   edu.............
  0030:  01 00 87 0b 6f 62 6a 65  63 74 63 6c 61 73 73 30   ....objectclass0
  0040:  06 04 01 2a 04 01 2b a0  54 30 52 04 18 31 2e 33   ...*..+.T0R..1.3
  0050:  2e 36 2e 31 2e 34 2e 31  2e 34 32 30 33 2e 31 2e   .6.1.4.1.4203.1.
  0060:  39 2e 31 2e 31 04 36 30  34 0a 01 01 04 2c 63 73   9.1.1.604....,cs
  0070:  6e 3d 32 30 30 36 30 34  32 39 32 30 34 33 30 33   n=20060429204303
  0080:  5a 23 30 30 30 30 30 30  23 30 30 23 30 30 30 30   Z#000000#00#0000
  0090:  30 30 2c 72 69 64 3d 31  30 31 01 01 ff            00,rid=101...

[first response]
ldap_read: want=52, got=52
  0000:  18 31 2e 33 2e 36 2e 31  2e 34 2e 31 2e 34 32 30   .1.3.6.1.4.1.420
  0010:  33 2e 31 2e 39 2e 31 2e  34 81 19 a3 17 01 01 ff   3.1.9.1.4.......
  0020:  31 12 04 10 19 e9 f0 72  98 b4 10 29 9a 56 ac 51   1......r...).V.Q
  0030:  70 9f 7b dd                                        p.{.
[there's poor uid=XXX. second response is uid=YYY of course]


9.1.1.604....,csn=20060429204303Z#000000#00#000000,rid=101... from trace
looks good vs.    20060429204303Z#000000#00#000000 from master log and
OK vs. entryCSN:  20060429204303Z#000000#00#000000 from ldapsearch
and I'm sure you now see how this is a bit confusing to me.