[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#4589) syncrepl DELs sometimes don't propagate



Full_Name: Aaron Richton
Version: 2.3.24
OS: Solaris 9
URL: 
Submission from: (NULL) (128.6.31.135)


Something not quite right with session log or nonpresent list or.......

Pretty much the same as the last time I played the syncrepl game, except this
time instead of entries missing from the slave, there are excessive entries on
the slave. Here's an example, although it's hardly illuminating:

[...]
Jun  8 15:18:09 master.rutgers.edu slapd[29239]: [ID 379477 local4.debug]
conn=361133 op=2 DEL dn="uid=VICTIM,ou=People,dc=rci,dc=rutgers,dc=edu"
[...]
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 848112 local4.debug]
conn=361315 fd=100 ACCEPT from IP=192.168.4.67:53492 (IP=0.0.0.0:389)
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 560212 local4.debug]
conn=361315 op=0 STARTTLS
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 875301 local4.debug]
conn=361315 op=0 RESULT oid= err=0 text=
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 105384 local4.debug]
conn=361315 fd=100 TLS established tls_ssf=256 ssf=256
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 215403 local4.debug]
conn=361315 op=1 BIND dn="cn=readEverything,dc=rci,dc=rutgers,dc=edu"
method=128
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 600343 local4.debug]
conn=361315 op=1 BIND dn="cn=readEverything,dc=rci,dc=rutgers,dc=edu"
mech=SIMPLE ssf=0
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 588225 local4.debug]
conn=361315 op=1 RESULT tag=97 err=0 text=
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 469902 local4.debug]
conn=361315 op=2 SRCH base="dc=rci,dc=rutgers,dc=edu" scope=2 deref=0
filter="(objectClass=*)"
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 744844 local4.debug]
conn=361315 op=2 SRCH attr=* +
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 545199 local4.debug] Entry
uid=BYSTANDER,ou=People,dc=rci,dc=rutgers,dc=edu CSN
20060608190309Z#000000#00#000000 matches ctx 20060608190309Z#000000#00#000000
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 167594 local4.debug]
conn=361315 op=2 SEARCH RESULT tag=101 err=0 nentries=0 text=
Jun  8 15:22:44 slave5.rutgers.edu slapd[12966]: [ID 496601 local4.debug]
do_syncrep2: LDAP_RES_SEARCH_RESULT
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 218904 local4.debug]
conn=361315 op=3 UNBIND
Jun  8 15:22:44 master.rutgers.edu slapd[29239]: [ID 952275 local4.debug]
conn=361315 fd=100 closed

and at the end of this, slave5 still has uid=VICTIM, which is undesirable.
slave5 is on "stats sync" debug. I think I should see a del_nonpresent in there,
no? So I'd assume the master is telling it something wrong. Not sure what else I
can do. I have:

-d -1 from slave5 (HUGE, running for weeks. I can try to slim it down.)
"stats sync" syslogs from all slaves
"stats sync" syslogs from the master

Everything's still running as it was/is, so if you want to dbx some syncprov
struct or something like that, I can do that. Of course it might not be in scope
still, but I can always delete more! (Although I'm fairly certain this isn't
100% of the time. I'll look harder. Might be like last time, need a certain
number of ops per sync or similar.)