[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: Issues with syncrepl and mirrormode



> Pierangelo Masarati wrote:

> This is probably an artifact of syncprov_qresp() which always allocates at
> least one byte for the queued CSN, thus the bv_val is never NULL. Which is
> why
> the test in compose_sync_cookie() was supposed to check BVISEMPTY, not
> BVISNULL.

I still strongly suspect that the BVISEMPTY test was just hiding a can of
worms.  I was running a sequence of add/delete on server 1; I'll call
producer the server that receives the sequence of add/delete, and consumer
the one that is replicated; however the trouble is caused when they act
the opposite: it's syncprov on the consumer that sends bogus stuff to
syncrepl on the producer.  With yesterday's fixes, and after making sure
each server was recognizing itself with the right SID (1 and 2), I caught
the consumer sending back a bad cookie when playing the log.  In any case,
there appears to be an erratic behavior when performing deletes.  What I
infer from the logs is that the consumer sends back modifications to the
producer with an incorrect cookie (sid=000), and either does not perform
deletes, or does not perform them in time, and this causes trouble.  I
attach sanitized logs of both producer and consumer, in case they help,
and I'll keep investigating.

p.



Ing. Pierangelo Masarati
OpenLDAP Core Team

SysNet s.r.l.
via Dossi, 8 - 27100 Pavia - ITALIA
http://www.sys-net.it
---------------------------------------
Office:  +39 02 23998309
Mobile:  +39 333 4963172
Email:   pierangelo.masarati@sys-net.it
---------------------------------------


	# The test starts...

Oct  3 11:32:27 producer slapd[6744]: conn=4 op=1 ADD dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093228.079707Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=1 RESULT tag=105 err=0 text= 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=2 DEL dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.288430Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=2 RESULT tag=107 err=0 text= 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=3 ADD dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=3 RESULT tag=105 err=0 text= 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.355817Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=4 DEL dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.404017Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=4 RESULT tag=107 err=0 text= 

	# ... here the consumer starts sending back stuff; note the sid=000 !?!

Oct  3 11:32:30 producer slapd[6744]: conn=4 op=9 ADD dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.714165Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: do_syncrep2: cookie=rid=002,sid=000 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.780601Z#000000#000#000000 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=9 RESULT tag=105 err=0 text= 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=10 DEL dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncrepl_entry: rid=2 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) 
Oct  3 11:32:30 producer slapd[6744]: syncrepl_entry: rid=2 be_search (0) 
Oct  3 11:32:31 producer slapd[6744]: syncrepl_entry: rid=2 ou=Test,dc=sys-net,dc=it 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=11 ADD dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.966140Z#000000#000#000000 
Oct  3 11:32:31 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093230.966140Z#000000#000#000000 
Oct  3 11:32:31 producer slapd[6744]: conn=4 op=11 RESULT tag=105 err=68 text= 
Oct  3 11:32:31 producer slapd[6744]: conn=4 op=12 DEL dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:30 producer slapd[6744]: conn=4 op=10 RESULT tag=107 err=0 text= 
Oct  3 11:32:31 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093231.221588Z#000000#000#000000 
Oct  3 11:32:31 producer slapd[6744]: syncrepl_entry: rid=2 be_add (0) 
Oct  3 11:32:31 producer slapd[6744]: conn=4 op=12 RESULT tag=107 err=0 text= 

	# ... and finally the crash; note the csn= !?!

Oct  3 11:32:37 producer slapd[6744]: conn=4 op=65 ADD dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:37 producer slapd[6744]: syncprov_sendresp: cookie=rid=001,sid=001,csn=20071003093237.815721Z#000000#000#000000 
Oct  3 11:32:37 producer slapd[6744]: conn=4 op=66 DEL dn="ou=Test,dc=sys-net,dc=it" 
Oct  3 11:32:37 producer slapd[6744]: conn=4 op=65 RESULT tag=105 err=0 text= 
Oct  3 11:32:37 producer slapd[6744]: do_syncrep2: rid=2 LDAP_RES_INTERMEDIATE - SYNC_ID_SET 
Oct  3 11:32:37 producer slapd[6744]: do_syncrep2: cookie=rid=002,sid=000,csn= 

	# game over...

	# The consumer sees some activity...

Oct  3 11:32:30 consumer slapd[4353]: do_syncrep2: cookie=rid=001,sid=001,csn=2
0071003093228.079707Z#000000#000#000000 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 LDAP_RES_SEARCH_ENT
RY(LDAP_SYNC_ADD) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_search (0) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 ou=Test,dc=sys-net,dc=it 

	# note here sending response with sid=000 !?!

Oct  3 11:32:30 consumer slapd[4353]: syncprov_sendresp: cookie=rid=002,sid=000
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_add (0) 
Oct  3 11:32:30 consumer slapd[4353]: do_syncrep2: cookie=rid=001,sid=001,csn=2
0071003093230.288430Z#000000#000#000000 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_DELETE) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_search (0) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 (null) 
Oct  3 11:32:30 consumer slapd[4353]: do_syncrep2: cookie=rid=001,sid=001,csn=2
0071003093230.355817Z#000000#000#000000 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 LDAP_RES_SEARCH_ENTRY(LDAP_SYNC_ADD) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_search (0) 
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 ou=Test,dc=sys-net,dc=it 

	# apparently the delete did not succeed !?! (again the bogus sid)

Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_add (68) 
Oct  3 11:32:30 consumer slapd[4353]: syncprov_sendresp: cookie=rid=002,sid=000
Oct  3 11:32:30 consumer slapd[4353]: syncrepl_entry: rid=1 be_modify (0) 
Oct  3 11:32:31 consumer slapd[4353]: do_syncrep2: cookie=rid=001,sid=001,csn=2
0071003093230.404017Z#000000#000#000000 
Oct  3 11:32:31 consumer slapd[4353]: syncrepl_entry: rid=1 LDAP_RES_SEARCH_ENT
RY(LDAP_SYNC_DELETE) 
Oct  3 11:32:31 consumer slapd[4353]: syncrepl_entry: rid=1 be_search (0) 
Oct  3 11:32:31 consumer slapd[4353]: syncrepl_entry: rid=1 (null) 

	# ... a sync request comes in

Oct  3 11:32:37 consumer slapd[4353]: conn=2 op=1 SRCH
base="dc=a,dc=prod,dc=ac
talis" scope=2 deref=0 filter="(objectClass=*)" 
Oct  3 11:32:37 consumer slapd[4353]: conn=2 op=1 SRCH attr=* + 
Oct  3 11:32:37 consumer slapd[4353]: srs csn 20071002095656.101415Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093230.355817Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093230.451608Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093230.591655Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093230.780601Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093231.378344Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093232.312808Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093232.633093Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093232.856402Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093233.091760Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093233.266701Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093233.514740Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093233.756128Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093233.925149Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093234.179124Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093234.384376Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093234.601645Z#000000#000#000000 
Oct  3 11:32:37 consumer slapd[4353]: log csn 20071003093234.855526Z#000000#000#000000 

	# ... notice here, where the bogus cookie is generated

Oct  3 11:32:37 consumer slapd[4353]: syncprov_playlog: cookie=rid=002,sid=000,csn= 

	# ... the line below refers to an entry that was not touched
	# by this test, btw !?!

Oct  3 11:32:37 consumer slapd[4353]: Entry cn=xxxxxx,dc=sys-net,dc=it changed by peer, ignored