
Re: (ITS#6639) syncrepl failing using SASL/GSSAPI




--On Thursday, September 09, 2010 10:43:06 PM -0700 Howard Chu <hyc@symas.com> wrote:

> whm@stanford.edu wrote:
>> --On Friday, September 03, 2010 01:23:17 AM -0700 Bill MacAllister<whm@stanford.edu>  wrote:
>>
>> The problem with the database was only coincidental.  Restoring the database
>> got the failing replica past the problem replication event.
>>
>> In the replica pool of 6 servers we have seen the problem on three of the
>> servers.  In thinking about this more it is unlikely that it is a slave
>> problem since the slaves have been in use for about 6 weeks and we did
>> not see the problem.  Only when we changed the master to 2.4.23 did we
>> see the problem.  I have captured a master debug log of the problem
>> event.  It is at http://www.stanford.edu/~whm/files/master-debug.txt.
>>
>> Bill
>>
> Please try with this patch:
>
> Index: sasl.c
> ===================================================================
> RCS file: /repo/OpenLDAP/pkg/ldap/libraries/libldap/sasl.c,v
> retrieving revision 1.79
> diff -u -r1.79 sasl.c
> --- sasl.c	13 Apr 2010 20:17:56 -0000	1.79
> +++ sasl.c	10 Sep 2010 05:42:22 -0000
> @@ -733,8 +733,9 @@
>   		return ret;
>   	} else if ( p->buf_out.buf_ptr != p->buf_out.buf_end ) {
>   		/* partial write? pretend nothing got written */
> -		len2 = 0;
>   		p->flags |= LDAP_PVT_SASL_PARTIAL_WRITE;
> +		sock_errset(EAGAIN);
> +		len2 = -1;
>   	}
>
>   	/* return number of bytes encoded, not written, to ensure
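
For readers following the patch above: the change makes a partial write report -1 with EAGAIN instead of a length of 0, so the caller retries rather than treating the data as handled. The sketch below is a simplified, self-contained model of that pattern, not the libldap code. The names fake_sock, fake_write, out_buf and layered_write are hypothetical, and the "encoding" step is assumed to be a plain copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a non-blocking socket that accepts at most
 * `capacity` bytes per call. Not part of libldap. */
struct fake_sock {
    size_t capacity;   /* bytes accepted per write call */
    char   sink[256];  /* everything accepted so far */
    size_t sink_len;
};

static long fake_write(struct fake_sock *s, const char *buf, size_t len)
{
    size_t n = len < s->capacity ? len : s->capacity;
    size_t room = sizeof(s->sink) - s->sink_len;
    if (n > room)
        n = room;
    if (n == 0 && len > 0) {       /* nothing can be accepted right now */
        errno = EAGAIN;
        return -1;
    }
    memcpy(s->sink + s->sink_len, buf, n);
    s->sink_len += n;
    return (long)n;
}

/* Pending encoded output, loosely modelled on buf_out in the patch. */
struct out_buf {
    char   data[256];
    size_t ptr;        /* next byte to send */
    size_t end;        /* one past the last pending byte */
};

/* Write `len` bytes through the buffering layer.
 * Returns `len` once every pending byte has reached the socket.
 * Returns -1 with errno == EAGAIN while bytes are still pending, so the
 * caller retries with the same data instead of assuming it was sent. */
static long layered_write(struct fake_sock *s, struct out_buf *b,
                          const char *buf, size_t len)
{
    if (b->ptr == b->end) {
        /* Nothing pending: "encode" the new data (assumed: plain copy). */
        if (len > sizeof(b->data)) {
            errno = EINVAL;
            return -1;
        }
        memcpy(b->data, buf, len);
        b->ptr = 0;
        b->end = len;
    }
    /* Otherwise a previous call left a partial write; keep flushing it. */

    long n = fake_write(s, b->data + b->ptr, b->end - b->ptr);
    if (n < 0)
        return -1;                 /* errno already set by fake_write */
    b->ptr += (size_t)n;

    if (b->ptr != b->end) {
        /* Partial write: signal "try again" rather than returning 0. */
        errno = EAGAIN;
        return -1;
    }
    return (long)len;
}
```

In this model a caller loops on layered_write() with the same buffer while it returns -1 with errno set to EAGAIN, and moves on only when it returns the full length.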

Howard,

The patched packages were installed last night on the production OpenLDAP
master with two of the replicas in the failing state.  Once the patched
slapd was started the two problem replicas quickly caught up and everything
looks good now.

Thanks again for your help.

Bill

-- 

Bill MacAllister
Infrastructure Delivery Group, Stanford University