Re: (ITS#6798) Mutex starvation on two-level referral for SASL connection
- To: openldap-its@OpenLDAP.org
- Subject: Re: (ITS#6798) Mutex starvation on two-level referral for SASL connection
- From: hyc@symas.com
- Date: Tue, 1 Feb 2011 21:32:34 GMT
- Auto-submitted: auto-generated (OpenLDAP-ITS)
Timo Aaltonen wrote:
> On Thu, 20 Jan 2011, Howard Chu wrote:
>
>> timo.aaltonen@aalto.fi wrote:
>>> Hi
>>>
>>> Here's some information that Stephen asked would be of use. There is
>>> one forest, one domain, but three sites in the layout. The functional
>>> level of the forest and the domain is W2008, but the servers have 2008R2.
>>>
>>> And the full backtrace of the hung process:
>>
>>> #3 0x00007f8f652f3bcb in ldap_pvt_thread_mutex_lock
>>> (mutex=0x7f8f6553fc80)
>>> at /tmp/buildd/openldap-2.4.23/libraries/libldap_r/thr_posix.c:296
>>> No locals.
>>> #4 0x00007f8f653010bf in ldap_sasl_interactive_bind_s (ld=0x2117c20,
>>> dn=0x0,
>>> mechs=0x210d530 "GSSAPI", serverControls=0x0, clientControls=0x0,
>>> flags=2,
>>> interact=0x7f8f61405120<sdap_sasl_interact>, defaults=0x2124a50) at
>>> sasl.c:426
>>> rc = -1921681294
>>> smechs = 0x0
>>
>> This particular mutex seems kind of bogus to me; the code is from rev 1.31 in
>> June 2001. Perhaps back then it was unsafe to have multiple SASL operations
>> outstanding at once; I would expect that was only an issue in the Cyrus 1.5
>> days and it should be safe now with Cyrus 2.x. We should probably just delete
>> this mutex.
>
> Ok, so by doing this:
>
> --- openldap-2.4.23.orig/libraries/libldap/sasl.c
> +++ openldap-2.4.23/libraries/libldap/sasl.c
> @@ -421,10 +421,11 @@
> {
> int rc;
> char *smechs = NULL;
> -
> +/*
> #if defined( LDAP_R_COMPILE )&& defined( HAVE_CYRUS_SASL )
> ldap_pvt_thread_mutex_lock(&ldap_int_sasl_mutex );
> #endif
> +*/
> #ifdef LDAP_CONNECTIONLESS
> if( LDAP_IS_UDP(ld) ) {
> /* Just force it to simple bind, silly to make the user
>
> --
>
> .. the process doesn't hang anymore. But it still doesn't do what it's
> supposed to, but that could be a bug in SSSD. I'll investigate further.
>
> Thanks!
>
As I noted in a previous followup, it's not clear to me that the Cyrus SASL
library is actually safe to use without that mutex. Also, going through the
backtraces you provided, I see the real issue is that two different requests
were active at the same time: one active request that triggered a referral,
and a second, unrelated request. You would also have avoided this issue by
waiting for the request that triggered the referrals to complete before
issuing any other requests.
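
To illustrate the "wait for the outstanding request to complete" approach, here
is a minimal sketch of a caller-side queue that keeps at most one request in
flight and defers the rest. It is not libldap or SSSD code: request_queue,
rq_submit, rq_complete and rq_count_start are hypothetical names, and it
assumes an event-driven caller that can tell when a request (including any
referral chasing) has finished.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8   /* arbitrary limit for this sketch */

/* Hypothetical callback that starts one request (e.g. issues an LDAP op). */
typedef void (*request_fn)(void *arg);

typedef struct {
    bool       busy;               /* true while a request is outstanding */
    request_fn fn[MAX_PENDING];    /* deferred requests, FIFO ring buffer */
    void      *arg[MAX_PENDING];
    size_t     head, count;
} request_queue;

void rq_init(request_queue *q)
{
    q->busy = false;
    q->head = 0;
    q->count = 0;
}

/* Start the request now if nothing is outstanding, otherwise defer it.
 * Returns false only when the deferred queue is full. */
bool rq_submit(request_queue *q, request_fn fn, void *arg)
{
    if (!q->busy) {
        q->busy = true;
        fn(arg);
        return true;
    }
    if (q->count == MAX_PENDING)
        return false;
    size_t tail = (q->head + q->count) % MAX_PENDING;
    q->fn[tail] = fn;
    q->arg[tail] = arg;
    q->count++;
    return true;
}

/* Call once the outstanding request has fully completed. Starts the next
 * deferred request, if any; otherwise marks the queue idle. */
void rq_complete(request_queue *q)
{
    assert(q->busy);
    if (q->count == 0) {
        q->busy = false;
        return;
    }
    request_fn fn = q->fn[q->head];
    void *arg = q->arg[q->head];
    q->head = (q->head + 1) % MAX_PENDING;
    q->count--;
    fn(arg);    /* busy stays true: the next request is now outstanding */
}

/* Stand-in starter for demonstration: counts how often it was started. */
void rq_count_start(void *arg)
{
    ++*(int *)arg;
}
```

A caller would route every request through rq_submit() and call rq_complete()
from its completion handler, so the unrelated request is only started after
the one that triggered the referrals has finished.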
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/