Full_Name: Ryan Tandy Version: RE24 (66d107e) OS: Debian URL: Submission from: (NULL) (24.68.41.160) Submitted by: ryan Test program uploaded to the FTP site: ftp://ftp.openldap.org/incoming/20170427_rtandy_sasltest.c I have built cyrus-sasl and openldap myself from clean sources in order to rule out issues introduced by Debian patches. My environment is Debian unstable with OpenSSL 1.0.2k. cyrus-sasl master at 04dd838 ./autogen.sh CFLAGS="-g -O0" --with-devrandom=/dev/urandom make sudo make install sudo ldconfig export SASL_PATH=/usr/local/lib/sasl2 openldap RE24 at 66d107e ./configure CFLAGS="-g -O0" --disable-backends --enable-mdb --disable-overlays --with-cyrus-sasl --with-tls=openssl make sudo make STRIP= install sudo ldconfig cat > slapd.conf << 'eof' include servers/slapd/schema/core.schema include servers/slapd/schema/cosine.schema sasl-secprops none authz-regexp uid=(.*),cn=.*,cn=auth ldap:///dc=example,dc=com??sub?(uid=$1) database mdb directory . suffix dc=example,dc=com rootdn cn=root,dc=example,dc=com index objectClass eq eof /usr/local/sbin/slapadd -f slapd.conf << eof dn: dc=example,dc=com objectClass: domain dc: example dn: uid=admin,dc=example,dc=com objectClass: account objectClass: simpleSecurityObject uid: admin userPassword: admin eof /usr/local/libexec/slapd -h ldap://:9000 -f slapd.conf -s0 -dstats # allow PLAIN export LDAPSASL_SECPROPS=none # verify authz-regexp. should return uid=admin,dc=example,dc=com ldapwhoami -H ldap://:9000 -Y PLAIN -U admin -w admin # edit sasltest.c and change defines as needed. remove -DSASL to use simple bind cc -g -DSASL sasltest.c -pthread -llber -lldap -lsasl2 -o sasltest ./sasltest On my system, most runs of ./sasltest result in errors like: rc = -6 (Unknown authentication method) sasltest: sasltest.c:70: bind_thread: Assertion `rc == LDAP_SUCCESS' failed. Aborted With no concurrency (THREADS defined to 1), I see no errors. I believe this is due to sasl_client_init being called from multiple threads concurrently. I suspect the global mech list is getting mucked with. As a proof of concept, I patched libldap to call sasl_client_init during its global init: ftp://ftp.openldap.org/incoming/20170427_rtandy_call-sasl_client_init-in-global-init.patch With this change, I see no errors. If I'm right about this bug, I suppose the right fix is to wrap sasl_client_init in a new mutex. I'll post a patch for review soon.
On Fri, Apr 28, 2017 at 04:43:33AM +0000, ryan@openldap.org wrote: >If I'm right about this bug, I suppose the right fix is to wrap sasl_client_init >in a new mutex. ... of course that idea's half-baked, because libldap doesn't *have* mutexes. hmm. I tried cyrus-sasl-2.1.25 and the issue doesn't seem to happen. I'll see if I can isolate the related change.
On Fri, Apr 28, 2017 at 02:52:44PM +0000, ryan@nardis.ca wrote: >I tried cyrus-sasl-2.1.25 and the issue doesn't seem to happen. I'll see >if I can isolate the related change. SASL version was a red herring. I accidentally linked with Debian's libldap (2.4.31) when I tested that. Mea culpa. The culprit does indeed seem to be the ITS#6798 change, as Quanah suggested. Reverting b483ee1a ("Drop ldap_int_sasl_mutex") on RE24 and linking with libldap_r makes my test program work. Since the proof-of-concept patch I posted above also makes my test program work, I'm back to thinking that calling sasl_client_init concurrently is the real bug here. >On Fri, Apr 28, 2017 at 04:43:33AM +0000, ryan@openldap.org wrote: >>If I'm right about this bug, I suppose the right fix is to wrap sasl_client_init >>in a new mutex. > >... of course that idea's half-baked, because libldap doesn't *have* >mutexes. hmm. I'm still not happy with the idea of calling sasl_client_init from ldap_int_initialize, as it does a bunch of work (loading plugins and such) that non-SASL users don't care about. However, I haven't as yet thought of another fix that works for libldap as well as libldap_r. Would love to hear others' thoughts on this. Maybe fixing for libldap_r only (and documenting that) is a good enough compromise?
ryan@nardis.ca wrote: > On Fri, Apr 28, 2017 at 02:52:44PM +0000, ryan@nardis.ca wrote: >> I tried cyrus-sasl-2.1.25 and the issue doesn't seem to happen. I'll see >> if I can isolate the related change. > > SASL version was a red herring. I accidentally linked with Debian's > libldap (2.4.31) when I tested that. Mea culpa. > > The culprit does indeed seem to be the ITS#6798 change, as Quanah > suggested. Reverting b483ee1a ("Drop ldap_int_sasl_mutex") on RE24 and > linking with libldap_r makes my test program work. > > Since the proof-of-concept patch I posted above also makes my test > program work, I'm back to thinking that calling sasl_client_init > concurrently is the real bug here. > >> On Fri, Apr 28, 2017 at 04:43:33AM +0000, ryan@openldap.org wrote: >>> If I'm right about this bug, I suppose the right fix is to wrap sasl_client_init >>> in a new mutex. >> >> ... of course that idea's half-baked, because libldap doesn't *have* >> mutexes. hmm. > > I'm still not happy with the idea of calling sasl_client_init from > ldap_int_initialize, as it does a bunch of work (loading plugins and > such) that non-SASL users don't care about. However, I haven't as yet > thought of another fix that works for libldap as well as libldap_r. > > Would love to hear others' thoughts on this. Maybe fixing for libldap_r > only (and documenting that) is a good enough compromise? Yes, if the problem only exists in concurrent use then it's only relevant to libldap_r. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
changed notes
changed notes changed state Open to Test moved from Incoming to Software Bugs
An Ubuntu user confirmed that this change solved their issue with slapd and multiple syncrepl clients using GSSAPI. https://bugs.launchpad.net/ubuntu/+source/openldap/+bug/1688575/comments/6
introduced by ITS#6798 fixed in master fixed in RE24 (2.4.45)
changed notes changed state Test to Closed