RE: race condition in -lldap/openssl??



This is most likely a bug in OpenSSL 0.9.6b. Try again with OpenSSL 0.9.6c
before chasing this any further; the CHANGES file for 0.9.6c specifically
mentions some race conditions that were fixed since the 'b' version.
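
One way to check which version a client is really running: OpenSSL encodes
its version as a number (OPENSSL_VERSION_NUMBER in the headers, SSLeay() at
run time) laid out as 0xMNNFFPPS. A minimal sketch, assuming that layout;
the macro and helper names below are made up for illustration and are not
part of OpenSSL:

```c
/* Decode the 0xMNNFFPPS OpenSSL version layout (M major, NN minor,
 * FF fix, PP patch, S status). Names here are hypothetical helpers,
 * not OpenSSL API. */

#define OPENSSL_096B_RELEASE 0x0090602fUL  /* 0.9.6b release */
#define OPENSSL_096C_RELEASE 0x0090603fUL  /* 0.9.6c release */

/* Patch letter: PP == 0 means no letter ('\0'), 1 -> 'a', 2 -> 'b', ... */
char openssl_patch_letter(unsigned long v)
{
    unsigned int patch = (unsigned int)((v >> 4) & 0xff);
    return patch == 0 ? '\0' : (char)('a' + patch - 1);
}

/* Nonzero if the version number is the 0.9.6c release or anything later. */
int at_least_096c(unsigned long v)
{
    return v >= OPENSSL_096C_RELEASE;
}
```

Feeding it SSLeay() from a program linked against the same shared library
as pam_ldap is the more telling test, since the headers and the library
actually loaded can differ.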

  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

> -----Original Message-----
> From: owner-openldap-devel@OpenLDAP.org
> [mailto:owner-openldap-devel@OpenLDAP.org]On Behalf Of Dax Kelson
> Sent: Thursday, February 07, 2002 1:50 AM
> To: Kurt D. Zeilenga
> Cc: nalin@redhat.com; openldap-devel@OpenLDAP.org; lukeh@PADL.COM
> Subject: race condition in -lldap/openssl??
>
>
> This is unrelated to my previous NetBSD problem (now fixed, my error).
>
> Executive summary:
>
> I'm having a problem where two RHL7.2 LDAP clients out of many don't
> authenticate against an OpenLDAP server.
>
> In openldap-2.0.21/libraries/libldap/tls.c line ~625
>
> err = SSL_connect( ssl );
>
> If the failing client is "slightly bogged down by ltracing the sshd
> process", then err == 1 (success); otherwise err == 0 (failure), and
> SSL_get_error returns SSL_ERROR_SYSCALL.
>
> The man page says:
>
>  SSL_ERROR_SYSCALL
>            Some I/O error occurred.  The OpenSSL error queue may contain
>            more information on the error.  If the error queue is empty
>            (i.e. ERR_get_error() returns 0), ret can be used to find out
>            more about the error: If ret == 0, an EOF was observed that
>            violates the protocol.
>
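
In other words, SSL_ERROR_SYSCALL covers several cases that have to be told
apart. The quoted text covers ret == 0; the same man page also says that
ret == -1 means the underlying BIO reported an I/O error (errno on Unix).
A rough sketch of that decision in plain C, with no OpenSSL calls so it
stands alone. The enum and function names are hypothetical, and the caller
is assumed to pass in the value of ERR_get_error() and the SSL_connect()
return code:

```c
/* Possible explanations for SSL_ERROR_SYSCALL, per the SSL_get_error
 * man page. All names here are illustrative, not OpenSSL API. */
enum syscall_cause {
    CAUSE_IN_ERROR_QUEUE,  /* details are in the OpenSSL error queue */
    CAUSE_PROTOCOL_EOF,    /* queue empty, ret == 0: unexpected EOF */
    CAUSE_IO_ERROR,        /* queue empty, ret == -1: consult errno */
    CAUSE_UNKNOWN          /* anything the man page does not describe */
};

/* queued_err: result of ERR_get_error(); ret: result of SSL_connect(). */
enum syscall_cause classify_ssl_syscall(unsigned long queued_err, int ret)
{
    if (queued_err != 0)
        return CAUSE_IN_ERROR_QUEUE;
    if (ret == 0)
        return CAUSE_PROTOCOL_EOF;
    if (ret == -1)
        return CAUSE_IO_ERROR;
    return CAUSE_UNKNOWN;
}
```

For the failing case reported here (ret == 0), this would mean an EOF that
violates the protocol, but only if the error queue is empty, which the
syslog output does not show.
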
> The box is an SMP dual Pentium III, running Red Hat Linux 7.2 fully
> updated with all official errata, plus the latest pam/nss_ldap, OpenLDAP
> 2.0.21, OpenSSL 0.9.6b.  I'm also having what appears to be the same
> problem on another box, which is a single-CPU AMD 1700+.
>
> The Red Hat OpenSSL RPM was configured/built with:
>
> ./config no-asm 386 no-idea no-mdc2 no-rc5 shared
>
> The OpenLDAP RPM configured/built with:
>
> CPPFLAGS="-I/usr/kerberos/include"; export CPPFLAGS
> CFLAGS="$CPPFLAGS $RPM_OPT_FLAGS -D_REENTRANT -DHAVE_KERBEROS_V -fPIC";
> export CFLAGS
>
> %configure \
>         --with-slapd --with-slurpd --without-ldapd \
>         --with-threads=posix --enable-shared --enable-static \
>         --enable-ldbm --with-ldbm-api=gdbm \
>         --enable-passwd \
>         --enable-shell \
>         \
>         --enable-local --enable-cldap --disable-rlookups \
>         \
>         --with-kerberos=k5only \
>         --with-tls \
>         --with-cyrus-sasl \
>         \
>         --enable-wrappers \
>         \
>         --enable-cleartext \
>         --enable-crypt \
>         --enable-kpasswd \
>         --enable-spasswd \
>         \
>         --libexecdir=%{_sbindir} \
>         --localstatedir=/%{_var}/run
>
> Details:
>
> I've tracked it down closely.  I'm now officially "over my head" (tm).
> Keep in mind, I'm just a Perl guy.
>
> pam_ldap.so calls ldap_start_tls_s. I tracked that down to:
>
> openldap-2.0.21/libraries/libldap/tls.c
>
> Eventually the ldap_int_tls_connect function is called.
>
> The important lines from this function are:
>
> ssl = alloc_handle( ctx );
> err = SSL_connect( ssl );
>
> Then the existing code does:
>
> if ( err <= 0 ) {
> 	blah
>
>
> I've modified it by adding this code right above it:
>
> if ( err == 0 ) {
>         syslog (LOG_ERR, "SSL_connect returned 0\n");
>         switch (SSL_get_error(ssl, err)) {
>         case SSL_ERROR_NONE:
>                 syslog (LOG_ERR, "SSL_ERROR_NONE\n");
>                 break;
>         case SSL_ERROR_ZERO_RETURN:
>                 syslog (LOG_ERR, "SSL_ERROR_ZERO_RETURN\n");
>                 break;
>         case SSL_ERROR_WANT_READ:
>                 syslog (LOG_ERR, "SSL_ERROR_WANT_READ\n");
>                 break;
>         case SSL_ERROR_WANT_WRITE:
>                 syslog (LOG_ERR, "SSL_ERROR_WANT_WRITE\n");
>                 break;
>         case SSL_ERROR_WANT_CONNECT:
>                 syslog (LOG_ERR, "SSL_ERROR_WANT_CONNECT\n");
>                 break;
>         case SSL_ERROR_WANT_X509_LOOKUP:
>                 syslog (LOG_ERR, "SSL_ERROR_WANT_X509_LOOKUP\n");
>                 break;
>         case SSL_ERROR_SYSCALL:
>                 syslog (LOG_ERR, "SSL_ERROR_SYSCALL\n");
>                 break;
>         case SSL_ERROR_SSL:
>                 syslog (LOG_ERR, "SSL_ERROR_SSL\n");
>                 break;
>         default:
>                 syslog (LOG_ERR, "Error in reading SSL handle\n");
>         }
> }
>
> SSH attempt (successful, BTW) into the machine slightly bogged down:
>
> Feb  7 02:04:33 mooru sshd[17186]: SSL_connect returned 1
>
> SSH attempt into the machine not bogged down:
>
> Feb  7 02:12:18 mooru sshd[19396]: SSL_connect returned 0
> Feb  7 02:12:18 mooru sshd[19396]: SSL_ERROR_SYSCALL
> Feb  7 02:12:18 mooru sshd[19396]: TLS: can't connect. (other debug I added)
> Feb  7 02:12:18 mooru sshd[19396]: pam_ldap: ldap_starttls_s: Connect error
>
> At this point, I am at a loss as to how to debug or diagnose this any
> further. I'm more than happy to test out patches though.
>
> Dax Kelson
>