[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: SSL strangeness



On Thu, Oct 29, 2009 at 9:56 AM, Victor Mataré
<matare@lih.rwth-aachen.de> wrote:
> Hello,
>
> I'm seeing some really weird behaviour when using ldaps:// on an
> openldap-2.3.43 server. It's a Gentoo Linux box with glibc-2.9_p20081201-r2
> and openssl-0.9.8k. I have already recompiled the entire system with
> gcc-4.3.4 (twice to be sure), with no errors. First of all, ldapsearch -H
> ldaps://bussard.lih.rwth-aachen.de just hangs.
> The strange part: when I strace -f slapd, from the second retry on, it
> works.
>
> So I went on by debugging with openssl s_client, which exhibits just the
> same behaviour. However it reveals that slapd falls silent in the middle of
> sending the certificates.
>
> So if I do:
>
> $ openssl s_client -connect bussard.lih.rwth-aachen.de:636 -state -status
> -CAfile /etc/openldap/ssl/rwth-dfn-tcom.crt
> CONNECTED(00000003)
> SSL_connect:before/connect initialization
> SSL_connect:SSLv2/v3 write client hello A
> OCSP response: no response sent
> SSL_connect:SSLv3 read server hello A
> depth=3 /C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche
> Telekom Root CA 2
> verify return:1
> depth=2 /C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
> verify return:1
> depth=1 /C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
> verify return:1
> depth=0 /C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und
> Hydrogeologie/CN=ldap.lih.rwth-aachen.de
> verify return:1
> SSL_connect:SSLv3 read server certificate A
> ^C
>
> Now after I've done "strace -f -p `pidof slapd`" on the server, I get the
> same as above once. Then when I try a second time:
>
> $ openssl s_client -connect bussard.lih.rwth-aachen.de:636 -state -CAfile
> /etc/openldap/ssl/rwth-dfn-tcom.crt
> CONNECTED(00000003)
> SSL_connect:before/connect initialization
> SSL_connect:SSLv2/v3 write client hello A
> SSL_connect:SSLv3 read server hello A
> depth=3 /C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche
> Telekom Root CA 2
> verify return:1
> depth=2 /C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
> verify return:1
> depth=1 /C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
> verify return:1
> depth=0 /C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und
> Hydrogeologie/CN=ldap.lih.rwth-aachen.de
> verify return:1
> SSL_connect:SSLv3 read server certificate A
> SSL_connect:SSLv3 read server certificate request A
> SSL_connect:SSLv3 read server done A
> SSL_connect:SSLv3 write client certificate A
> SSL_connect:SSLv3 write client key exchange A
> SSL_connect:SSLv3 write change cipher spec A
> SSL_connect:SSLv3 write finished A
> SSL_connect:SSLv3 flush data
> SSL_connect:SSLv3 read finished A
> ---
> Certificate chain
>  0 s:/C=DE/O=RWTH Aachen/OU=Lehrstuhl fuer Ingenieur- und
> Hydrogeologie/CN=ldap.lih.rwth-aachen.de
>   i:/C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
>  1 s:/C=DE/O=RWTH Aachen/CN=RWTH Aachen CA/emailAddress=ca@rwth-aachen.de
>   i:/C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
>  2 s:/C=DE/O=DFN-Verein/OU=DFN-PKI/CN=DFN-Verein PCA Global - G01
>   i:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche
> Telekom Root CA 2
>  3 s:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche
> Telekom Root CA 2
>   i:/C=DE/O=Deutsche Telekom AG/OU=T-TeleSec Trust Center/CN=Deutsche
> Telekom Root CA 2
> ---
> Server certificate
> -----BEGIN CERTIFICATE-----
>
> ... (lotsa stuff) ...
>
> /C=ES/ST=Barcelona/L=Barcelona/O=IPS Internet publishing Services
> s.l./O=ips@mail.ips.es C.I.F.  B-60929452/OU=IPS CA Timestamping
> Certification Authority/CN=IPS CA Timestamping Certification
> Authority/emailAddress=ips@mail.ips.es
> ---
> SSL handshake has read 26305 bytes and written 480 bytes
> ---
> New, TLSv1/SSLv3, Cipher is AES256-SHA
> Server public key is 2048 bit
> Compression: NONE
> Expansion: NONE
> SSL-Session:
>    Protocol  : TLSv1
>    Cipher    : AES256-SHA
>    Session-ID: 99...CB400
>    Session-ID-ctx:
>    Master-Key: 152...A2D
>    Key-Arg   : None
>    Start Time: 1256667603
>    Timeout   : 300 (sec)
>    Verify return code: 0 (ok)
> ---
>
> This time the SSL handshake works, just because I'm strace'ing slapd? This
> looks like some really weird race condition. It's driving me crazy. Should I
> talk to the openssl people about this? But when I make an openssl testbed
> with openssl s_server and s_client, everything works fine, so it shouldn't
> be an openssl issue.
>
> Oh and maybe you'd like to see the strace output. This is what it looks like
> when the SSL client hangs:
>
> # strace -f -p `pidof slapd`
> Process 3339 attached with 3 threads - interrupt to quit
> [pid  3339] futex(0x8977560, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
> [pid  3338] time(NULL)                  = 1256668096
> [pid  3338] epoll_wait(6,  <unfinished ...>
> [pid  3328] futex(0xad15ebd8, FUTEX_WAIT, 3338, NULL <unfinished ...>
> [pid  3338] <... epoll_wait resumed> {{EPOLLIN, {u32=143663400,
> u64=143663400}}}, 1024, 1798000) = 1
> [pid  3338] accept(7, {sa_family=AF_INET, sin_port=htons(37192),
> sin_addr=inet_addr("137.226.164.160")}, [16]) = 14
> [pid  3338] setsockopt(14, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
> [pid  3338] setsockopt(14, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> [pid  3338] open("/etc/hosts.allow", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> [pid  3338] open("/etc/hosts.deny", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] fcntl64(14, F_GETFL)        = 0x2 (flags O_RDWR)
> [pid  3338] fcntl64(14, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> [pid  3338] epoll_ctl(6, EPOLL_CTL_ADD, 14, {EPOLLIN, {u32=143700808,
> u64=143700808}}) = 0
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] epoll_wait(6, {{EPOLLIN, {u32=143700808, u64=143700808}}}, 1024,
> 1795000) = 1
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] read(14, "\200\214\1\3\1\0c\0\0\0 "..., 11) = 11
> [pid  3338] read(14,
> "\0\0009\0\0008\0\0005\0\0\210\0\0\207\0\0\204\0\0\26\0\0\23\0\0\n\7\0\300\0\0003"...,
> 131) = 131
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] write(14,
> "\26\3\1\0J\2\0\0F\3\1J\347;\303\310\247w\24<\206!\334\3345\304\327\321\344\36FG\37"...,
> 4096) = 4096
> [pid  3338] write(14,
> "\"0\r\6\t*\206H\206\367\r\1\1\1\5\0\3\202\1\17\0000\202\1\n\2\202\1\1\0\253\v\243"...,
> 4096) = 4096
> [pid  3338] write(14,
> "ootCA1\0;091\v0\t\6\3U\4\6\23\2FI1\0170\r\6\3U\4\n\23"..., 12928) = 6288
> [pid  3338] write(14, "go1,0*\6\3U\4\v\23#Wells Fargo Certific"..., 6640) =
> -1 EAGAIN (Resource temporarily unavailable)
> [pid  3338] write(14, "go1,0*\6\3U\4\v\23#Wells Fargo Certific"..., 6640) =
> 6640
> [pid  3338] write(14, "\26\3\1\24\1ck Halozatbiztonsagi Kft.1\0320"...,
> 5126) = 2048
> [pid  3338] write(14, "\4\n\23\37Software in the Public Intere"..., 3078) =
> -1 EAGAIN (Resource temporarily unavailable)
> [pid  3338] time(NULL)                  = 1256668099
> [pid  3338] epoll_wait(6,
>
> (Strg-C on the client...)
>
> {{EPOLLIN, {u32=143700808, u64=143700808}}}, 1024, 1795000) = 1
> [pid  3338] time(NULL)                  = 1256668102
> [pid  3338] time(NULL)                  = 1256668102
> [pid  3338] write(14, "\4\n\23\37Software in the Public Intere"..., 3078) =
> 3078
> [pid  3338] read(14, ""..., 5)          = 0
> [pid  3338] epoll_ctl(6, EPOLL_CTL_MOD, 14, {0, {u32=143700808,
> u64=143700808}}) = 0
> [pid  3338] write(5, "0"..., 1)         = 1
> [pid  3338] epoll_ctl(6, EPOLL_CTL_DEL, 14, {0, {u32=143700808,
> u64=143700808}}) = 0
> [pid  3338] shutdown(14, 2 /* send and receive */) = -1 ENOTCONN (Transport
> endpoint is not connected)
> [pid  3338] close(14)                   = 0
> [pid  3338] time(NULL)                  = 1256668102
> [pid  3338] epoll_wait(6, {{EPOLLIN, {u32=143700768, u64=143700768}}}, 1024,
> 1792000) = 1
> [pid  3338] read(4, "0"..., 8192)       = 1
> [pid  3338] time(NULL)                  = 1256668102
> [pid  3338] epoll_wait(6,
>
> and that's it. Now when I try for the second time (now I get the server cert
> alright), it looks like this:
>
> # strace -f -p `pidof slapd`
> Process 3354 attached with 4 threads - interrupt to quit
> [pid  3339] futex(0x8977560, FUTEX_WAIT_PRIVATE, 42, NULL <unfinished ...>
> [pid  3354] futex(0x8977560, FUTEX_WAIT_PRIVATE, 42, NULL <unfinished ...>
> [pid  3338] time(NULL)                  = 1256668222
> [pid  3338] epoll_wait(6,  <unfinished ...>
> [pid  3328] futex(0xad15ebd8, FUTEX_WAIT, 3338, NULL <unfinished ...>
> [pid  3338] <... epoll_wait resumed> {{EPOLLIN, {u32=143663400,
> u64=143663400}}}, 1024, 1672000) = 1
> [pid  3338] accept(7, {sa_family=AF_INET, sin_port=htons(37195),
> sin_addr=inet_addr("137.226.164.160")}, [16]) = 15
> [pid  3338] setsockopt(15, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
> [pid  3338] setsockopt(15, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> [pid  3338] open("/etc/hosts.allow", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> [pid  3338] open("/etc/hosts.deny", O_RDONLY) = -1 ENOENT (No such file or
> directory)
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] fcntl64(15, F_GETFL)        = 0x2 (flags O_RDWR)
> [pid  3338] fcntl64(15, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> [pid  3338] epoll_ctl(6, EPOLL_CTL_ADD, 15, {EPOLLIN, {u32=143700812,
> u64=143700812}}) = 0
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] epoll_wait(6, {{EPOLLIN, {u32=143700812, u64=143700812}}}, 1024,
> 1670000) = 1
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] read(15, "\200\214\1\3\1\0c\0\0\0 "..., 11) = 11
> [pid  3338] read(15,
> "\0\0009\0\0008\0\0005\0\0\210\0\0\207\0\0\204\0\0\26\0\0\23\0\0\n\7\0\300\0\0003"...,
> 131) = 131
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] write(15,
> "\26\3\1\0J\2\0\0F\3\1J\347<@5\352%\335\336\264Q\2263\346\303\335\t\2\34\241\372Q"...,
> 4096) = 4096
> [pid  3338] write(15,
> "\"0\r\6\t*\206H\206\367\r\1\1\1\5\0\3\202\1\17\0000\202\1\n\2\202\1\1\0\253\v\243"...,
> 4096) = 4096
> [pid  3338] write(15,
> "ootCA1\0;091\v0\t\6\3U\4\6\23\2FI1\0170\r\6\3U\4\n\23"..., 12928) = 11584
> [pid  3338] write(15,
> "\6\3U\4\6\23\2AU1\0230\21\6\3U\4\10\23\nQueensland1\0210"..., 4096) = 2896
> [pid  3338] write(15, "ty1$0\"\6\3U\4\n\23\33Digital Signature Tr"..., 1200)
> = 1200
> [pid  3338] write(15, "\26\31personal-basic@thawte.com\0\3210\201\3161"...,
> 2374) = 2374
> [pid  3338] read(15, 0x8a08398, 5)      = -1 EAGAIN (Resource temporarily
> unavailable)
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] epoll_wait(6, {{EPOLLIN, {u32=143700812, u64=143700812}}}, 1024,
> 1670000) = 1
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] read(15, "\26\3\1\0\7"..., 5) = 5
> [pid  3338] read(15, "\v\0\0\3\0\0\0"..., 7) = 7
> [pid  3338] read(15, "\26\3\1\1\6"..., 5) = 5
> [pid  3338] read(15,
> "\20\0\1\2\1\0~\246\237\364\202\0\217\345#|\241\273k\34\251\277\224X\346\274\361\300\373\1\24\226\334"...,
> 262) = 262
> [pid  3338] read(15, "\24\3\1\0\1"..., 5) = 5
> [pid  3338] read(15, "\1"..., 1)        = 1
> [pid  3338] read(15, "\26\3\1\0000"..., 5) = 5
> [pid  3338] read(15,
> "\36\337\371\314\260\5\246\233\17\31^P\3027\227\333\257\374\221F\\\20?1\316\207\201BJQ\337\264\224"...,
> 48) = 48
> [pid  3338] write(15,
> "\24\3\1\0\1\1\26\3\1\0000\356\336\3673\3034w\344\3364e\264\10dP\302\205\3058\357\272c"...,
> 59) = 59
> [pid  3338] time(NULL)                  = 1256668224
> [pid  3338] epoll_wait(6,
>
>
> Hope that someone can make sense of this. Just to be clear: ldapsearch
> behaves the same way as described above for openssl s_client.
>
> Thank you very much for even reading so far.
>
>
>

I am sure components of openssh-lpk only want to use TLS over ldap. I
read a PDF where they suggest ldaps is 'deprecated'. I have found that
each component sudo, openssh-ldp, nsswitch seems to parse
/etc/ldap.conf a different way.
Some like ldaps, some don't. some like uri, some don't. I see it
getting better but its not clean to the point everything works every
which way.

Everyone(sudo,openssh-lpk) just has their own scheme for parsing
ldap.conf and the things they accept.

My files look like this to deal with these issues.
---------------------------------------------------------
# @(#)$Id: ldap.conf,v 1.38 2006/05/15 08:13:31 luk
#
# will fail over in software but not load balance
host ldapslavelb.ops.ec.com
host ldapslave1.ops.ec.com
host ldapslave2.ops.ec.com
uri ldap://ldapslavelb.ops.ec.com
uri ldap://ldapslave1.ops.ec.com
uri ldap://ldapslave2.ops.ec.com
ssl start_tls
port 389

timeout 5
bind_timeout 5
bind_policy soft

----------------------------------------------------------

I put my load balancer first,
host ldapslavelb.ops.ec.com
and individual hosts second,
host ldapslave1.ops.ec.com

I include host and uri, even though its redundant.

So does anyone know the truth about this? Is ldaps considered
deprecated in some circles?