Re: need help diagnosing seg fault on solaris 2.6



At 01:37 PM 3/12/2003, JR Mayberry wrote:

>I'm running Solaris 2.6 w/ gcc 2.9.5, openldap 2.0.27, openssl 0.9.6g...
>I've tried many other versions of gcc/openldap/openssl in many different
>combinations and get the same result...
>
>I run this command to test:
>truss -f /opt/openldap/bin/ldapsearch -ZZ -h ldap.domain.com -x -W -D
>"cn=readonly,dc=domain,dc=com" -b 'dc=domain,dc=com'
>'(objectClass=posixAccount)'
>
>FYI - It works fine with non-ssl/tls connections.
>
>I get this (on client):
>
>10145:  open("/etc/hosts", O_RDONLY)                    = 3
>10145:  fstat64(3, 0xEFFFCD60)                          = 0
>10145:  ioctl(3, TCGETA, 0xEFFFCCEC)                    Err#25 ENOTTY
>10145:  read(3, " #\n #   I n t e r n e t".., 8192)     = 152
>10145:  read(3, 0x0004354C, 8192)                       = 0
>10145:  llseek(3, 0, SEEK_CUR)                          = 152
>10145:  close(3)                                        = 0
>10145:  so_socket(2, 2, 0, "", 1)                       = 3
>10145:  fcntl(3, F_GETFL, 0xEFFFF298)                   = 2
>10145:  fstat64(3, 0xEFFFF030)                          = 0
>10145:  getsockopt(3, 65535, 8192, 0xEFFFF134, 0xEFFFF12C) = 0
>10145:  fstat64(3, 0xEFFFF030)                          = 0
>10145:  getsockopt(3, 65535, 8192, 0xEFFFF134, 0xEFFFF130) = 0
>10145:  setsockopt(3, 65535, 8192, 0xEFFFF134, 4)       = 0
>10145:  fcntl(3, F_SETFL, 0x00000082)                   = 0
>10145:  connect(3, 0xEFFFF3A0, 16)                      Err#150 EINPROGRESS
>10145:  poll(0xEFFFD218, 1, -1)                         = 1
>10145:  getpeername(3, 0xEFFFF1F8, 0xEFFFF1F4)          = 0
>10145:  fcntl(3, F_GETFL, 0x00000000)                   = 130
>10145:  fstat64(3, 0xEFFFF030)                          = 0
>10145:  getsockopt(3, 65535, 8192, 0xEFFFF134, 0xEFFFF12C) = 0
>10145:  fstat64(3, 0xEFFFF030)                          = 0
>10145:  getsockopt(3, 65535, 8192, 0xEFFFF134, 0xEFFFF130) = 0
>10145:  setsockopt(3, 65535, 8192, 0xEFFFF134, 4)       = 0
>10145:  fcntl(3, F_SETFL, 0x00000002)                   = 0
>10145:  brk(0x00047320)                                 = 0
>10145:  brk(0x00049320)                                 = 0
>10145:  time()                                          = 1047503920
>10145:  write(3, " 01D020101 w188016 1 . 3".., 31)      = 31
>10145:  poll(0xEFFFD4B0, 1, -1)                         = 1
>10145:  read(3, " 0\f020101 x07\n01\004\0".., 16384)    = 14
>10145:  time()                                          = 1047503920
>10145:      Incurred fault #6, FLTBOUNDS  %pc = 0x00625D60
>10145:        siginfo: SIGSEGV SEGV_MAPERR addr=0x00625D60
>10145:      Received signal #11, SIGSEGV [default]
>10145:        siginfo: SIGSEGV SEGV_MAPERR addr=0x00625D60
>10145:          *** process killed ***
>
>
>On openldap server:
>
>daemon: added 6r
>daemon: added 7r
>daemon: select: listen=6 active_threads=0 tvp=NULL
>daemon: select: listen=7 active_threads=0 tvp=NULL
>daemon: activity on 1 descriptors
>daemon: new connection on 8
>daemon: added 8r
>daemon: activity on:
>daemon: select: listen=6 active_threads=0 tvp=NULL
>daemon: select: listen=7 active_threads=0 tvp=NULL
>daemon: activity on 1 descriptors
>daemon: activity on: 8r
>daemon: read activity on 8
>connection_get(8)
>connection_get(8): got connid=0
>connection_read(8): checking for input on id=0
>ber_get_next
>ldap_read: want=1, got=1
>  0000:  30                                                 0
>ldap_read: want=1, got=1
>  0000:  1d                                                 .
>ldap_read: want=29, got=29
>  0000:  02 01 01 77 18 80 16 31  2e 33 2e 36 2e 31 2e 34   ...w...1.3.6.1.4
>  0010:  2e 31 2e 31 34 36 36 2e  32 30 30 33 37            .1.1466.20037
>ber_get_next: tag 0x30 len 29 contents:
>ber_dump: buf=0x000b4410 ptr=0x000b4410 end=0x000b442d len=29
>  0000:  02 01 01 77 18 80 16 31  2e 33 2e 36 2e 31 2e 34   ...w...1.3.6.1.4
>  0010:  2e 31 2e 31 34 36 36 2e  32 30 30 33 37            .1.1466.20037
>ber_get_next
>ldap_read: want=1 error=Resource temporarily unavailable
>ber_get_next on fd 8 failed errno=11 (Resource temporarily unavailable)
>do_extended
>ber_scanf fmt ({a) ber:
>ber_dump: buf=0x000b4410 ptr=0x000b4413 end=0x000b442d len=26
>  0000:  77 18 80 16 31 2e 33 2e  36 2e 31 2e 34 2e 31 2e   w...1.3.6.1.4.1.
>  0010:  31 34 36 36 2e 32 30 30  33 37                     1466.20037
>do_extended: oid=1.3.6.1.4.1.1466.20037
>send_ldap_extended 0: (0)
>send_ldap_response: msgid=1 tag=120 err=0
>ber_flush: 14 bytes to sd 8
>  0000:  30 0c 02 01 01 78 07 0a  01 00 04 00 04 00         0....x........
>ldap_write: want=14, written=14
>  0000:  30 0c 02 01 01 78 07 0a  01 00 04 00 04 00         0....x........
>daemon: select: listen=6 active_threads=1 tvp=NULL
>daemon: select: listen=7 active_threads=1 tvp=NULL
>daemon: activity on 1 descriptors
>daemon: activity on: 8r
>daemon: read activity on 8
>connection_get(8)
>connection_get(8): got connid=0
>connection_read(8): checking for input on id=0
>TLS trace: SSL_accept:before/accept initialization
>tls_read: want=11, got=0
>
>TLS: can't accept.
>connection_read(8): TLS accept error error=-1 id=0, closing
>connection_closing: readying conn=0 sd=8 for close
>connection_close: conn=0 sd=8
>daemon: removing 8
>daemon: select: listen=6 active_threads=0 tvp=NULL
>daemon: select: listen=7 active_threads=0 tvp=NULL
>daemon: activity on 1 descriptors
>daemon: select: listen=6 active_threads=0 tvp=NULL
>daemon: select: listen=7 active_threads=0 tvp=NULL
>
>
>
>I don't even know where to start...

I suggest you start by running the program that crashes (per
your truss output, ldapsearch(1) rather than slapd(8)) under
gdb(1) and, when it crashes, typing 'bt'.  Assuming the stack is not
completely trashed, this will tell you exactly where
the segfault occurred.  In many cases, just poking about
in the debugger will locate the error which caused the
fault.  (Of course, in some cases, the problem may not be
obvious.)
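
As a rough sketch of such a session: the truss output above shows
the client process being killed, so the example below runs
ldapsearch under gdb with the same arguments as your test command.
The file name gdb.cmds is only an illustrative choice, and the
guard around the gdb call is there so the snippet does nothing on
a machine without gdb or without that binary.

```shell
# Put the gdb commands in a file (gdb.cmds is a hypothetical name).
# 'run' starts the program with the given arguments; 'bt' prints the
# backtrace once the program has stopped on the SIGSEGV.
cat > gdb.cmds <<'EOF'
run -ZZ -h ldap.domain.com -x -W -D "cn=readonly,dc=domain,dc=com" -b "dc=domain,dc=com" "(objectClass=posixAccount)"
bt
EOF

# Start gdb on the client binary and execute the commands above.
# gdb stays at its prompt afterwards, so you can keep poking about.
if type gdb >/dev/null 2>&1 && [ -x /opt/openldap/bin/ldapsearch ]; then
    gdb -x gdb.cmds /opt/openldap/bin/ldapsearch
fi
```

The backtrace is usually more informative when the binary and the
libraries it links against were built with debugging symbols (-g).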

>I've built this many times on many platforms and all of them
>work; the server is serving data to many clients fine over TLS.