[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: double close socket in libldap



Test Seven wrote:
   We sometimes experience an invalid close socket call inside ldap
client library. (Why do we care? It signals a bug that may disrupt an
innocent thread if the innocent thread happens to open a socket with the
same descriptor number as the buggy thread is trying to close.) We
tracked down that the same socket is closed twice in rapid succession.

Do you know how could it happen that sb_iod and sb_iod->next share the
same sbiod_sb?

What version of OpenLDAP are you using? Your trace here shows an sbiod_io that doesn't exist in OpenLDAP, so it seems like you've installed a custom handler. There is no mbufFlush in OpenLDAP, and sb_fd_setup is only used with sb_fd_*, not sb_stream_*. How did you configure the sockbuf originally?

Below: callstack, data of LDAP* conn, relevant code, description of the
data and code, workaround.

my_app_1!closesocket
my_app_1!sb_stream_close+0x21
my_app_1!ber_int_sb_close+0x1e
my_app_1!ldap_free_connection+0x28d
my_app_1!ldap_ld_free+0x51


     +0x000 conn             : 0x0523fa80 ldap
        +0x000 ld_sb            : 0x0526b3b8 sockbuf
           +0x000 sb_opts          : lber_options
              +0x000 lbo_valid        : 0n3
              +0x002 lbo_options      : 0
              +0x004 lbo_debug        : 0n0
           +0x008 sb_iod           : 0x05259dd8 sockbuf_io_desc
              +0x000 sbiod_level      : 0n10
              +0x004 sbiod_sb         : 0x0526b3b8 sockbuf
                 +0x000 sb_opts          : lber_options
                    +0x000 lbo_valid        : 0n3
                    +0x002 lbo_options      : 0
                    +0x004 lbo_debug        : 0n0
                 +0x008 sb_iod           : 0x05259dd8 sockbuf_io_desc
                    +0x000 sbiod_level      : 0n10
                    +0x004 sbiod_sb         : 0x0526b3b8 sockbuf
                    +0x008 sbiod_io         : 0x0091ef88 sockbuf_io
                    +0x00c sbiod_pvt        : (null)
                    +0x010 sbiod_next       : 0x0520c380 sockbuf_io_desc
                 +0x00c sb_fd            : 0x1534
                 +0x010 sb_max_incoming  : 0
                 +0x014 sb_trans_needs_read : 0y0
                 +0x014 sb_trans_needs_write : 0y0
              +0x008 sbiod_io         : 0x0091ef88 sockbuf_io
                 +0x000 sbi_setup        : 0x006ace40                 int
my_app_1!sb_fd_setup+0
                 +0x004 sbi_remove       : (null)
                 +0x008 sbi_ctrl         : 0x0052cdd0                 int
my_app_1!mbufFlush+0
                 +0x00c sbi_read         : 0x006acdd0
long my_app_1!sb_stream_read+0
                 +0x010 sbi_write        : 0x006acdf0
long my_app_1!sb_stream_write+0
                 +0x014 sbi_close        : 0x006ace10                 int
my_app_1!sb_stream_close+0
              +0x00c sbiod_pvt        : (null)
              +0x010 sbiod_next       : 0x0520c380 sockbuf_io_desc
                 +0x000 sbiod_level      : 0n10
                 +0x004 sbiod_sb         : 0x0526b3b8 sockbuf
                    +0x000 sb_opts          : lber_options
                    +0x008 sb_iod           : 0x05259dd8 sockbuf_io_desc
                    +0x00c sb_fd            : 0x1534
                    +0x010 sb_max_incoming  : 0
                    +0x014 sb_trans_needs_read : 0y0
                    +0x014 sb_trans_needs_write : 0y0
                 +0x008 sbiod_io         : 0x0091ef88 sockbuf_io
                    +0x000 sbi_setup        :
0x006ace40                    int my_app_1!sb_fd_setup+0
                    +0x004 sbi_remove       : (null)
                    +0x008 sbi_ctrl         :
0x0052cdd0                    int my_app_1!mbufFlush+0
                    +0x00c sbi_read         :
0x006acdd0                    long my_app_1!sb_stream_read+0
                    +0x010 sbi_write        :
0x006acdf0                    long my_app_1!sb_stream_write+0
                    +0x014 sbi_close        :
0x006ace10                    int my_app_1!sb_stream_close+0
                 +0x00c sbiod_pvt        : (null)
                 +0x010 sbiod_next       : (null)
           +0x00c sb_fd            : 0x1534
           +0x010 sb_max_incoming  : 0
           +0x014 sb_trans_needs_read : 0y0
           +0x014 sb_trans_needs_write : 0y0
(etc.)


ber_int_sb_close():
      p = sb->sb_iod;
      while ( p ) {
          if ( p->sbiod_io->sbi_close&&  p->sbiod_io->sbi_close( p )<  0 ) {
              return -1;
          }
          p = p->sbiod_next;
      }

sb_stream_close( Sockbuf_IO_Desc *sbiod ):
      tcp_close( sbiod->sbiod_sb->sb_fd );


So the socket number 0x1534 was closed twice in rapid succession,
because there are 2 linked list nodes, both with the same sb_fd
(actually, both sbiod_sb point to 0x0526b3b8 sockbuf, which is their
"parent" a.k.a. conn->ld_sb, i.e.
conn->ld_sb->sb_iod->sbiod_sb == conn->ld_sb
and also == conn->ld_sb->sb_iod->sbiod_next->sbiod_sb).
The question is how/when the sbiod_next was added.


A dirty workaround would be to detect this special case in
ber_int_sb_close(), but I'd strongly prefer to fix the problem.
          p = sb->sb_iod;
          while ( p ) {
                  if ( p->sbiod_io->sbi_close&&  p->sbiod_io->sbi_close(
p )<  0 ) {
                          return -1;
                  }
+               // workaround for double-closesocket:
+               while (p->sbiod_next
// there's another sbiod
+&&  (p->sbiod_next->sbiod_sb == p->sbiod_sb)     // points to the same
socket
+&&  (p->sbiod_next->sbiod_io == p->sbiod_io)     // same functions used
+&&  (p->sbiod_io->sbi_close == sb_stream_close)  // tcp
+                     ) {
+                          p = p->sbiod_next;    // skip, don't close it
again
+               }
+               // end of workaround
                  p = p->sbiod_next;
          }




--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/