
Re: SOCKET handling and Netscape SDK bug(?)



"Kurt D. Zeilenga" wrote:

> At 12:13 PM 7/25/00 +0200, Mikael Grehn wrote:
> >Howdy!
> >
> >Two problems/questions, the first one is about SOCKET handling in
> >openLDAP (daemon.c code), the second one is about a possible bug in
> >Netscape SDK4.0.
> >
> >1.
> >    I tried to connect my Lotus Notes client to my LDAP server
> >(openLdap2.0-alpha3 on NT). Sometimes the communication works great,
> >binding, searching etc. BUT sometimes the server shuts down because the
> >main listener select-function returns error code 10038. I have not
> >gotten this error when connecting/working with other clients.
>
> Can you repeat the problem using code as provided?  Your version
> appears not to be functionally equivalent to that in the HEAD
> branch or the OPENLDAP_REL_ENG_2 branch.

Well, I'm using the experimental code in openLdap2.0-alpha3 on NT. The client
is Lotus Notes 5.0.1b (released Nov. 1999).

I'll try to explain the situation in more detail:
(As you know) openLDAP consists of 2 "main" threads: one thread waiting
for the shutdown order, and the listener thread that listens for SOCKET
connections from clients (and creates a new thread for each newly accepted
client). The listener thread is started in the thread function
"slapd_daemon_task(...)" in /servers/slapd/daemon.c. The function contains a
while loop for continuous listening for incoming connections. In the while
loop a select function (socket) waits for the socket state to change. When
that happens (and no error occurs) the TCP/IP stream read from the socket is
decoded, using BER/DER encoding, into an LDAP request or response.
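
For anyone not familiar with the pattern, here is a minimal sketch of the
select-based wait described above. It is not the slapd code: wait_readable is
a hypothetical helper, and it assumes POSIX select() and file descriptors
rather than Winsock, only so that the example stays small and self-contained.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not part of slapd): wait until fd becomes readable
 * or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
static int
wait_readable(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int ns;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* On POSIX the first argument is the highest descriptor plus one. */
    ns = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ns > 0)
        return FD_ISSET(fd, &readfds) ? 1 : 0;
    return ns; /* 0 = timeout, -1 = error */
}
```

The real listener loop watches many descriptors at once and also passes a
write set; the sketch keeps a single descriptor to show the three possible
outcomes of select (ready, timeout, error).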

But on some occasions the select function returns SOCKET_ERROR and slapd is
terminated (slapd_shutdown = -1). This could be because the "tail" of a
previous SOCKET is (again) read/noticed by the select function, but since it
isn't the start of a new socket the select function returns an error. Checking
the error with WSAGetLastError() gives 10038, which means "WSAENOTSOCK".
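
For reference, the Winsock documentation describes WSAENOTSOCK from select as
a descriptor set containing an entry that is not a socket. One possible way to
handle it, sketched below under assumptions, is to treat it like EBADF and
retry a bounded number of times. The helper classify_select_error, the enum,
and the limit value of 16 are hypothetical names and numbers chosen for
illustration; they are not taken from daemon.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Winsock defines WSAENOTSOCK as 10038. Define it here only so the sketch
 * also compiles where the Winsock headers are not available. */
#ifndef WSAENOTSOCK
#define WSAENOTSOCK 10038
#endif

/* Assumed retry limit for illustration; slapd has its own SLAPD_EBADF_LIMIT. */
#define EBADF_RETRY_LIMIT 16

enum select_action { SELECT_RETRY, SELECT_SHUTDOWN };

/* Hypothetical helper: decide what the listener loop should do after
 * select() has failed with error code err. bad_fd_count counts consecutive
 * "bad descriptor" failures and is incremented here. */
static enum select_action
classify_select_error(int err, int *bad_fd_count)
{
    assert(bad_fd_count != NULL);

    if (err == EINTR)
        return SELECT_RETRY;            /* interrupted, just try again */

    if (err == EBADF || err == WSAENOTSOCK) {
        /* A stale descriptor was in the set; retry a limited number of times. */
        if (++*bad_fd_count < EBADF_RETRY_LIMIT)
            return SELECT_RETRY;
    }
    return SELECT_SHUTDOWN;             /* anything else is treated as fatal */
}
```

In the loop this would replace the two if-tests in the SOCKET_ERROR case:
continue on SELECT_RETRY, set the shutdown flag on SELECT_SHUTDOWN. Whether a
bounded retry is actually safe here depends on why the stale descriptor ends
up in the set in the first place, which the sketch does not address.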

A solution could perhaps be to insert a small delay to let the previous
socket handling finish... but this solution doesn't appeal to me very
much.

So...what could be the error and what solutions can you recommend?

//---------- select part in slapd_daemon_task (openLdap2.0-alpha3) ----------
switch(ns = select( nfds, &readfds,
#ifdef HAVE_WINSOCK
   /* don't pass empty fd_set */
   ( writefds.fd_count > 0 ? &writefds : NULL ),
#else
   &writefds,
#endif
   NULL, tvp ))
  {
  case SOCKET_ERROR:
   { /* failure - try again */
    int err=sock_errno();
    int wsaErr=WSAGetLastError(); // <---- checking for detailed error code
    printf("WSAGetLastError return error code: %i\n",wsaErr);
    if(err==EBADF&&++ebadf<SLAPD_EBADF_LIMIT)
    {
     continue;
    }
    if( err != EINTR )
    {
     Debug( LDAP_DEBUG_CONNS,"daemon: select failed (%d): %s\n",err,
            sock_errstr(err), 0 );
     slapd_shutdown = -1; // slapd is shutting down
    }
   }
   continue;
//----------------------------------


I am really grateful for any hints or comments from anyone about this (even
if you haven't got experience with openLDAP itself but just use SOCKET
handling on NT a lot).

Has anyone had problems with SOCKET errno = 10038 on other occasions?
What could the solution be?

--
sincerely

Mikael Grehn
M.Sc
Systems Engineer
Envilogg Datateknik AB