
Re: Search responses with incorrect msgId

Hi all,

My application is an LDAP client based on the OpenLDAP suite. I'm having
problems when it receives search responses with an incorrect msgId.
The problem is that ldap_result always returns -1, which makes it impossible
to distinguish this case from others, such as the remote server closing the
connection.
The application always detaches the connection fd when -1 is received (99%
of the time it is a broken connection).
But we don't want to close the connection when we "receive a search response
with an out-of-context msgId" (usually "out-of-time search result entries"
that arrive after the client has sent an AbandonRequest).

I can't use ldap_msgid, because the 'LDAPMessage*' is NULL and that causes an
abort() (in any case, the ldap_msgid API description specifies no special
result code for this situation).
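
For reference, the ldap_result man page documents three outcomes: -1 on error, 0 when the timeout expires, and otherwise the type of the message that was stored (a positive value). A minimal sketch of checking those outcomes before touching the message follows. The names classify_result and ResultKind are hypothetical, and the function takes plain values (a void* instead of an LDAPMessage*) so that it can be tried without libldap:

```cpp
#include <cassert>

// Hypothetical classification of what ldap_result() handed back.
enum class ResultKind { Error, NothingYet, Message };

// rc is the value returned by ldap_result(); message is the pointer it stored
// through its last parameter (an LDAPMessage* in real code; void* here is an
// assumption so the sketch compiles without <ldap.h>).
ResultKind classify_result(int rc, const void* message) {
    if (rc == -1) return ResultKind::Error;        // the library reported an error
    if (rc == 0 || message == nullptr)
        return ResultKind::NothingYet;             // timeout, or nothing was stored
    return ResultKind::Message;                    // pointer is non-NULL here
}
```

Only in the Message case is the pointer known to be non-NULL, so that is the only branch in which ldap_msgid() would be called. This does not by itself tell an out-of-context msgId apart from a broken connection; it only avoids handing a NULL pointer to the library.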

My code:
  LDAP* handle = (LDAP*) a_ldap;
  LDAPMessage* hmessage (NULL);
  resultCode = ldap_result (handle, LDAP_RES_ANY, LDAP_MSG_ONE, NULL, &hmessage);

  // I tried the following, hoping the result would not be -1 with
  // LDAP_RES_UNSOLICITED, but it returned -1 again...
  if (resultCode.getValue () == -1)
     resultCode = ldap_result (handle, LDAP_RES_UNSOLICITED, LDAP_MSG_ONE, NULL, &hmessage);

Do you know any other way to identify responses with an incorrect msgId?



When we use:
   resultCode = ldap_result (handle, LDAP_RES_ANY, LDAP_MSG_ONE, -->NULL<--, &hmessage);

I would like to clarify that the -1 value was returned because of a signal delivered to our process, which interrupted ldap_int_select inside ldap_result (where the penultimate parameter, the timeout, was NULL).
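
This is ordinary POSIX select() behaviour rather than something specific to OpenLDAP: a select() blocked with a NULL timeout fails with -1 and errno set to EINTR when a handled signal arrives. A minimal sketch, using a pipe as a stand-in for the LDAP socket (the function name errno_after_interrupted_select and the empty SIGALRM handler are illustrative assumptions, not part of the application described here):

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/select.h>
#include <unistd.h>

// Empty handler: its only job is to make SIGALRM interrupt select()
// instead of terminating the process.
static void on_alarm(int) {}

// Blocks in select() with a NULL (infinite) timeout on a pipe that never
// becomes readable, after arming a one-shot timer of delay_usec microseconds.
// Returns the errno seen when select() fails, or 0 if it did not fail.
int errno_after_interrupted_select(useconds_t delay_usec) {
    int fds[2];
    if (pipe(fds) != 0) return errno;

    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fds[0], &readable);

    ualarm(delay_usec, 0);  // one-shot alarm, no repeat interval
    int rc = select(fds[0] + 1, &readable, nullptr, nullptr, nullptr);
    int saved = (rc == -1) ? errno : 0;  // save errno before any other call
    ualarm(0, 0);           // cancel the timer if it is still pending

    sigaction(SIGALRM, &old, nullptr);   // restore the previous handler
    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

Calling errno_after_interrupted_select(50000) blocks for about 50 ms and then reports EINTR, which is why the saved errno is worth checking before treating a -1 as a broken connection.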

We removed that signal and tested 'ldap_result' again with a zeroed timeout (0 secs and 0 usecs):

   resultCode = ldap_result (handle, LDAP_RES_ANY, LDAP_MSG_ONE, &zeroed_tv, &hmessage);

(we have even tested 50 milliseconds)

Now a received response with an out-of-context msgId aborts the application (due to an assert on the NULL hmessage when OpenLDAP invokes 'ldap_msgid()'). We tested this by changing the msgId on the response with TCP Test Tool, but it could really happen, as I said in my former post, when an expired context sends an AbandonRequest and the LDAP server then sends the search result entry late.
(The caller can expect that the result of an abandoned operation will not be returned from a future call to ldap_result()).


1 - Our LDAP server doesn't support LDAP_OPT_TIMELIMIT (it is optional), so it may or may not answer late after receiving an AbandonRequest from the client (this is RFC-compliant).

2 - If we use 'LDAP_RES_ANY' we reach abort() when we receive INcorrect message responses. If we use 'LDAP_RES_UNSOLICITED' we reach abort() when we receive correct message responses. We can't combine these selection methods: it is not possible to mask the values with OR, because they are defined this way:

#define LDAP_RES_ANY            (-1)
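
The reason OR cannot combine the two selectors can be checked at compile time. A minimal sketch, assuming the values from OpenLDAP's <ldap.h> (LDAP_RES_ANY is -1 as quoted above, and LDAP_RES_UNSOLICITED is 0); the constant names and the combine helper are stand-ins so the snippet builds without the library:

```cpp
#include <cassert>

// Stand-in copies of the two selectors; real code would take them from <ldap.h>.
constexpr int kResAny = -1;          // LDAP_RES_ANY
constexpr int kResUnsolicited = 0;   // LDAP_RES_UNSOLICITED

// Hypothetical attempt to build a mask from two selectors.
constexpr int combine(int a, int b) { return a | b; }

// -1 has every bit set in two's complement and 0 has none, so the OR is just
// LDAP_RES_ANY again: no new, distinguishable value comes out of it.
static_assert(combine(kResAny, kResUnsolicited) == kResAny,
              "OR of the two selectors collapses to LDAP_RES_ANY");
```

In other words, the msgid argument of ldap_result is either one of these special values or a concrete message id, not a set of flags.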

Perhaps it's possible to add a new #define like 'LDAP_RES_ALL=-2' that makes ldap_result() return different special values, so that we can distinguish which kind of message we received: solicited or unsolicited. Or perhaps it's possible to get a non-NULL hmessage in the last ldap_result parameter when something incorrect is received (it would be NULL only on a broken-connection event).

We get OpenLDAP debug output like this:
../../../libraries/liblber/io.c:516: ber_get_next: Assertion `ber->ber_buf == ((void *)0)' failed.

3 - Avoiding the AbandonRequest, in order to keep all (good/bad) contexts, might solve the problem (we suppose ldap_abandon() frees internal OpenLDAP memory resources (*)). We could send the AbandonRequest some time later (after a configurable delay), or never send it and add some "garbage collector" mechanism instead (we are not sure whether any OpenLDAP primitive exists to do this without ldap_abandon()).
We don't like any idea like this. What do you think?

(*) Could anyone confirm this?

4 - We are testing the following code now, but we think it shouldn't have to be this complex to address the problem:
(the application is single-threaded and must keep the connection open at all times, even if an incorrect msgId is received)

  ualarm (100000, 0);
  resultCode = ldap_result (handle, LDAP_RES_ANY, LDAP_MSG_ONE, NULL, &hmessage);

// If we receive an unexpected message, ldap_result blocks in ldap_int_select because of the selection
//  method (LDAP_RES_ANY) and the penultimate parameter (NULL means an infinite timeout). But the block
//  is interrupted by the signal from the earlier 'ualarm' (100 ms delay)

  int xerrno = errno;
  ualarm (0, 0);

  if (resultCode == -1) {
     int result = -1;
     bool disconnect (false);

     if (ldap_get_option (handle, LDAP_OPT_RESULT_CODE, &result) != LDAP_OPT_SUCCESS)
        disconnect = true;
     else if (result == LDAP_SERVER_DOWN)
        disconnect = true;
     else if (result == LDAP_CONNECT_ERROR)
        disconnect = true;

     // A -1 caused by our own alarm (EINTR) is not a broken connection
     if (disconnect == true && xerrno == EINTR)
        disconnect = false;
  }

Indeed, we don't like this method, because we are not sure whether the fd stream is corrupted after an unexpected message,
which could well affect the normal ones that follow. We think there must be some simple way to manage this.
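
The decision in the fragment above can be isolated into a small function, which makes it easier to test without a live connection. A minimal sketch: should_disconnect is a hypothetical name, and the two constants are stand-ins assumed to match the values in OpenLDAP 2.4's <ldap.h> (older releases used different numbers, so real code should include the header rather than copy them):

```cpp
#include <cassert>
#include <cerrno>

// Stand-in values, assumed from OpenLDAP 2.4's <ldap.h>.
constexpr int kLdapServerDown = -1;     // LDAP_SERVER_DOWN
constexpr int kLdapConnectError = -11;  // LDAP_CONNECT_ERROR

// Mirrors the logic of the fragment above for the case ldap_result() == -1.
//   option_read_ok: ldap_get_option(LDAP_OPT_RESULT_CODE) succeeded
//   ldap_error:     the result code it returned
//   saved_errno:    errno captured right after ldap_result()
bool should_disconnect(bool option_read_ok, int ldap_error, int saved_errno) {
    if (saved_errno == EINTR) return false;  // interrupted by our own alarm
    if (!option_read_ok) return true;        // cannot even read the error code
    return ldap_error == kLdapServerDown || ldap_error == kLdapConnectError;
}
```

This keeps the same behaviour as the original if/else chain; it does not answer whether the stream is still usable after the unexpected message.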

We can't find a good place to handle the problem.
Any ideas?