Issue 3192 - Hang in wait4msg() during search
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd
Version: unspecified
Hardware: All All
Importance: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
Reported: 2004-06-19 01:32 UTC by ipuleston@sonicwall.com
Modified: 2014-08-01 21:06 UTC

Description ipuleston@sonicwall.com 2004-06-19 01:32:33 UTC
Full_Name: Ian Puleston
Version: 2.1.29
OS: VxWorks
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (64.220.173.243)


I am working on a port of OpenLDAP that runs under VxWorks and have encountered
a problem where during a search request the LDAP client hangs in
ldap_int_select() called from ldap_result() via wait4msg(). This is a timing
issue and I only see it when using TLS to Active Directory on a Windows 2000
server. Non-TLS to the same server works OK, as do both TLS and non-TLS to an
OpenLDAP Server.

This may be the problem causing issue #s 3015, 3054 and 3124, and I have a
solution (see below). First here's the trace:

ldap_result msgid -1
ldap_chkResponseList for msgid=-1, all=0
ldap_chkResponseList returns NULL
wait4msg (infinite timeout), msgid -1
wait4msg continue, msgid -1, all 0
** Connections:
* host: ianserver.sd80.com  port: 636  (default)
  refcnt: 2  status: Connected
  last used: FRI JUN 18 23:08:19 2004

** Outstanding Requests:
 * msgid 2,  origid 2, status InProgress
   outstanding referrals 0, parent count 0
** Response Queue:
   Empty
ldap_chkResponseList for msgid=-1, all=0
ldap_chkResponseList returns NULL
read1msg: msgid -1, all 0
ldap_read: message type search-result msgid 2, original id 2
new result:  res_errno: 0, res_error: <>, res_matched: <>
read1msg:  0 new referrals
read1msg:  mark request completed, id = 2
request 2 done
res_errno: 0, res_error: <>, res_matched: <>
ldap_free_request (origid 2, msgid 2)
ldap_free_connection
ldap_free_connection: refcnt 1
ldap_int_select

And there it hangs indefinitely. If I change the code to use a timeout in the
select then it times out and the search completes OK (but slowly).
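
For illustration only, here is a minimal sketch of a bounded wait using POSIX
select(). It is not the OpenLDAP ldap_int_select() code; wait_readable() and
demo_pipe() are hypothetical names, and the 50 ms timeout is an arbitrary
choice. It shows why a timeout turns an indefinite hang into a delay.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper, not an OpenLDAP function: wait up to timeout_ms
 * for fd to become readable. Returns 1 if readable, 0 on timeout,
 * -1 on error. */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);   /* FD_SET needs fd in this range */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* A NULL timeout here would block forever if no data ever arrives. */
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Demonstration on a pipe: with nothing written the wait times out;
 * after one byte is written it reports readable.
 * Returns 0 if both observations hold. */
int demo_pipe(void)
{
    int p[2], idle, ready;

    if (pipe(p) != 0)
        return -1;
    idle = wait_readable(p[0], 50);
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    ready = wait_readable(p[0], 50);
    close(p[0]);
    close(p[1]);
    return (idle == 0 && ready == 1) ? 0 : 1;
}
```

A timeout only bounds the wait. Each call that waits when nothing more is
coming still costs the full timeout, which matches the slow completion
described above.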

When it goes wrong and hangs, data is already available: the
LBER_SB_OPT_DATA_READY check returns true and try_read1msg() is called.
When it does not hang, no data is available at that point and it must do the
select to wait for the data.

The problem appears to be that wait4msg checks for lc == NULL to decide whether
it needs to wait for data on the socket, but if the data is already received
then try_read1msg() returns with lc set to NULL. That then causes
ldap_int_select() to get called even if there is no more data coming, resulting
in the hang up / timeout.
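
To make the ambiguity concrete, here is a small standalone model of that loop.
It is a sketch under stated assumptions, not the libldap code: struct conn,
read_one() and would_block() are hypothetical stand-ins for the connection
list, try_read1msg() and the decision to call ldap_int_select().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection list entry (not the real LDAPConn). */
struct conn {
    struct conn *next;
    int data_ready;   /* models LBER_SB_OPT_DATA_READY returning true */
};

/* Hypothetical stand-in for try_read1msg(): consumes the pending message
 * and, as described above, leaves the caller's pointer set to NULL. */
static int read_one(struct conn **lcp)
{
    assert(lcp != NULL && *lcp != NULL);
    (*lcp)->data_ready = 0;
    *lcp = NULL;
    return 1;   /* pretend a result message was read */
}

/* Returns 1 if the caller would go on to block in select(), 0 otherwise.
 * use_flag == 0 models the original "lc == NULL" test;
 * use_flag != 0 models a separate found_msg flag. */
int would_block(struct conn *head, int use_flag)
{
    struct conn *lc, *nextlc;
    int found_msg = 0;

    for (lc = head; lc != NULL; lc = nextlc) {
        nextlc = lc->next;
        if (lc->data_ready) {
            (void)read_one(&lc);
            found_msg = 1;
            break;
        }
    }

    /* With the pointer test, "a message was read" and "no connection had
     * data" both leave lc == NULL and cannot be told apart. */
    return use_flag ? !found_msg : (lc == NULL);
}
```

In this model, a single connection with data_ready set makes
would_block(head, 0) report that the caller would wait, even though a message
was just read, while would_block(head, 1) does not.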

My fix, which I have tested and which seems to work fine, is to change the
following code in wait4msg() (from result.c in OpenLDAP 2.1.29):

        if( (*result = chkResponseList(ld, msgid, all)) != NULL ) {
            rc = (*result)->lm_msgtype;
        } else {
            for ( lc = ld->ld_conns; lc != NULL; lc = nextlc ) {
                nextlc = lc->lconn_next;
                if ( ber_sockbuf_ctrl( lc->lconn_sb,
                        LBER_SB_OPT_DATA_READY, NULL ) ) {
                    rc = try_read1msg( ld, msgid, all, lc->lconn_sb,
                        &lc, result );
                    break;
                }
            }

            if ( lc == NULL ) {
                rc = ldap_int_select( ld, tvp );


To:

        if( (*result = chkResponseList(ld, msgid, all)) != NULL ) {
            rc = (*result)->lm_msgtype;
        } else {
            int found_msg = 0;

            for ( lc = ld->ld_conns; lc != NULL; lc = nextlc ) {
                nextlc = lc->lconn_next;
                if ( ber_sockbuf_ctrl( lc->lconn_sb,
                        LBER_SB_OPT_DATA_READY, NULL ) ) {
                    rc = try_read1msg( ld, msgid, all, lc->lconn_sb,
                        &lc, result );
                    found_msg = 1;
                    break;
                }
            }

            if ( !found_msg ) {
                rc = ldap_int_select( ld, tvp );


And here are the diffs to result.c for the patch:
299a300
>                       int found_msg = 0;
306a308
>                                       found_msg = 1;
311c313
<                   if ( lc == NULL ) {
---
>                   if ( !found_msg ) {

Comment 1 Kurt Zeilenga 2004-06-22 20:34:44 UTC
changed state Open to Feedback
moved from Incoming to Historical
Comment 2 Kurt Zeilenga 2004-06-22 20:36:25 UTC
Please update to the latest version of OpenLDAP Software
and retest.  It is likely the problem you are attempting
to fix has already been resolved.  Thanks, Kurt
Comment 3 Kurt Zeilenga 2004-06-22 20:36:35 UTC
changed notes
Comment 4 Kurt Zeilenga 2004-06-22 20:37:36 UTC
changed state Feedback to Suspended
Comment 5 ipuleston@sonicwall.com 2004-06-22 21:41:57 UTC
Hi Kurt,

Thanks for getting back to me. I've just completed upgrading my port to
OpenLDAP 2.2.13, and the problem still exists in there. I've verified that
my proposed fix works in that version too.

Ian

Comment 6 Kurt Zeilenga 2004-09-05 00:01:38 UTC
changed notes
moved from Historical to Software Bugs
Comment 7 ipuleston@sonicwall.com 2004-10-06 19:36:04 UTC
I notice that the fix above has been incorporated into the code at some
point between versions 2.2.14 and 2.2.17, although it's not mentioned in the
CHANGES file. This issue could now be closed.




Comment 8 Kurt Zeilenga 2004-10-06 19:45:43 UTC
changed notes
changed state Suspended to Closed
Comment 9 Howard Chu 2004-10-06 23:05:58 UTC
changed notes
Comment 10 Howard Chu 2004-10-06 23:25:14 UTC
ipuleston@SonicWALL.com wrote:

>I notice that the fix above has been incorporated into the code at some
>point between versions 2.2.14 and 2.2.17, although it's not mentioned in the
>CHANGES file. This issue could now be closed.
>  
>
Looks like the fix was incorporated due to ITS#3250, just fyi.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

Comment 11 Howard Chu 2009-02-17 05:05:55 UTC
moved from Software Bugs to Archive.Software Bugs
Comment 12 OpenLDAP project 2014-08-01 21:06:32 UTC
fixed in 2.2 (see ITS#3250)