[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#5525) libldap causes a core dump due to accessing freed memory



Full_Name: Paul Smith
Version: 2.4.7
OS: Ubuntu 8.04
URL: 
Submission from: (NULL) (65.78.30.67)


I've been trying to track down a crasher in Evolution's Exchange server backend,
and I believe it's a bug in OpenLDAP's libldap.  The core manifests as an
assert() in io.c:ber_flush2(), but by using valgrind I can see that the real
problem is using already-freed memory.

You can see various backtraces and valgrind outputs in the Gnome bug tracking
system here: http://bugzilla.gnome.org/show_bug.cgi?id=512605  Note that the
later stack traces are better as I've installed debugging versions of OpenLDAP
so there are better symbols shown.

In Evolution we have a loop we use to look up global contacts, which looks like
this:

        for (try = 0; try < 2; try++) {
            ldap_error = get_gc_connection (gc, op);
            if (ldap_error != LDAP_SUCCESS)
        	return ldap_error;
            ldap_error = ldap_search_ext (gc->priv->ldap, base, scope,
        			          filter, (char **)attrs,
        			          FALSE, NULL, NULL, NULL, 0,
        			          &msgid);
            if (ldap_error == LDAP_SERVER_DOWN)
        	continue;
            else if (ldap_error != LDAP_SUCCESS)
        	return ldap_error;
        
            ldap_error = gc_ldap_result (gc->priv->ldap, op, msgid, msg);
            if (ldap_error == LDAP_SERVER_DOWN)
        	continue;
            else if (ldap_error != LDAP_SUCCESS)
        	return ldap_error;
        
            return LDAP_SUCCESS;
        }

where get_gc_connection() eventually calls ldap_ntlm_bind() and where
gc_ldap_result() calls ldap_result().  When the core happens, the first trip
through the for loop here gives an error LDAP_SERVER_DOWN as the return code
from ldap_result().  We go back to the top of the loop, and ldap_ntlm_bind() is
called.  That eventually ends up in ldap_send_initial_request() which invokes
ldap_send_server_request() like this:

        rc = ldap_send_server_request( ld, ber, msgid, NULL,
        	NULL, NULL, NULL );

Inside ldap_send_server_request() we need to find an LDAPConn* structure, but
the one passed in is NULL as is the srvlist, so, we use the default connection:

        if ( lc == NULL ) {
            if ( srvlist == NULL ) {
        	lc = ld->ld_defconn;

Valgrind reports that the very next line, attempting to access lc->lconn_status,
is referencing freed memory:

        if ( lc != NULL && lc->lconn_status == LDAP_CONNST_CONNECTING ) {

So, basically, we know that ld->ld_defconn is a non-NULL pointer pointing to
already freed memory.  This seems like it is a buggy state.

How did we get here?  In ldap_result() we get into this loop invoking
try_read1msg() (I'm removing the mutex locking for readability, even though it
IS enabled in my version of libldap):

	for ( lc = ld->ld_conns;
	    rc == LDAP_MSG_X_KEEP_LOOKING && lc != NULL; )
	{
	    if ( lc->lconn_status == LDAP_CONNST_CONNECTED &&
		ldap_is_read_ready( ld, lc->lconn_sb ) )
	    {
		rc = try_read1msg( ld, msgid, all, &lc, result );
		if ( lc == NULL ) {
		    /* if lc gets free()'d,
		     * there's no guarantee
		     * lc->lconn_next is still
		     * sane; better restart
		     * (ITS#4405) */
		    lc = ld->ld_conns;

		    /* don't get to next conn! */
		    break;
		}
	    }

	    /* next conn */
	    lc = lc->lconn_next;
	}

Inside try_read1msg() we call "tag = ber_get_next( lc->lconn_sb, &len, ber );"
and if the tag we get back is LBER_DEFAULT, then we declare an error and we free
the connection and set the pointer to NULL:

	case LBER_DEFAULT:
		ld->ld_errno = LDAP_SERVER_DOWN;
		ldap_free_connection( ld, lc, 1, 0 );
		lc = *lcp = NULL;
		return -1;

If you check ldap_free_connection() you'll see that it removes the LDAPConn
pointer "lc" from the list of connections before it is freed.

BUT!  The ldap_free_connection() function never does anything with the
ld->ld_defconn pointer, so if the connection we just freed is the one pointed to
by ld->ld_defconn, it is now pointing to freed memory.  And that causes the
problem detected above by valgrind, or causing an assert later on: accessing
freed memory.

I'm not really sure what the right thing to do here is, or I'd provide a patch. 
Should we set ld_defconn to NULL?  Is that ever a valid state?  Or should we
just pick another connection from the list (and what if there isn't one?)