Full_Name: Pierangelo Masarati Version: HEAD OS: Linux Whitebox 3 URL: ftp://ftp.openldap.org/incoming/masarati-2005-06-28-libldap-abandon.patch Submission from: (NULL) (81.72.89.40) Submitted by: ando I've been hitting unesplicable core dumps when heavily loading back-ldap with abandoning; I tracked down the issue in do_abandon calling ldap_free_request() on a dangling LDAPRequest pointer. I'm not 100% sure about the fix, because I didn't understand yet how it could happen, but the problem is cured by the attached patch. It appears that when do_abandon() temporarily releases the request mutex to delete the response message and re-acquires it, the LDAPRequest the previously acquired "lr" pointer gets freed by someone else. The fix is trivial: re-fetch the request before continuing the abandon, right after re-acquiring the mutex. The problem cannot be easily reproduced, but I was able to repeatedly produce it after some heavy and concurrent load. I'm keen to committing this fix: at least it's not making any harm, but there might be better solutions. p. <patch> Index: libraries/libldap/abandon.c =================================================================== RCS file: /repo/OpenLDAP/pkg/ldap/libraries/libldap/abandon.c,v retrieving revision 1.37 diff -u -r1.37 abandon.c --- libraries/libldap/abandon.c 1 Jan 2005 19:49:43 -0000 1.37 +++ libraries/libldap/abandon.c 26 Jun 2005 19:39:47 -0000 @@ -161,6 +161,15 @@ return LDAP_SUCCESS; } + /* fetch again the request that we are abandoning */ + if ( lr != NULL ) { + for ( lr = ld->ld_requests; lr != NULL; lr = lr->lr_next ) { + if ( lr->lr_msgid == msgid ) { /* this message */ + break; + } + } + } + err = 0; if ( sendabandon ) { if( ber_sockbuf_ctrl( ld->ld_sb, LBER_SB_OPT_GET_FD, NULL ) == -1 ) { </patch>
changed notes
changed notes changed state Open to Test moved from Incoming to Software Bugs
I think I've hit the real issue: the ld_abandoned array can be modified by two different portions of code: 1) do_abandon() in libldap/abandon.c, which reallocs the array while protected behind ld_req_mutex 2) ldap_mark_abandoned() in libldap/result.c, which shifts the msgid of the abandoned requests while protected behind ld_res_mutex The contents of the array is also accessed by ldap_abandoned() in libldap/result.c, while protected behind ld_res_mutex Note though that two different mutexes are used to protect access to the same data, so conflicts can well occur. For instance, I just spotted a run past end in ldap_abandoned() where i == 20812 violates a memory segment, while gdb shows that the array terminator -1 is actually located at i == 4058. I think this portion of code needs be rearranged, but I'm still wondering how. p. -- Pierangelo Masarati mailto:pierangelo.masarati@sys-net.it SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497
changed notes changed state Test to Suspended
changed notes changed state Suspended to Test
changed notes changed state Test to Release
changed notes changed state Release to Suspended
The issue has been fixed in HEAD; this ITS can be closed. -- Pierangelo Masarati mailto:pierangelo.masarati@sys-net.it SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497
changed notes changed state Suspended to Release
changed notes changed state Release to Closed
moved from Software Bugs to Archive.Software Bugs
fixed in HEAD,RE23 RE22 NOT fixed yet