Full_Name: Aaron Richton Version: RE24 OS: Fedora 20 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (128.6.31.135) While performing RE24 testing observed an issue in slapd-mtread: ==28795== Process terminating with default action of signal 11 (SIGSEGV): dumping core ==28795== Access not within mapped region at address 0x2FE9F9D0 ==28795== at 0x3A45E0A1%3: pthread_join (pthread_join.c:47) ==28795== by 0x4A09005: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x4A0BBA5: pthread_join (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0 0x409A2A: ldap_pvt_thread_join (thr_posix.c:197) ==28795== by 0x405540: main (slapd-mtread.c:361) Hmm, not particularly helpful. Gave this a run through helgrind, report below. So far as I can tell we had one thread in ldap_dup protected by ldcmutex: 542 LDAP_MUTEX_LOCK( &old->ld_ldcmutex ); 543 ld->ldc = old->ldc; 544 old->ld_ldcrefcnt++; 545 LDAP_MUTEX_UNLOCK( &old->ld_ldcmutex ); while ldap_get_option(LDAP_OPT_SESSION_REFCNT) is protected by ldo_mutex only. As a quick attempt: --- a/libraries/libldap/options.c +++ b/libraries/libldap/options.c @@ -123,6 +123,10 @@ ldap_get_option( return LDAP_OPT_ERROR; } + if( option == LDAP_OPT_SESSION_REFCNT ) { + LDAP_MUTEX_LOCK( &ld->ld_ldcmutex ); + } + LDAP_MUTEX_LOCK( &lo->ldo_mutex ); switch(option) { @@ -414,6 +418,11 @@ ldap_get_option( } LDAP_MUTEX_UNLOCK( &lo->ldo_mutex ); + + if( option == LDAP_OPT_SESSION_REFCNT ) { + LDAP_MUTEX_UNLOCK( &ld->ld_ldcmutex ); + } + return ( rc ); } helgrind looks happy with this. The if() is obviously optional but is hopefully a small performance gain for ldap_get_option(others). helgrind report: ==28795== Lock at 0x4C393D0 was first observed ==28795== at 0x4A0BCCF: pthread_mutex_init (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x409B08: ldap_pvt_thread_mutex_init (thr_posix.c:383) ==28795== by 0x40A25A: ldap_create (open.c:180) ==28795== by 0x40A3E4: ldap_initialize (open.c:240) ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) ==28795== by 0x40530A: main (slapd-mtread.c:331) ==28795== ==28795== Lock at 0x4C393A8 was first observed ==28795== at 0x4A0BCCF: pthread_mutex_init (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x409B08: ldap_pvt_thread_mutex_init (thr_posix.c:283) ==28795== by 0x409FBF: ldap_create (open.c:129) ==28795%%3= by 0xA3A3E4: ldap_initialize (open.c:240) ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) ==28795== by 0x40530A: main (slapd-mtread.c:331) ==28795== ==28795== Possible data race during write of size 4 at 0x4C39238 by thread #%2 ==28795== Locks held: 1, at address 0x4C393D0 ==28795== at 0x412AC1: ldap_ld_free (unbind.c:87) ==28795== by 0x4130EE: ldap_destroy (unbind.c:245) ==28795== by 0x405BEF: do_onethread (slapd-mtread.c:455) ==28795== by 0x4A0AECD: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x3A45E07EE4: start_thread (pthread_create.c:309) ==28795== by 0x3A45AF4B8C: clone (clone.S:111) ==28795== ==28795== This conflicts with a previous read of size 4 by thread #29 ==28795%%3= Locks held: 1, at address 0x4C393A8 ==28795== at 0x426957: ldap_get_option (options.c:374) ==28795== by 0x405914: do_onethread (slapd-mtread.c:423) ==28795== by 0x4A0AECD: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x3A45E07EE4: start_thread (pthread_create.c:309) ==28795== by 0x3A45AF4B8C: clone (clone.S:111) ==28795== ==28795== Address 0x4C39238 is 88 bytes inside a block of size 736 alloc'd ==28795== at 0x4A083F4: calloc (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) ==28795== by 0x43E4B1: ber_memcalloc_x (memory.c:283) ==28795== by 0x409F4B: ldap_create (open.c:119) ==28795== by 0x40A3E4: ldap_initialize (open.c:240) ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) ==28795== by 0x40530A: main (slapd-mtread.c:331)
richton@nbcs.rutgers.edu wrote: > Full_Name: Aaron Richton > Version: RE24 > OS: Fedora 20 > URL: ftp://ftp.openldap.org/incoming/ > Submission from: (NULL) (128.6.31.135) > > > While performing RE24 testing observed an issue in slapd-mtread: > > ==28795== Process terminating with default action of signal 11 (SIGSEGV): > dumping core > ==28795== Access not within mapped region at address 0x2FE9F9D0 > ==28795== at 0x3A45E0A1%3: pthread_join (pthread_join.c:47) > ==28795== by 0x4A09005: ??? (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x4A0BBA5: pthread_join (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0 0x409A2A: ldap_pvt_thread_join (thr_posix.c:197) > ==28795== by 0x405540: main (slapd-mtread.c:361) > > Hmm, not particularly helpful. Gave this a run through helgrind, report below. > So far as I can tell we had one thread in ldap_dup protected by ldcmutex: > > 542 LDAP_MUTEX_LOCK( &old->ld_ldcmutex ); > 543 ld->ldc = old->ldc; > 544 old->ld_ldcrefcnt++; > 545 LDAP_MUTEX_UNLOCK( &old->ld_ldcmutex ); > > while ldap_get_option(LDAP_OPT_SESSION_REFCNT) is protected by ldo_mutex only. > As a quick attempt: Thanks. I see your patch is this way to avoid lock ordering issues. The original code in libldap/unbind.c seems to hold ldcmutex for the entire duration of the function, without any good reason. I suspect we can just move the mutex lock/unlock to just around the actual option fetch. > --- a/libraries/libldap/options.c > +++ b/libraries/libldap/options.c > @@ -123,6 +123,10 @@ ldap_get_option( > return LDAP_OPT_ERROR; > } > > + if( option == LDAP_OPT_SESSION_REFCNT ) { > + LDAP_MUTEX_LOCK( &ld->ld_ldcmutex ); > + } > + > LDAP_MUTEX_LOCK( &lo->ldo_mutex ); > > switch(option) { > @@ -414,6 +418,11 @@ ldap_get_option( > } > > LDAP_MUTEX_UNLOCK( &lo->ldo_mutex ); > + > + if( option == LDAP_OPT_SESSION_REFCNT ) { > + LDAP_MUTEX_UNLOCK( &ld->ld_ldcmutex ); > + } > + > return ( rc ); > } > > helgrind looks happy with this. The if() is obviously optional but is hopefully > a small performance gain for ldap_get_option(others). > > helgrind report: > > ==28795== Lock at 0x4C393D0 was first observed > ==28795== at 0x4A0BCCF: pthread_mutex_init (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x409B08: ldap_pvt_thread_mutex_init (thr_posix.c:383) > ==28795== by 0x40A25A: ldap_create (open.c:180) > ==28795== by 0x40A3E4: ldap_initialize (open.c:240) > ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) > ==28795== by 0x40530A: main (slapd-mtread.c:331) > ==28795== > ==28795== Lock at 0x4C393A8 was first observed > ==28795== at 0x4A0BCCF: pthread_mutex_init (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x409B08: ldap_pvt_thread_mutex_init (thr_posix.c:283) > ==28795== by 0x409FBF: ldap_create (open.c:129) > ==28795%%3= by 0xA3A3E4: ldap_initialize (open.c:240) > ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) > ==28795== by 0x40530A: main (slapd-mtread.c:331) > ==28795== > ==28795== Possible data race during write of size 4 at 0x4C39238 by thread #%2 > ==28795== Locks held: 1, at address 0x4C393D0 > ==28795== at 0x412AC1: ldap_ld_free (unbind.c:87) > ==28795== by 0x4130EE: ldap_destroy (unbind.c:245) > ==28795== by 0x405BEF: do_onethread (slapd-mtread.c:455) > ==28795== by 0x4A0AECD: ??? (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x3A45E07EE4: start_thread (pthread_create.c:309) > ==28795== by 0x3A45AF4B8C: clone (clone.S:111) > ==28795== > ==28795== This conflicts with a previous read of size 4 by thread #29 > ==28795%%3= Locks held: 1, at address 0x4C393A8 > ==28795== at 0x426957: ldap_get_option (options.c:374) > ==28795== by 0x405914: do_onethread (slapd-mtread.c:423) > ==28795== by 0x4A0AECD: ??? (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x3A45E07EE4: start_thread (pthread_create.c:309) > ==28795== by 0x3A45AF4B8C: clone (clone.S:111) > ==28795== > ==28795== Address 0x4C39238 is 88 bytes inside a block of size 736 alloc'd > ==28795== at 0x4A083F4: calloc (in > /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so) > ==28795== by 0x43E4B1: ber_memcalloc_x (memory.c:283) > ==28795== by 0x409F4B: ldap_create (open.c:119) > ==28795== by 0x40A3E4: ldap_initialize (open.c:240) > ==28795== by 0x4062B7: do_conn (slapd-mtread.c:584) > ==28795== by 0x40530A: main (slapd-mtread.c:331) > > > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
changed notes changed state Open to Test moved from Incoming to Software Bugs
changed notes changed state Test to Release
fixed in master fixed in RE25 fixed in RE24
changed notes changed state Release to Closed