Full_Name: Chris Kowalczyk Version: 2.4.41 OS: SLE12 URL: ftp://ftp.openldap.org/incoming/ Submission from: (NULL) (2620:113:80c0:5::2222) Hello, I have been experiencing segfaults in mdb_env_reader_dest happening randomly, only from time to time, during shutting off openldap. After some investigation I narrowed the problem down to mdb_env_reader_dest destructor function being called after its object had been deleted while powering off the application. mdb_env_reader_dest was called automatically, because it was agginned to the object in pthread_key_create call: https://github.com/openldap/openldap/blob/master/libraries/liblmdb/mdb.c#L4722 The problem was not happening often, and always during powering off, so no data was lost etc. However, due to its annoyance, I prepared a simple patch, which prevents mdb_env_reader_dest function being called if the object was already deleted. The patch works fine, I haven't seen the problem since it was introduced. I am using OpenLDAP 2.4.41, but the code seems to be the same in other versions as well. Would you like me to create a pull request for it? ========================== === The backtrace: ========================== Program terminated with signal 11, Segmentation fault. #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 4050 reader->mr_pid = 0; #thread apply all bt Thread 2 (LWP 20920): #0 0x00007f4a9ac31d9d in close () at ../sysdeps/unix/syscall-template.S:84 #1 0x000055902eaeae8c in mdb_env_close1 (env=env@entry=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4756 #2 0x000055902eaeb578 in mdb_env_close1 (env=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4779 #3 mdb_env_close (env=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4778 #4 0x000055902ea69114 in mdb_db_close (be=0x55902faeb300, cr=<optimized out>) at init.c:340 #5 0x000055902ea3815b in over_db_close (be=0x55902faeb300, cr=0x0) at backover.c:182 #6 0x000055902e9d8c0b in backend_shutdown (be=0x55902faeb300, be@entry=0x0) at backend.c:376 #7 0x000055902e9fa258 in slap_shutdown (be=be@entry=0x0) at init.c:232 #8 0x000055902e9af12f in main (argc=<optimized out>, argv=<optimized out>) at main.c:1027 Thread 1 (LWP 20928): #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 #1 0x00007f4a9ac2a512 in __nptl_deallocate_tsd () at pthread_create.c:176 #2 0x00007f4a9ac2a767 in start_thread (arg=0x7f4a97acc700) at pthread_create.c:347 #3 0x00007f4a9a968aad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 # frame 0 #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at ../../../libraries/liblmdb/mdb.c:4050 4050 reader->mr_pid = 0; #p reader $1 = (MDB_reader *) 0x7f4a9c71c080 #p *reader Cannot access memory at address 0x7f4a9c71c080 ========================== === The proposed patch: ========================== diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c index 6bdf3151d..56212151b 100644 --- a/libraries/liblmdb/mdb.c +++ b/libraries/liblmdb/mdb.c @@ -4692,6 +4692,11 @@ mdb_env_close0(MDB_env *env, int excl) if (env->me_flags & MDB_ENV_TXKEY) { pthread_key_delete(env->me_txkey); + + // No need to call desctructor anymore, as all pid + // values are cleared below. + env->me_txkey = NULL; + #ifdef _WIN32 /* Delete our key from the global list */ for (i=0; i<mdb_tls_nkeys; i++)
ckowalczyk@suse.com wrote: > Full_Name: Chris Kowalczyk > Version: 2.4.41 > OS: SLE12 > URL: ftp://ftp.openldap.org/incoming/ > Submission from: (NULL) (2620:113:80c0:5::2222) > > > Hello, > I have been experiencing segfaults in mdb_env_reader_dest happening randomly, > only from time to time, during shutting off openldap. > > After some investigation I narrowed the problem down to mdb_env_reader_dest > destructor function being called after its object had been deleted while > powering off the application. mdb_env_reader_dest was called automatically, > because it was agginned to the object in pthread_key_create call: > https://github.com/openldap/openldap/blob/master/libraries/liblmdb/mdb.c#L4722 > > The problem was not happening often, and always during powering off, so no data > was lost etc. However, due to its annoyance, I prepared a simple patch, which > prevents mdb_env_reader_dest function being called if the object was already > deleted. The patch works fine, I haven't seen the problem since it was > introduced. > > I am using OpenLDAP 2.4.41, but the code seems to be the same in other versions > as well. Would you like me to create a pull request for it? We've discussed this extensively in private emails. There is no documented reason why the provided patch would have the claimed effect. Deferring this until a valid explanation for why this patch is correct can be shown. > > ========================== > === The backtrace: > ========================== > > Program terminated with signal 11, Segmentation fault. > #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at > ../../../libraries/liblmdb/mdb.c:4050 > 4050 reader->mr_pid = 0; > > #thread apply all bt > > Thread 2 (LWP 20920): > #0 0x00007f4a9ac31d9d in close () at ../sysdeps/unix/syscall-template.S:84 > #1 0x000055902eaeae8c in mdb_env_close1 (env=env@entry=0x55902fce50e0) > at ../../../libraries/liblmdb/mdb.c:4756 > #2 0x000055902eaeb578 in mdb_env_close1 (env=0x55902fce50e0) at > ../../../libraries/liblmdb/mdb.c:4779 > #3 mdb_env_close (env=0x55902fce50e0) at ../../../libraries/liblmdb/mdb.c:4778 > #4 0x000055902ea69114 in mdb_db_close (be=0x55902faeb300, cr=<optimized out>) > at init.c:340 > #5 0x000055902ea3815b in over_db_close (be=0x55902faeb300, cr=0x0) at > backover.c:182 > #6 0x000055902e9d8c0b in backend_shutdown (be=0x55902faeb300, be@entry=0x0) at > backend.c:376 > #7 0x000055902e9fa258 in slap_shutdown (be=be@entry=0x0) at init.c:232 > #8 0x000055902e9af12f in main (argc=<optimized out>, argv=<optimized out>) at > main.c:1027 > > Thread 1 (LWP 20928): > #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at > ../../../libraries/liblmdb/mdb.c:4050 > #1 0x00007f4a9ac2a512 in __nptl_deallocate_tsd () at pthread_create.c:176 > #2 0x00007f4a9ac2a767 in start_thread (arg=0x7f4a97acc700) at > pthread_create.c:347 > #3 0x00007f4a9a968aad in clone () at > ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 > > # frame 0 > #0 mdb_env_reader_dest (ptr=0x7f4a9c71c080) at > ../../../libraries/liblmdb/mdb.c:4050 > 4050 reader->mr_pid = 0; > #p reader > $1 = (MDB_reader *) 0x7f4a9c71c080 > #p *reader > Cannot access memory at address 0x7f4a9c71c080 > > ========================== > === The proposed patch: > ========================== > > diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c > index 6bdf3151d..56212151b 100644 > --- a/libraries/liblmdb/mdb.c > +++ b/libraries/liblmdb/mdb.c > @@ -4692,6 +4692,11 @@ mdb_env_close0(MDB_env *env, int excl) > > if (env->me_flags & MDB_ENV_TXKEY) { > pthread_key_delete(env->me_txkey); > + > + // No need to call desctructor anymore, as all pid > + // values are cleared below. > + env->me_txkey = NULL; > + > #ifdef _WIN32 > /* Delete our key from the global list */ > for (i=0; i<mdb_tls_nkeys; i++) > > > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
The patch is a no-op.
changed notes changed state Open to Feedback