Full_Name: Rolf Stalder Version: 2.4.43 OS: Solaris 11.2 URL: https://www.gulag.ch/www/download/0001-Solaris-robust-mutex-fix.patch Submission from: (NULL) (217.162.126.5) The provided patch (against 2.4.43) might solve a problem with the new robust POSIX mutex implementation of LMDB 0.9.17 on Solaris 10/11.x. On Solaris 11 slapd fails to open the database: -- mdb_db_open: database "x": dbenv_open(y). mdb_db_open: database "x" cannot be opened: z (N). Restore from backup! backend_startup_one (type=mdb, suffix="x"): bi_db_open failed! (N) slapd shutdown: initiated -- Values of 16 (EBUSY) and 22 (EINVAL) have been observed for N. The following notes can be found in pthread_mutexattr_setrobust(3C) ( https://docs.oracle.com/cd/E36784_01/html/E36874/pthread-mutexattr-setrobust-3c. html ): -- The mutex memory must be zeroed before first initialization of a mutex with the PTHREAD_MUTEX_ROBUST attribute. Any thread in any process interested in the robust lock can call pthread_mutex_init() to potentially initialize it, provided that all such callers of pthread_mutex_init() specify the same set of attributes in their attribute structures. In this situation, if pthread_mutex_init() is called on a pre- viously initialized robust mutex, it will not reinitialize the mutex and will return the error value EBUSY. If pthread_mutex_init() is called on a previously initialized robust mutex, and if the caller specifies a different set of attributes from those already in effect for the mutex, it will not reinitialize the mutex and will return the error value EINVAL. -- Solaris 10 only provides the *_np functions (2.4.43 build/compilation fails altogether) but the above meaning of EBUSY/EINVAL applies as well. This patch has only be tested on Solaris 10/11.2 with Solaris Studio 12.3.
Hi I'm fully aware of the fact that Solaris is not the most widely-used platform on earth... Is there any chance to get this patch evaluated? I'm glad to help with whatever is needed. Thank you Rolf On 12/20/15 02:54 AM, openldap-its@OpenLDAP.org wrote: > *** THIS IS AN AUTOMATICALLY GENERATED REPLY *** > > Thanks for your report to the OpenLDAP Issue Tracking System. Your > report has been assigned the tracking number ITS#8339. > > One of our support engineers will look at your report in due course. > Note that this may take some time because our support engineers > are volunteers. They only work on OpenLDAP when they have spare > time. > > If you need to provide additional information in regards to your > issue report, you may do so by replying to this message. Note that > any mail sent to openldap-its@openldap.org with (ITS#8339) > in the subject will automatically be attached to the issue report. > > mailto:openldap-its@openldap.org?subject=(ITS#8339) > > You may follow the progress of this report by loading the following > URL in a web browser: > http://www.OpenLDAP.org/its/index.cgi?findid=8339 > > Please remember to retain your issue tracking number (ITS#8339) > on any further messages you send to us regarding this report. If > you don't then you'll just waste our time and yours because we > won't be able to properly track the report. > > Please note that the Issue Tracking System is not intended to > be used to seek help in the proper use of OpenLDAP Software. > Such requests will be closed. > > OpenLDAP Software is user supported. > http://www.OpenLDAP.org/support/ > > -------------- > Copyright 1998-2007 The OpenLDAP Foundation, All Rights Reserved. >
openldap@gulag.ch wrote: > Hi > > I'm fully aware of the fact that Solaris is not the most widely-used > platform on earth... Is there any chance to get this patch evaluated? > I'm glad to help with whatever is needed. My last SPARC/Solaris dev machine was decommissioned sometime last year so I haven't had any system available to test this on. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
On 06/02/16 01:48 AM, Howard Chu wrote: > My last SPARC/Solaris dev machine was decommissioned sometime last year > so I haven't had any system available to test this on. Hi Howard Thanks for the reply. Maybe I can help out and setup two systems for you (accessible via Internet): -Solaris 11.3 Intel with Solaris Studio 12.3/4 -Solaris 10 Intel with Solaris Studio 12.3/4 We already use this patch (applied to LMDB 0.9.18) with OpenLDAP 2.4.44 and as part of our application in production on 11.3/Intel and 10/Intel+Sparc. So I could do some further testing for you on 10/Sparc (Solaris Studio 12.3/4 as well) if required. Does this sound feasible? Thank you Rolf
Rolf, please try this instead, like the manpage you referred to says under Notes. Do not catch EBUSY/EINVAL, they do mean the mutex was not (re)inited. diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c index 0209c09..5ae93bc 100644 --- a/libraries/liblmdb/mdb.c +++ b/libraries/liblmdb/mdb.c @@ -304,7 +304,7 @@ union semun { # else # define MDB_USE_ROBUST 1 /* glibc < 2.12 only provided _np API */ -# if defined(__GLIBC__) && GLIBC_VER < 0x02000c +# if (defined(__GLIBC__) && GLIBC_VER < 0x02000c) || defined(__SunOS_5_10) # define PTHREAD_MUTEX_ROBUST PTHREAD_MUTEX_ROBUST_NP # define pthread_mutexattr_setrobust(attr, flag) pthread_mutexattr_setrobust_np(attr, flag) # define pthread_mutex_consistent(mutex) pthread_mutex_consistent_np(mutex) @@ -4963,6 +4963,13 @@ mdb_env_setup_locks(MDB_env *env, char *lpath, int mode, int *excl) #else /* MDB_USE_POSIX_MUTEX: */ pthread_mutexattr_t mattr; + /* Solaris needs this before initing a robust mutex. Otherwise + * it may skip the init and return EBUSY "seems someone already + * inited" or EINVAL "it was inited differently". + */ + memset(env->me_txns->mti_rmutex, 0, sizeof(*env->me_txns->mti_rmutex)); + memset(env->me_txns->mti_wmutex, 0, sizeof(*env->me_txns->mti_wmutex)); + if ((rc = pthread_mutexattr_init(&mattr)) || (rc = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED)) #ifdef MDB_ROBUST_SUPPORTED
Hi Hallvard Thanks for the hint. Your are perfectly right, the mutex memory has to be zeroed in case of a robust mutex. Applied against 0.9.18 and it seems to work. My proposed patch is broken. There is a slight inconsistency if subsequent calls after a successfull pthread_mutexattr_init() fail (in the ORed if statement). This might result in a memory leak as the mutex attributes are not destroyed. But this is not related to the topic of this ITS. Thank you Rolf On 06/02/16 07:56 AM, Hallvard Breien Furuseth wrote: > Rolf, please try this instead, like the manpage you referred to > says under Notes. Do not catch EBUSY/EINVAL, they do mean the > mutex was not (re)inited. > > diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c > index 0209c09..5ae93bc 100644 > --- a/libraries/liblmdb/mdb.c > +++ b/libraries/liblmdb/mdb.c > @@ -304,7 +304,7 @@ union semun { > # else > # define MDB_USE_ROBUST 1 > /* glibc < 2.12 only provided _np API */ > -# if defined(__GLIBC__) && GLIBC_VER < 0x02000c > +# if (defined(__GLIBC__) && GLIBC_VER < 0x02000c) || defined(__SunOS_5_10) > # define PTHREAD_MUTEX_ROBUST PTHREAD_MUTEX_ROBUST_NP > # define pthread_mutexattr_setrobust(attr, flag) pthread_mutexattr_setrobust_np(attr, flag) > # define pthread_mutex_consistent(mutex) pthread_mutex_consistent_np(mutex) > @@ -4963,6 +4963,13 @@ mdb_env_setup_locks(MDB_env *env, char *lpath, int mode, int *excl) > #else /* MDB_USE_POSIX_MUTEX: */ > pthread_mutexattr_t mattr; > > + /* Solaris needs this before initing a robust mutex. Otherwise > + * it may skip the init and return EBUSY "seems someone already > + * inited" or EINVAL "it was inited differently". > + */ > + memset(env->me_txns->mti_rmutex, 0, sizeof(*env->me_txns->mti_rmutex)); > + memset(env->me_txns->mti_wmutex, 0, sizeof(*env->me_txns->mti_wmutex)); > + > if ((rc = pthread_mutexattr_init(&mattr)) > || (rc = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED)) > #ifdef MDB_ROBUST_SUPPORTED >
openldap@gulag.ch wrote: > Hi Hallvard > > Thanks for the hint. Your are perfectly right, the mutex memory has to > be zeroed in case of a robust mutex. Applied against 0.9.18 and it seems > to work. My proposed patch is broken. I've pushed a tweaked version of Hallvard's patch. While the Studio compiler defines __SunOS_5_10 it appears that gcc doesn't, so we took a different detection route. > > There is a slight inconsistency if subsequent calls after a successfull > pthread_mutexattr_init() fail (in the ORed if statement). This might > result in a memory leak as the mutex attributes are not destroyed. But > this is not related to the topic of this ITS. Feel free to submit another ITS for that, thanks. > > Thank you > > Rolf > > On 06/02/16 07:56 AM, Hallvard Breien Furuseth wrote: >> Rolf, please try this instead, like the manpage you referred to >> says under Notes. Do not catch EBUSY/EINVAL, they do mean the >> mutex was not (re)inited. >> >> diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c >> index 0209c09..5ae93bc 100644 >> --- a/libraries/liblmdb/mdb.c >> +++ b/libraries/liblmdb/mdb.c >> @@ -304,7 +304,7 @@ union semun { >> # else >> # define MDB_USE_ROBUST 1 >> /* glibc < 2.12 only provided _np API */ >> -# if defined(__GLIBC__) && GLIBC_VER < 0x02000c >> +# if (defined(__GLIBC__) && GLIBC_VER < 0x02000c) || defined(__SunOS_5_10) >> # define PTHREAD_MUTEX_ROBUST PTHREAD_MUTEX_ROBUST_NP >> # define pthread_mutexattr_setrobust(attr, flag) pthread_mutexattr_setrobust_np(attr, flag) >> # define pthread_mutex_consistent(mutex) pthread_mutex_consistent_np(mutex) >> @@ -4963,6 +4963,13 @@ mdb_env_setup_locks(MDB_env *env, char *lpath, int mode, int *excl) >> #else /* MDB_USE_POSIX_MUTEX: */ >> pthread_mutexattr_t mattr; >> >> + /* Solaris needs this before initing a robust mutex. Otherwise >> + * it may skip the init and return EBUSY "seems someone already >> + * inited" or EINVAL "it was inited differently". >> + */ >> + memset(env->me_txns->mti_rmutex, 0, sizeof(*env->me_txns->mti_rmutex)); >> + memset(env->me_txns->mti_wmutex, 0, sizeof(*env->me_txns->mti_wmutex)); >> + >> if ((rc = pthread_mutexattr_init(&mattr)) >> || (rc = pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED)) >> #ifdef MDB_ROBUST_SUPPORTED >> > > > > > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
openldap@gulag.ch wrote: > On 06/02/16 01:48 AM, Howard Chu wrote: >> My last SPARC/Solaris dev machine was decommissioned sometime last year >> so I haven't had any system available to test this on. > > Hi Howard > > Thanks for the reply. Maybe I can help out and setup two systems for you > (accessible via Internet): > > -Solaris 11.3 Intel with Solaris Studio 12.3/4 > -Solaris 10 Intel with Solaris Studio 12.3/4 > > We already use this patch (applied to LMDB 0.9.18) with OpenLDAP 2.4.44 and as > part of our application in production on 11.3/Intel and 10/Intel+Sparc. So I > could do some further testing for you on 10/Sparc (Solaris Studio 12.3/4 as > well) if required. > > Does this sound feasible? Thanks for the offer, perhaps in the future. I just now acquired access to a Solaris VPS so we can handle this issue for now. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
On 06/02/16 10:15 PM, Howard Chu wrote: > I've pushed a tweaked version of Hallvard's patch. While the Studio > compiler defines __SunOS_5_10 it appears that gcc doesn't, so we took a > different detection route. Perfect and the detection clearly works. Thanks for reviewing this ITS. --- pthread.h S11 --- /* * Mutex robust attribute values. * Keep these in synch with sys/synch.h lock types. */ #define PTHREAD_MUTEX_STALLED 0x0 #define PTHREAD_MUTEX_ROBUST 0x40 /* * Historical solaris-specific names, * from before pthread_mutexattr_getrobust() became standardized */ #define PTHREAD_MUTEX_STALL_NP PTHREAD_MUTEX_STALLED #define PTHREAD_MUTEX_ROBUST_NP PTHREAD_MUTEX_ROBUST --------------------- > uname -r 5.11 > nm liblmdb.so |grep setrobust [161] | 0| 0|FUNC |GLOB |0 |UNDEF |pthread_mutexattr_setrobust --- pthread.h S10 --- /* * Mutex robustness attribute values. The robustness attribute is a * Solaris specific extension to support robust mutexes. Note the _NP suffix * to indicate these are not part of the current POSIX spec (POSIX 1003.1 1996), * but are platform specific non-portable extensions. Keep these in synch * with sys/synch.h lock types. */ #define PTHREAD_MUTEX_STALL_NP 0x0 #define PTHREAD_MUTEX_ROBUST_NP 0x40 --------------------- > uname -r 5.10 > nm liblmdb.so |grep setrobust [245] | 0| 0|FUNC |GLOB |0 |UNDEF |pthread_mutexattr_setrobust_np
changed notes changed state Open to Release
On 06/02/16 10:15 PM, Howard Chu wrote: >> >> There is a slight inconsistency if subsequent calls after a successfull >> pthread_mutexattr_init() fail (in the ORed if statement). This might >> result in a memory leak as the mutex attributes are not destroyed. But >> this is not related to the topic of this ITS. > > Feel free to submit another ITS for that, thanks. Has just been fixed by Hallvard: http://www.openldap.org/devel/gitweb.cgi?p=openldap.git;a=commitdiff;h=c4c7833d245fdc4b2ea4a1a459d7069ce3af87f5 Thanks!
fixed in master fixed in 0.9.19
changed notes changed state Release to Closed moved from Incoming to Software Bugs