Issue 8504 - LDMB: Return EPIPE from mdb_env_copyfd2 instead abort on SIGPIPE
Summary: LDMB: Return EPIPE from mdb_env_copyfd2 instead abort on SIGPIPE
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-23 22:36 UTC by lmb@cloudflare.com
Modified: 2018-02-09 18:58 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description lmb@cloudflare.com 2016-09-23 22:36:09 UTC
Full_Name: Lorenz Bauer
Version: 
OS: 
URL: https://gist.github.com/lmb/63189f83c5f00dd59c86f3c9bc07694d
Submission from: (NULL) (2606:4700:1000:8200:5530:fc35:2f95:4f31)


As per the discussion in [1] I'm providing a patch to block and handle SIGPIPE
in the copy thread used when performing a compacting copy.

The attached file is derived from OpenLDAP Software. All of the modifications to
OpenLDAP Software represented in the following patch(es) were developed by
CloudFlare, Inc. CloudFlare, Inc. has not assigned rights and/or interest in
this work to any party. I, Lorenz Bauer am authorized by CloudFlare, Inc., my
employer, to release this work under the following terms.
CloudFlare, Inc. hereby place the following modifications to OpenLDAP Software
(and only these modifications) into the public domain. Hence, these
modifications may be freely used and/or redistributed for any purpose with or
without attribution and/or other notice.

1: http://www.openldap.org/lists/openldap-technical/201609/msg00026.html
Comment 1 lmb@cloudflare.com 2016-09-25 18:16:42 UTC
I've updated the patch to be much simpler:

* Set pthread_sigmask in copythr and return an error in my->mc_error
* Do not check for pending SIGPIPE: since this is a new thread there can't
be any

The URL stays the same.
Comment 2 Hallvard Furuseth 2016-09-27 03:48:38 UTC
On 25/09/16 20:16, lmb@cloudflare.com wrote:
> * Do not check for pending SIGPIPE: since this is a new thread there can't
> be any

I think we can skip sigwait() too since the thread will exit when done.
Then any pending signals are presumably dropped.  (I suppose some OS
could give them to another thread, but that goes against delivering
them to a particular thread like SIGPIPE is.)

Comment 3 lmb@cloudflare.com 2016-09-27 20:14:38 UTC
On 26 September 2016 at 20:48, Hallvard Breien Furuseth
<h.b.furuseth@usit.uio.no> wrote:
> I think we can skip sigwait() too since the thread will exit when done.
> Then any pending signals are presumably dropped.  (I suppose some OS
> could give them to another thread, but that goes against delivering
> them to a particular thread like SIGPIPE is.)

That's what my initial patch did. At least on OS X this leads to the
process receiving SIGPIPE and dying. I can attach a test case as well
if desired.

Comment 4 Hallvard Furuseth 2016-10-20 08:08:28 UTC
On 27/09/16 22:14, lmb@cloudflare.com wrote:
>On 26 September 2016 at 20:48, Hallvard Breien Furuseth <h.b.furuseth@usit.uio.no> wrote:
>> I think we can skip sigwait() too since the thread will exit when done.
>> (...)
>
> That's what my initial patch did. At least on OS X this leads to the
> process receiving SIGPIPE and dying.

OK.  I've put a fixed version including that comment in branch
"mdb/its8504-sigpipe" @ <git://git.uio.no/u/hbf/openldap.git>.
We can squash the commits later:

Never clear mc_error, we'd lose failure in the other thread.
It's not mutex-protected, which is OK: If both threads set it,
we'll get one of the errors. LMDB already expects atomic int.

And.. duh, we've forgotten mdb_copy without MDB_CP_COMPACT.
I'll do it later if nobody beats me to it.
Test:  Comment out SIGPIPE in mdb_copy.c, then run

   bash$ (sleep 1; ./mdb_copy testdb; echo exit $? >&2) | true
   exit 141

Exit values >= 128 are aborts from signals.

Comment 5 Hallvard Furuseth 2016-10-20 08:35:59 UTC
I wrote:
> And.. duh, we've forgotten mdb_copy without MDB_CP_COMPACT.
> I'll do it later if nobody beats me to it.

Hmm.  Nevermind, it's probably better to leave that to the user.
It gets ugly for a library to meddle with the current thread's
signals.  I.e. it must check if SIGPIPE was already pending
before the call, and don't collect it on EPIPE in that case.

-- 
Hallvard

Comment 6 lmb@cloudflare.com 2016-10-20 08:56:59 UTC
On 20 October 2016 at 09:35, Hallvard Breien Furuseth <
h.b.furuseth@usit.uio.no> wrote:
>
> Hmm.  Nevermind, it's probably better to leave that to the user.
> It gets ugly for a library to meddle with the current thread's
> signals.  I.e. it must check if SIGPIPE was already pending
> before the call, and don't collect it on EPIPE in that case.
>

Yeah, I agree. Your changes LGTM.

Interesting point re atomic int write, I wasn't entirely sure how mc_error
could be accesses, so I decided to play it safe.

-- 
Lorenz Bauer  |  Systems Engineer
25 Lavington St., London SE1 0NZ

www.cloudflare.com
Comment 7 lmb@cloudflare.com 2016-12-07 13:06:29 UTC
Is there anything required from me to make this patch move along?

Comment 8 Howard Chu 2016-12-21 12:47:27 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 9 Hallvard Furuseth 2017-06-07 12:35:38 UTC
On 06. juni 2017 21:59, quanah@openldap.org wrote:
> /usr/include/signal.h:233:12: note: declared here
>   extern int sigwait(sigset_t *);
>              ^

POSIX says it has two arguments:
   http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigwait.html
Is there a two-argument version of sigwait in the Solaris headers somewhere,
maybe enabled by #define POSIX_C_SOURCE 1 or something like that?

-- 
Hallvard

Comment 10 Quanah Gibson-Mount 2017-06-07 15:11:23 UTC
--On Wednesday, June 07, 2017 3:35 PM +0200 Hallvard Breien Furuseth 
<h.b.furuseth@usit.uio.no> wrote:

> On 06. juni 2017 21:59, quanah@openldap.org wrote:
>> /usr/include/signal.h:233:12: note: declared here
>>   extern int sigwait(sigset_t *);
>>              ^
>
> POSIX says it has two arguments:
>    http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigwait.html
> Is there a two-argument version of sigwait in the Solaris headers
> somewhere,
> maybe enabled by #define POSIX_C_SOURCE 1 or something like that?

It's insanely long....


/*
 * sigwait() prototype is defined here.
 */

#if     defined(__EXTENSIONS__) || (!defined(_STRICT_STDC) && \
        !defined(__XOPEN_OR_POSIX)) || (_POSIX_C_SOURCE - 0 >= 199506L) || \
        defined(_POSIX_PTHREAD_SEMANTICS)

#if     defined(__STDC__)

#if     (_POSIX_C_SOURCE - 0 >= 199506L) || 
defined(_POSIX_PTHREAD_SEMANTICS)

#ifdef __PRAGMA_REDEFINE_EXTNAME
#pragma redefine_extname sigwait __posix_sigwait
extern int sigwait(const sigset_t *_RESTRICT_KYWD, int *_RESTRICT_KYWD);
#else  /* __PRAGMA_REDEFINE_EXTNAME */

extern int __posix_sigwait(const sigset_t *_RESTRICT_KYWD,
    int *_RESTRICT_KYWD);

#ifdef  __lint
#define sigwait __posix_sigwait
#else   /* !__lint */

static int
sigwait(const sigset_t *_RESTRICT_KYWD __setp, int *_RESTRICT_KYWD __signo)
{
        return (__posix_sigwait(__setp, __signo));
}

#endif /* !__lint */
#endif /* __PRAGMA_REDEFINE_EXTNAME */

#else  /* (_POSIX_C_SOURCE - 0 >= 199506L) || ... */

extern int sigwait(sigset_t *);

#endif  /* (_POSIX_C_SOURCE - 0 >= 199506L) || ... */


#else  /* __STDC__ */


#if     (_POSIX_C_SOURCE - 0 >= 199506L) || 
defined(_POSIX_PTHREAD_SEMANTICS)

#ifdef __PRAGMA_REDEFINE_EXTNAME
#pragma redefine_extname sigwait __posix_sigwait
extern int sigwait();
#else  /* __PRAGMA_REDEFINE_EXTNAME */

extern int __posix_sigwait();

#ifdef  __lint
#define sigwait __posix_sigwait
#else   /* !__lint */

static int
sigwait(__setp, __signo)
        sigset_t *__setp;
        int *__signo;
{
        return (__posix_sigwait(__setp, __signo));
}

#endif /* !__lint */
#endif /* __PRAGMA_REDEFINE_EXTNAME */

#else  /* (_POSIX_C_SOURCE - 0 >= 199506L) || ... */

extern int sigwait();

#endif /* (_POSIX_C_SOURCE - 0 >= 199506L) || ... */

#endif /* __STDC__ */

#endif /* defined(__EXTENSIONS__) || (!defined(_STRICT_STDC) ... */



--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 11 Quanah Gibson-Mount 2017-06-07 20:39:31 UTC
--On Wednesday, June 07, 2017 4:11 PM +0000 quanah@symas.com wrote:

> --On Wednesday, June 07, 2017 3:35 PM +0200 Hallvard Breien Furuseth
> <h.b.furuseth@usit.uio.no> wrote:
>
>> On 06. juni 2017 21:59, quanah@openldap.org wrote:
>>> /usr/include/signal.h:233:12: note: declared here
>>>   extern int sigwait(sigset_t *);
>>>              ^
>>
>> POSIX says it has two arguments:
>>    http://pubs.opengroup.org/onlinepubs/9699919799/functions/sigwait.html
>> Is there a two-argument version of sigwait in the Solaris headers
>> somewhere,
>> maybe enabled by #define POSIX_C_SOURCE 1 or something like that?
>
> It's insanely long....

This patch fixed it on my box:

quanah@sol11-3:~/git/sold-2445/openldap$ git diff
diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index 8a62eff..1976340 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -113,6 +113,9 @@ typedef SSIZE_T     ssize_t;
 /* Most platforms have posix_memalign, older may only have memalign */
 #define HAVE_MEMALIGN  1
 #include <malloc.h>
+#if defined (__sun)
+#define _POSIX_PTHREAD_SEMANTICS       1
+#endif
 #endif

 #if !(defined(BYTE_ORDER) || defined(__BYTE_ORDER))


--Quanah

--

Quanah Gibson-Mount
Product Architect
Symas Corporation
Packaged, certified, and supported LDAP solutions powered by OpenLDAP:
<http://www.symas.com>


Comment 12 OpenLDAP project 2018-02-09 18:58:40 UTC
fixed in mdb.master
fixed in mdb.RE/0.9 (0.9.19)
Comment 13 Quanah Gibson-Mount 2018-02-09 18:58:40 UTC
changed notes
changed state Test to Closed