Issue 8106 - Retry pwrite operations in case of EINTR
Summary: Retry pwrite operations in case of EINTR
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-17 08:43 UTC by milinevskyy@gmail.com
Modified: 2015-07-02 17:50 UTC (History)
0 users

See Also:


Attachments
0001-libmdb-retry-write-operation-in-case-of-interrupted-.patch (2.31 KB, patch)
2015-04-17 08:46 UTC, milinevskyy@gmail.com
Details

Note You need to log in before you can comment on or make changes to this issue.
Description milinevskyy@gmail.com 2015-04-17 08:43:22 UTC
Full_Name: Dmytro Milinevskyy
Version: 
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (92.132.131.166)


On linux systems it's common to have system calls interrupted and restarted. 
LMDB should retry pwrites to avoid "false" failures.


Comment 1 milinevskyy@gmail.com 2015-04-17 08:46:03 UTC
Proposed patch in attachment.

On Fri, Apr 17, 2015 at 10:43 AM, <openldap-its@openldap.org> wrote:

>
> *** THIS IS AN AUTOMATICALLY GENERATED REPLY ***
>
> Thanks for your report to the OpenLDAP Issue Tracking System.  Your
> report has been assigned the tracking number ITS#8106.
>
> One of our support engineers will look at your report in due course.
> Note that this may take some time because our support engineers
> are volunteers.  They only work on OpenLDAP when they have spare
> time.
>
> If you need to provide additional information in regards to your
> issue report, you may do so by replying to this message.  Note that
> any mail sent to openldap-its@openldap.org with (ITS#8106)
> in the subject will automatically be attached to the issue report.
>
>         mailto:openldap-its@openldap.org?subject=(ITS#8106)
>
> You may follow the progress of this report by loading the following
> URL in a web browser:
>     http://www.OpenLDAP.org/its/index.cgi?findid=8106
>
> Please remember to retain your issue tracking number (ITS#8106)
> on any further messages you send to us regarding this report.  If
> you don't then you'll just waste our time and yours because we
> won't be able to properly track the report.
>
> Please note that the Issue Tracking System is not intended to
> be used to seek help in the proper use of OpenLDAP Software.
> Such requests will be closed.
>
> OpenLDAP Software is user supported.
>         http://www.OpenLDAP.org/support/
>
> --------------
> Copyright 1998-2007 The OpenLDAP Foundation, All Rights Reserved.
>
>
Comment 2 milinevskyy@gmail.com 2015-04-17 08:47:55 UTC
Proposed patch in message body as issue tracker doesn't recognizes
attachments:

commit c9857a7c0df4448a14a2aa9f55850df5e830069e
Author: Dmytro Milinevskyy <milinevskyy@gmail.com>
Date:   Fri Apr 17 10:41:26 2015 +0200

    libmdb: retry write operation in case of interrupted system

diff --git a/libraries/liblmdb/mdb.c b/libraries/liblmdb/mdb.c
index a95c7cd..ed7f370 100644
--- a/libraries/liblmdb/mdb.c
+++ b/libraries/liblmdb/mdb.c
@@ -3236,6 +3236,7 @@ mdb_page_flush(MDB_txn *txn, int keep)
                /* Write up to MDB_COMMIT_PAGES dirty pages at a time. */
                if (pos!=next_pos || n==MDB_COMMIT_PAGES ||
wsize+size>MAX_WRITE) {
                        if (n) {
+              retry_write:
                                /* Write previous page(s) */
 #ifdef MDB_USE_PWRITEV
                                wres = pwritev(env->me_fd, iov, n, wpos);
@@ -3243,8 +3244,10 @@ mdb_page_flush(MDB_txn *txn, int keep)
                                if (n == 1) {
                                        wres = pwrite(env->me_fd,
iov[0].iov_base, wsize, wpos);
                                } else {
+                                 retry_seek:
                                        if (lseek(env->me_fd, wpos,
SEEK_SET) == -1) {
                                                rc = ErrCode();
+                                               if (EINTR == rc) goto
retry_seek;
                                                DPRINTF(("lseek: %s",
strerror(rc)));
                                                return rc;
                                        }
@@ -3254,6 +3257,7 @@ mdb_page_flush(MDB_txn *txn, int keep)
                                if (wres != wsize) {
                                        if (wres < 0) {
                                                rc = ErrCode();
+                                               if (EINTR == rc) goto
retry_write;
                                                DPRINTF(("Write error: %s",
strerror(rc)));
                                        } else {
                                                rc = EIO; /* TODO: Use
which error code? */
@@ -3627,7 +3631,8 @@ mdb_env_init_meta(MDB_env *env, MDB_meta *meta)
        int len;
 #define DO_PWRITE(rc, fd, ptr, size, len, pos) do { \
        len = pwrite(fd, ptr, size, pos);       \
-       rc = (len >= 0); } while(0)
+       if (len == -1 && EINVAL == ErrCode()) continue; \
+       rc = (len >= 0); break; } while(1)
 #endif

        DPUTS("writing new meta page");
@@ -3735,6 +3740,7 @@ mdb_env_write_meta(MDB_txn *txn)

        /* Write to the SYNC fd */
        mfd = (flags & (MDB_NOSYNC|MDB_NOMETASYNC)) ? env->me_fd :
env->me_mfd;
+retry_write:
 #ifdef _WIN32
        {
                memset(&ov, 0, sizeof(ov));
@@ -3747,6 +3753,7 @@ mdb_env_write_meta(MDB_txn *txn)
 #endif
        if (rc != len) {
                rc = rc < 0 ? ErrCode() : EIO;
+               if (EINTR == rc) goto retry_write;
                DPUTS("write failed, disk error?");
                /* On a failure, the pagecache still contains the new data.
                 * Write some old data back, to prevent it from being used.

On Fri, Apr 17, 2015 at 10:46 AM, Dmytro Milinevskyy <milinevskyy@gmail.com>
wrote:

> Proposed patch in attachment.
>
> On Fri, Apr 17, 2015 at 10:43 AM, <openldap-its@openldap.org> wrote:
>
>>
>> *** THIS IS AN AUTOMATICALLY GENERATED REPLY ***
>>
>> Thanks for your report to the OpenLDAP Issue Tracking System.  Your
>> report has been assigned the tracking number ITS#8106.
>>
>> One of our support engineers will look at your report in due course.
>> Note that this may take some time because our support engineers
>> are volunteers.  They only work on OpenLDAP when they have spare
>> time.
>>
>> If you need to provide additional information in regards to your
>> issue report, you may do so by replying to this message.  Note that
>> any mail sent to openldap-its@openldap.org with (ITS#8106)
>> in the subject will automatically be attached to the issue report.
>>
>>         mailto:openldap-its@openldap.org?subject=(ITS#8106)
>>
>> You may follow the progress of this report by loading the following
>> URL in a web browser:
>>     http://www.OpenLDAP.org/its/index.cgi?findid=8106
>>
>> Please remember to retain your issue tracking number (ITS#8106)
>> on any further messages you send to us regarding this report.  If
>> you don't then you'll just waste our time and yours because we
>> won't be able to properly track the report.
>>
>> Please note that the Issue Tracking System is not intended to
>> be used to seek help in the proper use of OpenLDAP Software.
>> Such requests will be closed.
>>
>> OpenLDAP Software is user supported.
>>         http://www.OpenLDAP.org/support/
>>
>> --------------
>> Copyright 1998-2007 The OpenLDAP Foundation, All Rights Reserved.
>>
>>
>
Comment 3 Howard Chu 2015-04-17 17:34:55 UTC
milinevskyy@gmail.com wrote:
> Full_Name: Dmytro Milinevskyy
> Version:
> OS: Linux
> URL: ftp://ftp.openldap.org/incoming/
> Submission from: (NULL) (92.132.131.166)
>
>
> On linux systems it's common to have system calls interrupted and restarted.
> LMDB should retry pwrites to avoid "false" failures.

In practice, writes to a file are "fast" and will always write at least 
one byte. But we've patched this as suggested, thanks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 4 Howard Chu 2015-04-17 17:39:02 UTC
changed notes
changed state Open to Test
moved from Incoming to Software Bugs
Comment 5 Howard Chu 2015-04-17 17:39:17 UTC
changed notes
Comment 6 milinevskyy@gmail.com 2015-04-17 20:19:20 UTC
Hello,
we've encountered this problem mainly on Android device. The kernel sends a
fake signal and the baking storage(flash) is not so fast.
I've noticed a typo in my patch for the macro DO_PWRITE in
mdb_env_init_meta. The test should be done for EINTR, not EINVAL.
Do you want me to resend the patch?

Thanks,
Dmytro

On Fri, Apr 17, 2015 at 7:34 PM, Howard Chu <hyc@symas.com> wrote:

> milinevskyy@gmail.com wrote:
>
>> Full_Name: Dmytro Milinevskyy
>> Version:
>> OS: Linux
>> URL: ftp://ftp.openldap.org/incoming/
>> Submission from: (NULL) (92.132.131.166)
>>
>>
>> On linux systems it's common to have system calls interrupted and
>> restarted.
>> LMDB should retry pwrites to avoid "false" failures.
>>
>
> In practice, writes to a file are "fast" and will always write at least
> one byte. But we've patched this as suggested, thanks.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
Comment 7 Howard Chu 2015-04-18 03:37:26 UTC
Dmytro Milinevskyy wrote:
> Hello,
> we've encountered this problem mainly on Android device. The kernel
> sends a fake signal and the baking storage(flash) is not so fast.
> I've noticed a typo in my patch for the macro DO_PWRITE in
> mdb_env_init_meta. The test should be done for EINTR, not EINVAL.
> Do you want me to resend the patch?

Yes I already noticed that, so I wrote the patch myself instead of 
applying yours.

Thank you for the context, that's good information.
>
> Thanks,
> Dmytro
>
> On Fri, Apr 17, 2015 at 7:34 PM, Howard Chu <hyc@symas.com
> <mailto:hyc@symas.com>> wrote:
>
>     milinevskyy@gmail.com <mailto:milinevskyy@gmail.com> wrote:
>
>         Full_Name: Dmytro Milinevskyy
>         Version:
>         OS: Linux
>         URL: ftp://ftp.openldap.org/incoming/
>         Submission from: (NULL) (92.132.131.166)
>
>
>         On linux systems it's common to have system calls interrupted
>         and restarted.
>         LMDB should retry pwrites to avoid "false" failures.
>
>
>     In practice, writes to a file are "fast" and will always write at
>     least one byte. But we've patched this as suggested, thanks.
>
>     --
>        -- Howard Chu
>        CTO, Symas Corp. http://www.symas.com
>        Director, Highland Sun http://highlandsun.com/hyc/
>        Chief Architect, OpenLDAP http://www.openldap.org/project/
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 8 Quanah Gibson-Mount 2015-04-23 17:50:34 UTC
changed notes
changed state Test to Release
Comment 9 OpenLDAP project 2015-07-02 17:50:29 UTC
fixed in mdb.master
fixed in mdb.0.9
Comment 10 Quanah Gibson-Mount 2015-07-02 17:50:29 UTC
changed notes
changed state Release to Closed