Full_Name: Salvador Ortiz Version: 24 OS: Linux URL: ftp://ftp.msg.com.mx/pub/varios/0001-In-mdb_env_copy-retry-open-if-O_DIRECT-fails.patch Submission from: (NULL) (187.162.45.111) If the OS defines O_DIRECT mdb_env_copy unconditionally uses it, but according to open(2): "Some file systems may not implement the flag and open() will fail with EINVAL if it is used." In this cases mdb_env_copy should retry without the flag. The attached patch implements that.
sog@msg.com.mx wrote: > Full_Name: Salvador Ortiz > Version: 24 > OS: Linux > URL: ftp://ftp.msg.com.mx/pub/varios/0001-In-mdb_env_copy-retry-open-if-O_DIRECT-fails.patch > Submission from: (NULL) (187.162.45.111) > > > If the OS defines O_DIRECT mdb_env_copy unconditionally uses it, but according > to open(2): "Some file systems may not implement the flag and open() will > fail with EINVAL if it is used." > > In this cases mdb_env_copy should retry without the flag. > > The attached patch implements that. > > Thanks, applied. -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
changed notes changed state Open to Test moved from Incoming to Software Bugs
On 09/05/2013 06:33 PM, Howard Chu wrote: > sog@msg.com.mx wrote: >> Full_Name: Salvador Ortiz >> Version: 24 >> OS: Linux >> URL: >> ftp://ftp.msg.com.mx/pub/varios/0001-In-mdb_env_copy-retry-open-if-O_DIRECT-fails.patch >> Submission from: (NULL) (187.162.45.111) >> >> >> If the OS defines O_DIRECT mdb_env_copy unconditionally uses it, but >> according >> to open(2): "Some file systems may not implement the flag and >> open() will >> fail with EINVAL if it is used." >> >> In this cases mdb_env_copy should retry without the flag. >> >> The attached patch implements that. >> >> > Thanks, applied. > Thank you Howard, but I found an unexpected problem with the patch as commited. If the call fails because O_DIRECT, in a "tmpfs" in my Fedora 19 '/tmp', the file is left created anyway! so the retry fails because O_EXCL|O_CREAT. I consider this undocumented behavior for "open", buggy or at least ugly. (A simple test program in ftp://ftp.msg.com.mx/pub/varios/o_direct.c) I propose the following: ======== --- a/liblmdb/mdb.c 2013-09-05 17:31:01.498090281 -0500 +++ b/liblmdb/mdb.c 2013-09-13 14:45:00.918717292 -0500 @@ -4250,9 +4250,16 @@ /* The OS supports O_DIRECT, try with it */ newfd = open(lpath, O_WRONLY|O_CREAT|O_EXCL|O_DIRECT, 0666); /* But open can fail if O_DIRECT isn't supported by the file system - * so retry without the flag + * so retry without the flag. + * + * Sept 13 2013. + * On a tmpfs, at least, an open with O_DIRECT fails, + * BUT the file is created anyway!! + * Fortunately seems that other errors are reported before EINVAL + * So, we need to remove it before retry open. */ if (newfd == INVALID_HANDLE_VALUE && ErrCode() == EINVAL) + (void)unlink(lpath) , #endif newfd = open(lpath, O_WRONLY|O_CREAT|O_EXCL, 0666); #endif ========= What do you think? Regards.
On 2013-09-13 23:09, sog@msg.com.mx wrote: > If the call fails because O_DIRECT, in a "tmpfs" in my Fedora 19 > '/tmp', > the file is left created anyway! so the retry fails because > O_EXCL|O_CREAT. Also on RHEL 6.4 (2.6.32-358.18.1.el6.x86_64) with /dev/shm/nosuch. I'll report it to RedHat and see what they say. > I propose the following: > (...) > + * Fortunately seems that other errors are reported before EINVAL > + * So, we need to remove it before retry open. > */ > if (newfd == INVALID_HANDLE_VALUE && ErrCode() == EINVAL) > + (void)unlink(lpath) , No unlink(). EINVAL does not say whether the file existed before the call. Not on another OS, nor on another Linux version or filesystem where Linux may work differently. Somewhat better: #ifdef <buggy Linux>, replace O_EXCL with O_TRUNC in the 2nd call. Or add paranoia like taking an exclusive lock, but ignore "locking not supported" errors. > #endif > newfd = open(lpath, O_WRONLY|O_CREAT|O_EXCL, 0666); > #endif -- Hallvard
changed notes changed state Test to Open
I wrote: > Also on RHEL 6.4 (2.6.32-358.18.1.el6.x86_64) with > /dev/shm/nosuch. I'll report it to RedHat and see what they say. https://bugzilla.redhat.com/show_bug.cgi?id=1008073 Simpler fix, at least on Linux: #ifdef O_DIRECT if ((rc = fcntl(fd, F_GETFL)) != -1) (void) fcntl(fd, F_SETFL, rc | O_DIRECT); #endif But that could fail elsewhere, if some OS only accepts O_DIRECT to open() and not to fcntl(). BTW, does the nearby Apple code need to fail if fcntl(newfd, F_NOCACHE, 1) fails? Could ignore errors. -- Hallvard
changed notes changed state Open to Partial
This was fixed in LMDB 0.9.9, except: If both O_DIRECT and F_NOCACHE are #defined, both are used. Looks a bit strange. Maybe just one should be used in that case. (Which?) Also I'm still wondering if env_open needs to fail if F_NOCACHE fails. Anyway, maybe it should be something like this. The "else {}" is in case O_DIRECT is an unsupported dummy definiton: #ifdef O_DIRECT if ((rc = fcntl(newfd, F_GETFL)) != -1) (void) fcntl(newfd, F_SETFL, rc | O_DIRECT); else #endif { #ifdef F_NOCACHE /* __APPLE__ */ (void) fcntl(newfd, F_NOCACHE, 1); #endif } -- Hallvard
I wrote: > still wondering if env_open needs to fail if F_NOCACHE fails. Oops - env_copy, not env_open. -- Hallvard
h.b.furuseth@usit.uio.no wrote: > This was fixed in LMDB 0.9.9, except: If both O_DIRECT and > F_NOCACHE are #defined, both are used. Looks a bit strange. > Maybe just one should be used in that case. (Which?) Also I'm > still wondering if env_open needs to fail if F_NOCACHE fails. > > Anyway, maybe it should be something like this. The "else {}" > is in case O_DIRECT is an unsupported dummy definiton: Isn't that in the wrong order then? What you suggest here will use O_DIRECT even if it's a dummy. I don't see any reason to fail the operation. They just want to make a copy/backup; even if the optimization of avoiding pollution of the buffer cache can't be done, they still want the copy to happen. > #ifdef O_DIRECT > if ((rc = fcntl(newfd, F_GETFL)) != -1) > (void) fcntl(newfd, F_SETFL, rc | O_DIRECT); > else > #endif > { > #ifdef F_NOCACHE /* __APPLE__ */ > (void) fcntl(newfd, F_NOCACHE, 1); > #endif > } > -- -- Howard Chu CTO, Symas Corp. http://www.symas.com Director, Highland Sun http://highlandsun.com/hyc/ Chief Architect, OpenLDAP http://www.openldap.org/project/
Sorry, looks like I dropped this message. Yes, #ifdef F_NOCACHE / #elif defined O_DIRECT / #endif seems better. Included in branch "mdb/fopen" in my repo at UiO. On 24. des. 2013 18:18, Howard Chu wrote: > h.b.furuseth@usit.uio.no wrote: >> This was fixed in LMDB 0.9.9, except: If both O_DIRECT and >> F_NOCACHE are #defined, both are used. Looks a bit strange. >> Maybe just one should be used in that case. (Which?) Also I'm >> still wondering if env_open needs to fail if F_NOCACHE fails. >> >> Anyway, maybe it should be something like this. The "else {}" >> is in case O_DIRECT is an unsupported dummy definiton: > > Isn't that in the wrong order then? What you suggest here will use O_DIRECT > even if it's a dummy. > > I don't see any reason to fail the operation. They just want to make a > copy/backup; even if the optimization of avoiding pollution of the buffer > cache can't be done, they still want the copy to happen. > >> #ifdef O_DIRECT >> if ((rc = fcntl(newfd, F_GETFL)) != -1) >> (void) fcntl(newfd, F_SETFL, rc | O_DIRECT); >> else >> #endif >> { >> #ifdef F_NOCACHE /* __APPLE__ */ >> (void) fcntl(newfd, F_NOCACHE, 1); >> #endif >> } -- Hallvard
changed notes changed state Partial to Test
Fixed in mdb.master Fixed in 0.9.19
changed notes changed state Test to Closed