[Date Prev][Date Next] [Chronological] [Thread] [Top]

(ITS#8579) LMDB: me_fd should be O_CLOEXEC on POSIX systems



Full_Name: Lorenz Bauer
Version: 
OS: Linux
URL: 
Submission from: (NULL) (2a06:98c0:1000:8200:18fd:e772:495e:6163)


This is in some ways a follow up to ITS#8505. In the issue I argued that LMDB
did not deal well with fork/exec. Hallvard committed a fix in
04acac634a7b276332e2. I think the fix still has one major weakness: me_fd does
not have O_CLOEXEC set.

Now, Hallvard omitted this change on purpose. mdb_env_get_fd() exposes me_fd to
the user, so there could conceivably be client code out there which expects the
fd to persist across exec(). (Please correct my if I misunderstood your
argument.)

I believe that the correct action is to set O_CLOEXEC on me_fd by default, and
break such client code.

1. Using mdb_env_get_fd this way is Undefined Behaviour (according to the doc)

Before ITS#8505, the documentation made it sound like ANY use after fork() /
exec() is disallowed. At the same time, mdb_env_get_fd made no mention /
promises wrt exec() behaviour.

Adopting O_CLOEXEC in my opinion does not violate the API set out so far. We
can, in the opposite, clarify what is acceptable use (as Hallvard already has
done).

2. mdb_env_get_fd has inconsistent semantics across OSs

When creating file handles on Windows we pass NULL as the lpSecurityAttributes
argument to CreateFileW:

    If this parameter is NULL, the handle returned by CreateFile cannot be
    inherited by any child processes the application may create [...]

me_fd already has O_CLOEXEC on Windows.

3. Manually closing me_fd after fork() is not possible in all environments

Hallvard suggests calling fork(), close(mdb_env_get_fd()) and then exec() for
users which do not want to leak me_fd to child processes.

We use LMDB from the Go platform, which does not expose fork and exec separately
(due to the well-known pitfalls they have). We can't use Hallvard's suggestion
therefore.

4. me_fd can still be passed around by dup()ing the handle

If there are legitimate uses of passing the handle around, users can explicitly
dup the handle and then pass it on to child processes.

I hope my arguments are convincing enough to make the change before the next
release. The benefits are

* Consistent semantics between OS
* (In my view) saner defaults
* and a solution which works well across environments.

The downside is that we might break unknown client code. I can'c come up with a
use case of passing me_fd to children, so I suspect the risk is low.

The patch itself is trivial, and I'm happy to contribute it if desired.

Lorenz