[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#8844) LMDB mdb_env_close is unsafe in forked child



On 14/06/2018 20:34, Howard Chu wrote:
> Hallvard Breien Furuseth wrote:
>> Wait, what?  The child didn't write any pids to the table.  If
>> there are pids for it to clear, something is wrong. That can be:
> 
> No, there most likely aren't PIDs for it to clear. But most importantly, 
> it must not clear using env->me_pid since that is still the parent's PID.
> 
>> - The pid is from an older run, and the system reused the pid.  If we
>>    code for this, it should be to clear such pids during mdb_env_open.
>>    It gets very arbitrary to clear them when the user does env_close()
>>    in the child and the child happens to have the that reused pid.
> 
> No. If the system reused the PID then that means the previous owner of 
> that PID is dead already anyway.

Right...

> If there are any table entries under 
> that PID there's no problem removing them.

The problem is that it's arbitrary.  Something is exiting without
clearing its pid slots, and 0.001% of the time this code will catch it.

Better to skip this cleanup, or to always do it - but "always" belongs
in env_open().   If a new env is seeing its pid in use, then it can 
safely clean it out. Last we talked about that, I liked it and you
did not:-)

>> - The child opened a new MDB_env with the same path, and closes the
>>    parent's env afterwards.  Don't Do That, users.  But what to do in
>>    this situation, if we are to have code for it, is to do nothing with
>>    the lockfile.  Not close it either, since closing it will lose the
>>    lock for the env which was opened in the child.
> 
> That's a completely different case. This ITS is specifically about a 
> process with an open environment that subsequently forks.

No, I mean:

Parent, pid P:
  open env A.

Child, pid Q:
- open and start using env B with the same path.
- close env A, while B is still open.
   With the new code, this clears pid Q - i.e. B's pids.
   But lmdb.h already lists that as a "don't do that",
   because it also removes B's lock on the lockfile.