Issue 8844 - LMDB mdb_env_close is unsafe in forked child
Summary: LMDB mdb_env_close is unsafe in forked child
Status: VERIFIED FIXED
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-02 16:05 UTC by Howard Chu
Modified: 2018-12-19 17:18 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Howard Chu 2018-05-02 16:05:05 UTC
Full_Name: Howard Chu
Version: 2.4
OS: Linux
URL: ftp://ftp.openldap.org/incoming/
Submission from: (NULL) (83.136.45.253)
Submitted by: hyc


Following on from ITS#8505.
mdb_env_close0() uses env->me_pid when clearing the readers table. If the env
was open in a process that forked, and the child process calls mdb_env_close(),
it will be clearing the wrong PID. Change this to use getpid() instead.
Comment 1 Hallvard Furuseth 2018-05-02 18:50:12 UTC
On 02/05/2018 18:05, hyc@openldap.org wrote:
> Following on from ITS#8505.
> mdb_env_close0() uses env->me_pid when clearing the readers table. If the env
> was open in a process that forked, and the child process calls mdb_env_close(),
> it will be clearing the wrong PID. Change this to use getpid() instead.

Wait, what?  The child didn't write any pids to the table.  If
there are pids for it to clear, something is wrong. That can be:

- The pid is from an older run, and the system reused the pid.  If we
   code for this, it should be to clear such pids during mdb_env_open.
   It gets very arbitrary to clear them when the user does env_close()
   in the child and the child happens to have the that reused pid.

- The child opened a new MDB_env with the same path, and closes the
   parent's env afterwards.  Don't Do That, users.  But what to do in
   this situation, if we are to have code for it, is to do nothing with
   the lockfile.  Not close it either, since closing it will lose the
   lock for the env which was opened in the child.

Comment 2 Howard Chu 2018-06-14 18:34:49 UTC
Hallvard Breien Furuseth wrote:
> On 02/05/2018 18:05, hyc@openldap.org wrote:
>> Following on from ITS#8505.
>> mdb_env_close0() uses env->me_pid when clearing the readers table. If the env
>> was open in a process that forked, and the child process calls mdb_env_close(),
>> it will be clearing the wrong PID. Change this to use getpid() instead.
> 
> Wait, what?  The child didn't write any pids to the table.  If
> there are pids for it to clear, something is wrong. That can be:

No, there most likely aren't PIDs for it to clear. But most importantly, it 
must not clear using env->me_pid since that is still the parent's PID.

> - The pid is from an older run, and the system reused the pid.  If we
>    code for this, it should be to clear such pids during mdb_env_open.
>    It gets very arbitrary to clear them when the user does env_close()
>    in the child and the child happens to have the that reused pid.

No. If the system reused the PID then that means the previous owner of that 
PID is dead already anyway. If there are any table entries under that PID 
there's no problem removing them.

> - The child opened a new MDB_env with the same path, and closes the
>    parent's env afterwards.  Don't Do That, users.  But what to do in
>    this situation, if we are to have code for it, is to do nothing with
>    the lockfile.  Not close it either, since closing it will lose the
>    lock for the env which was opened in the child.

That's a completely different case. This ITS is specifically about a process 
with an open environment that subsequently forks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 3 Hallvard Furuseth 2018-06-14 18:53:08 UTC
On 14/06/2018 20:34, Howard Chu wrote:
> Hallvard Breien Furuseth wrote:
>> Wait, what?  The child didn't write any pids to the table.  If
>> there are pids for it to clear, something is wrong. That can be:
> 
> No, there most likely aren't PIDs for it to clear. But most importantly, 
> it must not clear using env->me_pid since that is still the parent's PID.
> 
>> - The pid is from an older run, and the system reused the pid.  If we
>>    code for this, it should be to clear such pids during mdb_env_open.
>>    It gets very arbitrary to clear them when the user does env_close()
>>    in the child and the child happens to have the that reused pid.
> 
> No. If the system reused the PID then that means the previous owner of 
> that PID is dead already anyway.

Right...

> If there are any table entries under 
> that PID there's no problem removing them.

The problem is that it's arbitrary.  Something is exiting without
clearing its pid slots, and 0.001% of the time this code will catch it.

Better to skip this cleanup, or to always do it - but "always" belongs
in env_open().   If a new env is seeing its pid in use, then it can 
safely clean it out. Last we talked about that, I liked it and you
did not:-)

>> - The child opened a new MDB_env with the same path, and closes the
>>    parent's env afterwards.  Don't Do That, users.  But what to do in
>>    this situation, if we are to have code for it, is to do nothing with
>>    the lockfile.  Not close it either, since closing it will lose the
>>    lock for the env which was opened in the child.
> 
> That's a completely different case. This ITS is specifically about a 
> process with an open environment that subsequently forks.

No, I mean:

Parent, pid P:
  open env A.

Child, pid Q:
- open and start using env B with the same path.
- close env A, while B is still open.
   With the new code, this clears pid Q - i.e. B's pids.
   But lmdb.h already lists that as a "don't do that",
   because it also removes B's lock on the lockfile.

Comment 4 Howard Chu 2018-06-14 19:04:20 UTC
Hallvard Breien Furuseth wrote:
> On 14/06/2018 20:34, Howard Chu wrote:
>> Hallvard Breien Furuseth wrote:
>>> Wait, what?  The child didn't write any pids to the table.  If
>>> there are pids for it to clear, something is wrong. That can be:
>>
>> No, there most likely aren't PIDs for it to clear. But most importantly, it 
>> must not clear using env->me_pid since that is still the parent's PID.
>>
>>> - The pid is from an older run, and the system reused the pid.  If we
>>>    code for this, it should be to clear such pids during mdb_env_open.
>>>    It gets very arbitrary to clear them when the user does env_close()
>>>    in the child and the child happens to have the that reused pid.
>>
>> No. If the system reused the PID then that means the previous owner of that 
>> PID is dead already anyway.
> 
> Right...
> 
>> If there are any table entries under that PID there's no problem removing them.
> 
> The problem is that it's arbitrary.  Something is exiting without
> clearing its pid slots, and 0.001% of the time this code will catch it.

Sure. But I don't think it's worthwhile to make a special case here to skip 
this step.

> Better to skip this cleanup, or to always do it - but "always" belongs
> in env_open().   If a new env is seeing its pid in use, then it can safely 
> clean it out. Last we talked about that, I liked it and you
> did not:-)

And I don't see any particular reason to add this step.

>>> - The child opened a new MDB_env with the same path, and closes the
>>>    parent's env afterwards.  Don't Do That, users.  But what to do in
>>>    this situation, if we are to have code for it, is to do nothing with
>>>    the lockfile.  Not close it either, since closing it will lose the
>>>    lock for the env which was opened in the child.
>>
>> That's a completely different case. This ITS is specifically about a process 
>> with an open environment that subsequently forks.
> 
> No, I mean:
> 
> Parent, pid P:
>   open env A.
> 
> Child, pid Q:
> - open and start using env B with the same path.
> - close env A, while B is still open.
>    With the new code, this clears pid Q - i.e. B's pids.
>    But lmdb.h already lists that as a "don't do that",
>    because it also removes B's lock on the lockfile.
> 
Yeah, this is still a "don't do that" though, regardless of this particular ITS.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 5 Hallvard Furuseth 2018-06-14 19:36:33 UTC
On 14/06/2018 21:04, Howard Chu wrote:
> Hallvard Breien Furuseth wrote:
>> On 14/06/2018 20:34, Howard Chu wrote:
>>> If there are any table entries under that PID there's no problem 
>>> removing them.
>>
>> The problem is that it's arbitrary.  Something is exiting without
>> clearing its pid slots, and 0.001% of the time this code will catch it.
> 
> Sure. But I don't think it's worthwhile to make a special case here to 
> skip this step.

To my eyes the getpid() is a special case already, to someone trying to
understand the code. Though adding a comment could fix that. See below.

>> Better to skip this cleanup, or to always do it - but "always" belongs
>> in env_open().   If a new env is seeing its pid in use, then it can 
>> safely clean it out. Last we talked about that, I liked it and you
>> did not:-)
> 
> And I don't see any particular reason to add this step.

right

>>(...)
>> Parent, pid P:
>>   open env A.
>>
>> Child, pid Q:
>> - open and start using env B with the same path.
>> - close env A, while B is still open.
>>    With the new code, this clears pid Q - i.e. B's pids.
>>    But lmdb.h already lists that as a "don't do that",
>>    because it also removes B's lock on the lockfile.
>>
> Yeah, this is still a "don't do that" though, regardless of this 
> particular ITS.

Right.  So in summary, there are two cases when getpid() can be there
to clean out, and in neither case does it make sense to clean it out.

Other than the sense of avoiding one "if () { init two vars; }".
I like that better than a comment about not bothering to, but a comment
is OK too.

Comment 6 OpenLDAP project 2018-12-19 17:18:59 UTC
Fixed in mdb.master
Fixed in mdb.re09 (0.9.23)
Comment 7 Quanah Gibson-Mount 2018-12-19 17:18:59 UTC
changed notes
changed state Open to Closed
moved from Incoming to Software Bugs