Issue 6104 - race condition with cancel operation
Summary: race condition with cancel operation
Status: RESOLVED PARTIAL
Alias: None
Product: OpenLDAP
Classification: Unclassified
Component: slapd (show other issues)
Version: unspecified
Hardware: All All
: --- normal
Target Milestone: ---
Assignee: OpenLDAP project
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-11 05:06 UTC by Hallvard Furuseth
Modified: 2020-03-19 15:29 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this issue.
Description Hallvard Furuseth 2009-05-11 05:06:20 UTC
Full_Name: Hallvard B Furuseth
Version: HEAD
OS: Linux
URL: 
Submission from: (NULL) (129.240.6.233)
Submitted by: hallvard


slapd/cancel.c sets o_abandon before o_cancel.  Thus it's possible for
the canceled operation to obey o_abandon before o_cancel gets set.
Though I had to insert some sleeps to achieve that.
Either the operation is abandoned and the Cancel operation receives
tooLate, or if the client unbinds/closes the connection fast enough
Cancel will hang:  slapd does not close the connection, and hangs on
shutdown: "slapd shutdown: waiting for 1 operations/tasks to finish".

Since the flags are not mutex-protected (at least not when read), it's
not enough to move the o_cancel setting after o_abandon in the Cancel
thread.  The cancelled thread might still see the o_abandon change
first.  A fix could be to make o_abandon a bitmask which says whether
the abandon is actually a cancel, but the Abandon and Cancel operations
will still need a mutex to coordinate so that Abandon does not reset
a Cancel bitflag.  In any case, it'd be cleaner if an operation which
reacts to o_abandon grabs some mutex before checking o_cancel.


The problem was tested as follows:
- sleep 0.2 sec after Statslog "DEL" and before setting SLAP_CANCEL_REQ.
- log "ABANDONED" when send_ldap_response() abandons the operation.
- Client: A python socket client which sends raw BER, no libldap:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('localhost', 3890))
    s.send(delete("cn=test"))
    time.sleep(0.1)
    s.send(cancel()) # cancel last operation
    #sys.exit()
    time.sleep(0.4)
    s.send(unbind())
-->
    conn=0 fd=9 ACCEPT from IP=127.0.0.1:56945 (IP=127.0.0.1:3890)
    conn=0 op=0 DEL dn="cn=test"
    conn=0 op=1 EXT oid=1.3.6.1.1.8
    conn=0 op=1 CANCEL msg=1
    conn=0 op=0 ABANDONED
    conn=0 op=2 UNBIND
    conn=0 op=1 RESULT oid= err=120 text=
    conn=0 fd=9 closed
    <server closed connection, client exited>
^C slapd

If the client exits after send(cancel()):
    conn=0 fd=9 ACCEPT from IP=127.0.0.1:48826 (IP=127.0.0.1:3890)
    conn=0 op=0 DEL dn="cn=test"
    conn=0 op=1 EXT oid=1.3.6.1.1.8
    conn=0 op=1 CANCEL msg=1
    conn=0 op=2 UNBIND
    conn=0 op=0 ABANDONED
    <not closing connection>
^C slapd
    daemon: shutdown requested and initiated.
    slapd shutdown: waiting for 1 operations/tasks to finish
    <slapd is hanging>
kill -KILL <slapd>

slapd.conf:
    include         servers/slapd/schema/core.schema
    allow           update_anon
    database        ldif
    directory       "."
    suffix          "cn=test"

Patches to slapd:

Index: cancel.c
--- cancel.c	21 Jan 2009 23:40:25 -0000	1.30
+++ cancel.c	11 May 2009 04:42:58 -0000
@@ -92,4 +92,8 @@
 		}
 
+		{
+			struct timeval timeout = { 0, 200000 };
+			select(0, NULL, NULL, NULL, &timeout);
+		}
 		o->o_cancel = SLAP_CANCEL_REQ;
 
Index: delete.c
--- delete.c	21 Jan 2009 23:40:26 -0000	1.144
+++ delete.c	11 May 2009 04:42:58 -0000
@@ -75,4 +75,8 @@
 		op->o_log_prefix, op->o_req_dn.bv_val, 0, 0, 0 );
 
+	{
+		struct timeval timeout = { 0, 200000 };
+		select(0, NULL, NULL, NULL, &timeout);
+	}
 	if( op->o_req_ndn.bv_len == 0 ) {
 		Debug( LDAP_DEBUG_ANY, "%s do_delete: root dse!\n",
Index: result.c
--- result.c	11 May 2009 02:23:51 -0000	1.331
+++ result.c	11 May 2009 04:42:58 -0000
@@ -418,4 +418,6 @@
 	if (( rs->sr_err == SLAPD_ABANDON || op->o_abandon ) && !op->o_cancel ) {
 		rc = SLAPD_ABANDON;
+		Statslog( LDAP_DEBUG_STATS,
+			"%s ABANDONED\n", op->o_log_prefix, 0, 0, 0, 0 );
 		goto clean2;
 	}
Comment 1 Howard Chu 2009-05-11 05:44:33 UTC
h.b.furuseth@usit.uio.no wrote:
> Full_Name: Hallvard B Furuseth
> Version: HEAD
> OS: Linux
> URL:
> Submission from: (NULL) (129.240.6.233)
> Submitted by: hallvard
>
>
> slapd/cancel.c sets o_abandon before o_cancel.  Thus it's possible for
> the canceled operation to obey o_abandon before o_cancel gets set.

Cancel is just best-effort, why not just set o_cancel before o_abandon?

> Though I had to insert some sleeps to achieve that.
> Either the operation is abandoned and the Cancel operation receives
> tooLate, or if the client unbinds/closes the connection fast enough
> Cancel will hang:  slapd does not close the connection, and hangs on
> shutdown: "slapd shutdown: waiting for 1 operations/tasks to finish".
>
> Since the flags are not mutex-protected (at least not when read), it's
> not enough to move the o_cancel setting after o_abandon in the Cancel
> thread.  The cancelled thread might still see the o_abandon change
> first.  A fix could be to make o_abandon a bitmask which says whether
> the abandon is actually a cancel, but the Abandon and Cancel operations
> will still need a mutex to coordinate so that Abandon does not reset
> a Cancel bitflag.  In any case, it'd be cleaner if an operation which
> reacts to o_abandon grabs some mutex before checking o_cancel.
>
>
> The problem was tested as follows:
> - sleep 0.2 sec after Statslog "DEL" and before setting SLAP_CANCEL_REQ.
> - log "ABANDONED" when send_ldap_response() abandons the operation.
> - Client: A python socket client which sends raw BER, no libldap:
>      s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>      s.connect(('localhost', 3890))
>      s.send(delete("cn=test"))
>      time.sleep(0.1)
>      s.send(cancel()) # cancel last operation
>      #sys.exit()
>      time.sleep(0.4)
>      s.send(unbind())
> -->
>      conn=0 fd=9 ACCEPT from IP=127.0.0.1:56945 (IP=127.0.0.1:3890)
>      conn=0 op=0 DEL dn="cn=test"
>      conn=0 op=1 EXT oid=1.3.6.1.1.8
>      conn=0 op=1 CANCEL msg=1
>      conn=0 op=0 ABANDONED
>      conn=0 op=2 UNBIND
>      conn=0 op=1 RESULT oid= err=120 text=
>      conn=0 fd=9 closed
>      <server closed connection, client exited>
> ^C slapd
>
> If the client exits after send(cancel()):
>      conn=0 fd=9 ACCEPT from IP=127.0.0.1:48826 (IP=127.0.0.1:3890)
>      conn=0 op=0 DEL dn="cn=test"
>      conn=0 op=1 EXT oid=1.3.6.1.1.8
>      conn=0 op=1 CANCEL msg=1
>      conn=0 op=2 UNBIND
>      conn=0 op=0 ABANDONED
>      <not closing connection>
> ^C slapd
>      daemon: shutdown requested and initiated.
>      slapd shutdown: waiting for 1 operations/tasks to finish
>      <slapd is hanging>
> kill -KILL<slapd>
>
> slapd.conf:
>      include         servers/slapd/schema/core.schema
>      allow           update_anon
>      database        ldif
>      directory       "."
>      suffix          "cn=test"
>
> Patches to slapd:
>
> Index: cancel.c
> --- cancel.c	21 Jan 2009 23:40:25 -0000	1.30
> +++ cancel.c	11 May 2009 04:42:58 -0000
> @@ -92,4 +92,8 @@
>   		}
>
> +		{
> +			struct timeval timeout = { 0, 200000 };
> +			select(0, NULL, NULL, NULL,&timeout);
> +		}
>   		o->o_cancel = SLAP_CANCEL_REQ;
>
> Index: delete.c
> --- delete.c	21 Jan 2009 23:40:26 -0000	1.144
> +++ delete.c	11 May 2009 04:42:58 -0000
> @@ -75,4 +75,8 @@
>   		op->o_log_prefix, op->o_req_dn.bv_val, 0, 0, 0 );
>
> +	{
> +		struct timeval timeout = { 0, 200000 };
> +		select(0, NULL, NULL, NULL,&timeout);
> +	}
>   	if( op->o_req_ndn.bv_len == 0 ) {
>   		Debug( LDAP_DEBUG_ANY, "%s do_delete: root dse!\n",
> Index: result.c
> --- result.c	11 May 2009 02:23:51 -0000	1.331
> +++ result.c	11 May 2009 04:42:58 -0000
> @@ -418,4 +418,6 @@
>   	if (( rs->sr_err == SLAPD_ABANDON || op->o_abandon )&&  !op->o_cancel ) {
>   		rc = SLAPD_ABANDON;
> +		Statslog( LDAP_DEBUG_STATS,
> +			"%s ABANDONED\n", op->o_log_prefix, 0, 0, 0, 0 );
>   		goto clean2;
>   	}
>
>


-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 2 Hallvard Furuseth 2009-05-11 13:57:21 UTC
hyc@symas.com writes:
>> slapd/cancel.c sets o_abandon before o_cancel.  Thus it's possible for
>> the canceled operation to obey o_abandon before o_cancel gets set.
> 
> Cancel is just best-effort, why not just set o_cancel before o_abandon?

Memory coherence.  Without mutex protection, the cancelled thread might
still see the o_abandon change before the o_cancel change.  And thus not
send a response, nor update info for the Cancel operation to react to.

-- 
Hallvard

Comment 3 Howard Chu 2009-05-11 22:14:17 UTC
Hallvard B Furuseth wrote:
> hyc@symas.com writes:
>>> slapd/cancel.c sets o_abandon before o_cancel.  Thus it's possible for
>>> the canceled operation to obey o_abandon before o_cancel gets set.
>>
>> Cancel is just best-effort, why not just set o_cancel before o_abandon?
>
> Memory coherence.  Without mutex protection, the cancelled thread might
> still see the o_abandon change before the o_cancel change.  And thus not
> send a response, nor update info for the Cancel operation to react to.
>
I'd prefer to condense it all into a bitfield. We already ensure that the 
c_mutex is held when setting these flags. It would be a lot of overhead to 
grab the mutex just to read the flag, and would require additional analysis to 
make sure the new locking behavior doesn't introduce new deadlocks.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 4 Hallvard Furuseth 2009-05-15 15:21:05 UTC
Howard Chu writes:
> I'd prefer to condense it all into a bitfield.

Right... keep both o_abandon and o_cancel (maybe renamed to o_dummy) for
RE24 though, otherwise we have a silent binary incompatible change with
already-compiled slapd modules.  The change will in any case break
modules that play with o_cancel though.

> We already ensure that
> the c_mutex is held when setting these flags.

Not yet, but c_mutex it is:-)  Must protect the following code.
Or rather, bitflag code which will replace it:

- o_cancel access in cancel.c, connection.c:connection_operation(),
  back-relay/op.c.

- o_abandon setting in overlays/syncprov.c:syncprov_op_abandon(), and
  in overlays/retcode.c where hopefully the first case should have been
	if ( op2.o_abandon ) op->o_abandon = 1;
  so that we can never reset o_abandon once it is set.

> It would be a lot of
> overhead to grab the mutex just to read the flag,
> and would require additional analysis to make sure the new locking
> behavior doesn't introduce new deadlocks.

Checking o_abandon!=0 works unmutexed.  (Well, technically any
such unmutexed access is wrong, but in practice we should be OK.)
As long as we:
- never reset o_abandon like retcode above can do, and
- never unmutexed assume consistency betweeen abandon-related variables
  set by different threads.

So the equivalent of this should be OK:
  if ( op->o_abandon ) rs->sr_err = SLAPD_ABANDON;
  ...  
  if ( rs->sr_err == SLAPD_ABANDON ) {
     lock c_mutex;
     more careful o_abandon checks;
     unlock c_mutex;
  }

  
While I'm looking at this:

The ldap_pvt_thread_yield() in cancel.c is really ugly.  There is no
cond to attach to it to without breaking binary compatibility, but even
a even a global mutex+cond might be better, with threads broadcasting on
it when it might be relevant.

It's still ugly though.  Cancelling an expensive operation ought to be a
friendly operation, but instead it is unfriendly in that yet another
thread is occupied, solely to wait for that operation.  What would make
sense (for RE25) would be to redo Cancel so that the cancelled thread is
responsible for sending the Cancel response or resubmitting an operation
which will.

Also, the be_cancel implementation in at least back-relay looks wrong.
Cancel isn't dispatched on DN, and back-relay with no 'relay' directive
configured cannot know where to dispatch.  I haven't checked how it
works, but it seems a call to be_cancel means the Cancel operation
_might_ belong to this backend.  But it might not, which means
back-relay shouldn't set op->o_cancel in this case.  What it should do
I don't know though.  And I don't know if syncprov is doing any better.

-- 
Hallvard

Comment 5 Hallvard Furuseth 2009-05-21 21:45:54 UTC
changed notes
moved from Incoming to Software Bugs
Comment 6 Hallvard Furuseth 2009-05-22 22:08:38 UTC
Well, this turned out a lot of problems.  Currently blocked by 
ITS#6138 Bad Cancel/Abandon/"internal abandon"/Syncprov interactions.

back-relay/op.c done (ITS#6133). syncprov/retcode issues may fit better
under ITS#6138.

Bitflags should be unnecessary as such, just need a set of values.
Though it wouldn't hurt if they were chosen so one could use bit
operations to reduce checks for "o_abandon == A or B".  "Suppress
response" value mentioned in ITS#6138 also needed, maybe that must be a
bitflag when combined with Cancel stuff.

Some tidbits uglifying a fix:
* op->o_conn->c_mutex is locked when be->be_abandon() is called,
  but not when be->be_cancel() is called.
* send_ldap_ber() needs the lock while it holds &conn->c_write1_mutex,
  but code elsewhere indicates the reverse lock order.

Question:
  syncprov_op_abandon() holds &si->si_ops_mutex when setting
  so->s_op->o_abandon.  Does it need that?  The function needs to
  grab op->o_conn->c_mutex when called as Cancel, but must not do
  that while holding &si->si_ops_mutex since Abandon grabs the
  locks in the opposite order.

-- 
Hallvard

Comment 7 Howard Chu 2009-05-22 22:13:13 UTC
h.b.furuseth@usit.uio.no wrote:
> Well, this turned out a lot of problems.  Currently blocked by
> ITS#6138 Bad Cancel/Abandon/"internal abandon"/Syncprov interactions.
>
> back-relay/op.c done (ITS#6133). syncprov/retcode issues may fit better
> under ITS#6138.
>
> Bitflags should be unnecessary as such, just need a set of values.
> Though it wouldn't hurt if they were chosen so one could use bit
> operations to reduce checks for "o_abandon == A or B".  "Suppress
> response" value mentioned in ITS#6138 also needed, maybe that must be a
> bitflag when combined with Cancel stuff.

Yes, it still seems some bitflags will be more appropriate.

> Some tidbits uglifying a fix:
> * op->o_conn->c_mutex is locked when be->be_abandon() is called,
>    but not when be->be_cancel() is called.
> * send_ldap_ber() needs the lock while it holds&conn->c_write1_mutex,
>    but code elsewhere indicates the reverse lock order.
>
> Question:
>    syncprov_op_abandon() holds&si->si_ops_mutex when setting
>    so->s_op->o_abandon.  Does it need that?  The function needs to
>    grab op->o_conn->c_mutex when called as Cancel, but must not do
>    that while holding&si->si_ops_mutex since Abandon grabs the
>    locks in the opposite order.

syncprov_op_abandon() can remove ops from the list, so yes, it must hold 
si_ops_mutex.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/

Comment 8 Hallvard Furuseth 2009-05-23 18:13:40 UTC
hyc@symas.com writes:
>> Bitflags should be unnecessary as such, just need a set of values.
>> Though it wouldn't hurt if they were chosen so one could use bit
>> operations to reduce checks for "o_abandon == A or B".  "Suppress
>> response" value mentioned in ITS#6138 also needed, maybe that must be a
>> bitflag when combined with Cancel stuff.
> 
> Yes, it still seems some bitflags will be more appropriate.

As long as we decide which flag combinations are (not) valid.  Problem
with "unrestricted" bitflags compared to a list of values is that they
allow and thus require support for flag combinations we have no real
need to support.

>> Question:
>>    syncprov_op_abandon() holds&si->si_ops_mutex when setting
>>    so->s_op->o_abandon.  Does it need that?  The function needs to
>>    grab op->o_conn->c_mutex when called as Cancel, but must not do
>>    that while holding&si->si_ops_mutex since Abandon grabs the
>>    locks in the opposite order.
> 
> syncprov_op_abandon() can remove ops from the list, so yes, it must
> hold si_ops_mutex.

Right, but does it need si_ops_mutex while setting o_abandon?  Cancel
could remove the op, unlock si_ops_mutex, lock c_mutex, update
o_abandon.  Alternative: Lock c_mutex first (that must be safe since
Abandon is doing it anyway), then lock si_ops_mutex.  So it's not
a critical difference, just that a lock must be held a bit longer.

-- 
Hallvard

Comment 9 Hallvard Furuseth 2009-06-03 14:59:40 UTC
changed notes
Comment 10 Quanah Gibson-Mount 2009-06-04 22:21:17 UTC
changed notes
changed state Open to Partial
Comment 11 Hallvard Furuseth 2009-06-05 21:53:07 UTC
Hrmph, the placement of the 'so->s_op->o_abandon = 1;' which I wanted
to move out of the mutex lock derives from syncprov.c rev 1.20 commented
"Fix abandon sequencing":-(

-- 
Hallvard

Comment 12 OpenLDAP project 2014-08-01 21:04:22 UTC
syncprov (msg#7), then see ITS#6138.
Some fixes in HEAD.
Some fixes in RE24
Comment 13 Ddh3eaa 2014-12-29 10:11:15 UTC

Sent from my iPhone

Comment 14 michael garofalo 2017-01-18 02:57:05 UTC

Comment 15 Ondřej Kuzník 2020-01-03 10:49:28 UTC
This might be related or the same issue.

With ITS#9124 and ITS#9145 fixed, if you run:

ldapsearch -E '!sync=rp' -E '!cancel'

syncprov has freed the operation and cancel_extop() still tries to touch
the same Operation pointer without checking it's still in the connection
list. At least valgrind isn't happy.

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP