Re: (ITS#3546) Sync rep provider and server crash on SIGTERM



hyc@symas.com wrote:

>The backtrace you provided was a bit inaccurate; you need to compile 
>with "-g" (debugging info present) and without optimization in order to 
>get a consistent trace.
>
>I've reproduced part of the problem; the provider is not segfaulting, it 
>is hitting an assert() at connection.c:687. Specifically, the connection 
>is being torn down while someone is still waiting to write on it. This 
>happens because there is a large search in progress, and data has piled 
>up faster than the network can send it. When you terminate the syncrepl 
>client, it sends an Unbind request and then closes its side of the 
>connection. (In my test, though, the syncrepl consumer shut down 
>gracefully; there was no crash.) The Unbind is received by the provider 
>but actually gets Deferred, because the provider is still waiting for 
>its writes to flush. Then the connection actually closes, and the 
>problem occurs. This provider-side assert() situation is not unique to 
>syncrepl; it can happen whenever any large search request is terminated 
>in the middle. 
>
I take that last part back. You're right, it only happens for refreshAndPersist.
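
For anyone following along, here is a rough sketch of the sequence described in the quoted text: writes pile up, the Unbind gets deferred, and then the connection goes away while a writer is still waiting. This is not the code in connection.c. The struct, the function names, and the exact invariant ("no writers waiting at teardown") are all assumptions made up for illustration, and the real check at connection.c:687 may be different.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified connection state; NOT slapd's real Connection. */
typedef struct {
    int  writers_waiting;  /* threads blocked waiting to flush output */
    bool unbind_deferred;  /* Unbind arrived while writes were pending */
    bool closed;           /* connection has been shut down cleanly */
} toy_conn;

/* Client sent Unbind: defer it if output is still queued (assumed behavior). */
static void toy_conn_unbind(toy_conn *c)
{
    if (c->writers_waiting > 0)
        c->unbind_deferred = true;
    else
        c->closed = true;
}

/* A blocked writer finished. When the last one leaves, run the deferred Unbind. */
static void toy_conn_writer_done(toy_conn *c)
{
    if (c->writers_waiting > 0)
        c->writers_waiting--;
    if (c->writers_waiting == 0 && c->unbind_deferred) {
        c->unbind_deferred = false;
        c->closed = true;
    }
}

/*
 * Peer closed its socket. Returns true if teardown is safe. A false return
 * is the state in which an assert of "no writers waiting" would fire.
 */
static bool toy_conn_can_destroy(const toy_conn *c)
{
    return c->writers_waiting == 0;
}
```

With one writer still blocked, toy_conn_unbind() only sets the deferred flag, and toy_conn_can_destroy() reports false if the peer closes at that point. Once toy_conn_writer_done() drains the last writer, the deferred Unbind completes and teardown becomes safe.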

>We'll definitely have to fix that up.
>
>I'll play with this a bit more to see if I can reproduce the 
>consumer-side crash.
>
I've reproduced this crash as well. It's another assert(). Thanks for 
the bug report.

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.       Director, Highland Sun
  http://www.symas.com               http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support