
Re: RE24 testing call #1 (OL 2.4.24)

This thread started on -devel; it does not belong on -technical.

Brett Maxfield wrote:
There is a discussion of solaris threads socket read/write locking issues and
some workarounds at: http://omniorb.sourceforge.net/omni40/omnithread.html

See "6 Threaded I/O shutdown for Unix"

None of this applies to OpenLDAP's code.


On 08/01/2011, at 9:55 AM, Howard Chu <hyc@symas.com> wrote:

Doug Leavitt wrote:

On 01/ 7/11 08:01 AM, Rein Tollevik wrote:
On 06.01.11 22.48, Quanah Gibson-Mount wrote:
--On Thursday, January 06, 2011 7:40 PM +0100 Rein Tollevik
<rein@OpenLDAP.org> wrote:

On 04.01.11 23.34, Quanah Gibson-Mount wrote:
Please test RE24 heavily.

test039 deadlocks for me on 64bit solaris10, both x86 and sparc :-( It
hangs in the monitor, triggered by the new swamp -SS option added to
slapd-tester. It works if run with -S or -SSS. It is the third server
that hangs, and it does so quite consistently with the same stack trace
every time. A gdb trace is at:


Does this happen on both HEAD and RE24, or RE24 only?

Both, as well as when running the HEAD test suite with the 2.4.23
release. Looks as if the swamp additions have tripped into an
existing problem, not anything new. Leave it out of RE24 until it
has been resolved?

Btw, any other Solaris test runs out there? I'd like to know if it is
a real Solaris problem or just me..

I'm seeing a similar failure on 32 bit Sparc Solaris 10. But it actually
locks up in test036 for me, I never get as far as test039. The gdb trace
looks much the same as what you posted.

Looks like for some reason threads that are blocked waiting for their
sockets to become writable are never getting woken up. A regular SIGINT
shuts down slapd cleanly so it doesn't appear to be a problem with the
condvars being used to manage the threads. That kinda points to select()
simply not returning the writable status.
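
For anyone who wants to check select() writability reporting in isolation
on an affected box, here is a minimal sketch. It is not slapd code and is
written in Python rather than C purely for brevity. The helper names
(is_writable, fill_send_buffer, drain, demo) are hypothetical, and the
socketpair stands in for a real client connection. The expectation is that
a socket whose send buffer is full is reported not writable, and becomes
writable again once the peer has read the pending data.

```python
import select
import socket


def is_writable(sock, timeout=0.0):
    """Return True if select() reports sock as writable within timeout."""
    _, writable, _ = select.select([], [sock], [], timeout)
    return sock in writable


def fill_send_buffer(sock, chunk=b"x" * 65536):
    """Write to a non-blocking socket until the kernel refuses more data."""
    sent = 0
    while True:
        try:
            sent += sock.send(chunk)
        except BlockingIOError:
            return sent


def drain(sock, nbytes):
    """Read up to nbytes from the peer so the writer has buffer space again."""
    remaining = nbytes
    while remaining > 0:
        data = sock.recv(min(65536, remaining))
        if not data:
            break
        remaining -= len(data)
    return nbytes - remaining


def demo():
    # Assumption: a local socketpair is a fair stand-in for a client socket.
    writer, reader = socket.socketpair()
    writer.setblocking(False)
    try:
        before = is_writable(writer)            # empty buffer
        sent = fill_send_buffer(writer)         # buffer now full
        full = is_writable(writer)              # should not be writable
        drained = drain(reader, sent)           # peer consumes everything
        after = is_writable(writer, timeout=1.0)
        return before, full, after, sent, drained
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print(demo())
```

If the third value comes back false on a given platform, that would be
consistent with select() failing to report the writable status, though it
would not by itself prove that the same thing is happening inside slapd.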

I haven't used this Solaris machine much, but in fact (looking at the
remnants of other files in my source tree on this box) this appears to have
been a problem since at least last August. (I.e., it looks like I was
investigating this same problem back then but dropped it and never got back
to it.)


I'm currently testing Solaris11 (Nevada) and not seeing any issues in
either 32 or 64 bit builds using both RE24 and HEAD. I have not had any
failures on x86 yet. Testing is still underway for sparc and other
internal system testing on both platforms.

  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/