
Re: RE_23 hangs in test039 (back-hdb)

--On Monday, January 09, 2006 7:54 AM -0800 Howard Chu <hyc@symas.com> wrote:

Pierangelo Masarati wrote:
Pierangelo Masarati wrote:

I'm experimenting with a fix now. The problem may be that the same conn
can be fetched by ldap_back_getconn() multiple times while a bind is in
progress (since in the initial state, both the incoming session and the
conn handle are anonymous).
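
A minimal sketch of the kind of guard that prevents this sort of race, for illustration only. It is not the patch that went into back-ldap: the demo_conn structure and the demo_* functions are hypothetical names and do not correspond to real slapd code. The idea is simply that a thread which fetches a connection while another thread is still binding on it waits for that bind to finish instead of starting a second one on the same handle.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a cached proxy connection;
 * NOT the real back-ldap structure. */
typedef struct demo_conn {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int is_bound;    /* 1 once a bind has completed successfully */
    int is_binding;  /* 1 while some thread is performing the bind */
    int bind_count;  /* number of binds actually performed (demo only) */
} demo_conn;

void demo_conn_init(demo_conn *c)
{
    pthread_mutex_init(&c->mutex, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->is_bound = 0;
    c->is_binding = 0;
    c->bind_count = 0;
}

/* Returns 1 if the caller must perform the bind, 0 if the connection
 * is already bound. A caller that arrives mid-bind waits here. */
int demo_conn_begin_bind(demo_conn *c)
{
    int must_bind;

    pthread_mutex_lock(&c->mutex);
    while (c->is_binding)
        pthread_cond_wait(&c->cond, &c->mutex);
    must_bind = !c->is_bound;
    if (must_bind)
        c->is_binding = 1;
    pthread_mutex_unlock(&c->mutex);
    return must_bind;
}

/* Records the bind result and wakes every thread waiting on it. */
void demo_conn_end_bind(demo_conn *c, int success)
{
    pthread_mutex_lock(&c->mutex);
    assert(c->is_binding);
    c->is_binding = 0;
    c->is_bound = success;
    c->bind_count++;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->mutex);
}

void *demo_worker(void *arg)
{
    demo_conn *c = arg;

    if (demo_conn_begin_bind(c)) {
        /* the bind round-trip would happen here, outside the lock */
        demo_conn_end_bind(c, 1);
    }
    return NULL;
}

/* Runs up to 16 workers against one shared connection and returns
 * how many binds were performed. */
int demo_run(int nthreads)
{
    pthread_t tid[16];
    demo_conn c;
    int i, started = 0;

    if (nthreads > 16)
        nthreads = 16;
    demo_conn_init(&c);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[started], NULL, demo_worker, &c) == 0)
            started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.mutex);
    return c.bind_count;
}
```

Built with -pthread, demo_run(8) performs a single bind however the threads interleave. Without the is_binding check, several threads could each see an anonymous handle and each start a bind on it.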

Howard, many thanks for spotting this issue, it was really becoming a
nightmare.  I've ported your fix to back-meta and added some bind
testing to concurrency testers (test007, test035 and test039).  They
seem to run fine, but I seldom see any hang, so I really need to repeat
the tests a lot of times.  If anyone who is often seeing hangs could
step in and stress HEAD a bit... (test035 and test039).  If it works,
we can likely close ITS#4246, 4247 and 3832.

Apart from fixing a few small issues in the tester programs, I've repeatedly run test036 & test039 on a few different architectures, including dual CPUs, without any of the relevant issues (neither hangs nor loops). I wouldn't consider my tests exhaustive though, just my own 2c.

I ran test039 repeatedly for a couple of hours with no hangs on my
dual-core x86_64 Linux system. I'll try looping the other tests as well.

I've been unable to reproduce the hang on my 4 CPU Solaris 8 system.


--
Quanah Gibson-Mount
Principal Software Developer
ITSS/Shared Services
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html