Re: slapd-read hangs (ITS#3832)



> Aaron Richton wrote:
>>> A hanging read wouldn't cause a modification to fail. Look more closely
>>> at the output from the failed run and you'll probably see one of the
>>> Modify tasks failed with (52) Server Unavailable. I've been getting this
>>> occasionally as well on 036 and 039.
>>>
>>
>> Sure enough, that's it. I've saved up the testrun directory if anybody
>> cares to take a look and can't repro on their platform.
> I've traced a bit of what's going on in my testrun slapd.?.log files.
> Basically the META_BIND_TIMEOUT is too fast. If a Bind doesn't succeed
> in one or two tries, it gives up. The reason for the slow Binds is that
> slapd.3 receives many Bind requests (one on each of several inbound
> connections) and funnels them all through a pooled connection to
> slapd.1. But slapd only allows maxthreads/2 active operations for any
> given connection, so slapd.1 defers a lot of the incoming operations.
> The test can be "fixed" by doubling the threads setting in slapd.1.conf.
> Or we can change the slapd-* testers to retry when they get an
> LDAP_UNAVAILABLE result.
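
For reference, the retry option suggested above could look roughly like
the sketch below. It is only an illustration in Python, not the slapd-*
tester code (which is C): retry_on_unavailable, make_flaky_modify and the
max_retries/delay parameters are hypothetical names, and the only value
taken from the thread is result code 52 (LDAP_UNAVAILABLE).

```python
import time

LDAP_SUCCESS = 0
LDAP_UNAVAILABLE = 52  # result code 52, "Server Unavailable", as seen in the failed runs


def retry_on_unavailable(operation, max_retries=5, delay=0.1, sleep=time.sleep):
    """Call operation() until it returns something other than LDAP_UNAVAILABLE.

    operation   -- hypothetical callable returning an LDAP result code
    max_retries -- extra attempts allowed after the first one
    delay       -- seconds to wait between attempts
    Returns (last result code, number of attempts made).
    """
    attempts = 0
    while True:
        rc = operation()
        attempts += 1
        # Stop on any result other than "unavailable", or when retries run out.
        if rc != LDAP_UNAVAILABLE or attempts > max_retries:
            return rc, attempts
        sleep(delay)


def make_flaky_modify(failures):
    """Stand-in for a Modify that is refused `failures` times, then succeeds."""
    state = {"left": failures}

    def modify():
        if state["left"] > 0:
            state["left"] -= 1
            return LDAP_UNAVAILABLE
        return LDAP_SUCCESS

    return modify
```

A tester wrapped this way would ride out a few deferred Binds, but it
would also hide the condition the failure is reporting, which is the
trade-off discussed in the reply below.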

I set those values after tuning the test on several architectures, but
apparently the defaults cannot always be good.  I'm working on making the
bind timeout configurable, so real deployments can be fine-tuned if
required.  For the test, we can use "safe" defaults, e.g. very long
timeouts, more threads, "nretries forever" and so on.  I wouldn't modify
the testers, since an error condition of that type should not occur; in
this case it indicates a poorly designed test rather than a bug in the
software (it's my fault, sigh).
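
To make the "more threads" point concrete, here is a rough Python sketch
of the arithmetic behind the maxthreads/2 limit described in the quoted
analysis.  It is a toy model, not slapd code: active_limit and
deferred_ops are hypothetical helpers, integer division is assumed, and
the thread and Bind counts used with it are illustrative only.

```python
def active_limit(threads):
    """Per-connection cap on active operations, taken as threads/2.

    Integer division is an assumption of this sketch.
    """
    return threads // 2


def deferred_ops(pending, threads):
    """Number of simultaneous operations on one connection that must wait."""
    return max(0, pending - active_limit(threads))
```

With 20 Binds funnelled through one pooled connection, deferred_ops(20, 16)
gives 12 deferred operations, while doubling the threads to 32 leaves only
4 waiting, which is why a larger threads setting gives the Binds a better
chance of completing before the timeout.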

p.

-- 
Pierangelo Masarati
mailto:pierangelo.masarati@sys-net.it


    SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497