
Re: ITS#3985 test039 hangs on Windows



hyc@symas.com wrote:
> OK, with current HEAD, it no longer appears to totally hang, it just 
> runs very very slowly. It takes about 1 hour 25 minutes on my 2GHz 
> PentiumM laptop. (For bdb. And that much again for hdb. I didn't bother 
> with ldbm this time around.)
>
> I'm also seeing a lot of "syslog" writes, even though the servers should 
> be running with -s0. I guess that should be a separate issue...
>   
I'm going to close this and worry about the remaining deadlock issues in 
ITS#3832. For the overall slow execution rate of this test, the short 
answer is "Windows is like that." The longer answer is that by default, 
Windows always retries a Connection Open attempt, and it uses an 
exponential backoff on the retries. So even though it receives an 
immediate TCP RST when it tries to chase the referral to port 9016, it 
sits and waits and tries again (and again). You can alter this behavior 
by editing the registry. See
http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/networking/tcpip03.mspx
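
For anyone who wants to measure the delay described above, here is a minimal Python sketch. It is not part of the OpenLDAP test suite, and the helper names find_closed_port and time_refused_connect are made up for illustration. It times one connect attempt to a port where nothing is listening. On a stack that reports the RST right away the elapsed time should be close to zero; on a stack that retries first it should be noticeably longer.

```python
import socket
import time


def find_closed_port(host="127.0.0.1"):
    """Return a port number that was free a moment ago (hypothetical helper).

    Binds to port 0 so the OS picks a free port, then closes the socket
    so that nothing is listening on that port any more.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]


def time_refused_connect(host, port, timeout=30.0):
    """Attempt one TCP connect and return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The peer answered with a TCP RST: nobody is listening.
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError as exc:
        outcome = "error: %s" % exc
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    port = find_closed_port()
    outcome, elapsed = time_refused_connect("127.0.0.1", port)
    print("port %d: %s after %.3f s" % (port, outcome, elapsed))
```

To reproduce the case in this ITS, point it at the referral target instead, e.g. time_refused_connect("localhost", 9016), while no server is running on that port.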

The relevant item is TcpMaxConnectRetransmissions; the default is 2 
retries. I've set mine to 0; you have to reboot for it to take effect. 
Setting this to zero allows the client to get the Connection Refused 
result immediately instead of blocking and waiting.
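
To check what a given machine is currently using, something like the following read-only Python sketch should work. Note the assumptions: the key path SYSTEM\CurrentControlSet\Services\Tcpip\Parameters is not stated in this message and is taken to be the usual location for TCP/IP parameters, and the function name is invented for illustration. It returns None on non-Windows systems or when the value has not been set explicitly (in which case the default applies).

```python
def read_tcp_max_connect_retransmissions():
    """Return the configured TcpMaxConnectRetransmissions value, or None.

    None means either "not running on Windows" or "value not present in
    the registry".
    """
    try:
        import winreg  # only available on Windows
    except ImportError:
        return None

    # Assumed location of the TCP/IP parameters; not given in the thread.
    key_path = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
    value_name = "TcpMaxConnectRetransmissions"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, value_name)
            return int(value)
    except OSError:
        # Key or value missing, or access denied.
        return None


if __name__ == "__main__":
    current = read_tcp_max_connect_retransmissions()
    if current is None:
        print("TcpMaxConnectRetransmissions is not set explicitly")
    else:
        print("TcpMaxConnectRetransmissions = %d" % current)
```

The sketch only reads the value. Changing it still means editing the registry with administrator rights and rebooting, as noted above.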

Sigh. A transport layer should only retransmit in the absence of 
communication, i.e., a timeout. When the remote server tells you 
definitively "nobody's listening, go away" the transport layer should 
report that to the application layer immediately. Only the user or app 
should decide if they really want to try bothering the remote server 
again. Yet another example of Microsoft using good technology without 
really understanding it...

-- 
  -- Howard Chu
  Chief Architect, Symas Corp.  http://www.symas.com
  Director, Highland Sun        http://highlandsun.com/hyc
  OpenLDAP Core Team            http://www.openldap.org/project/