
Re: slapd-meta doesn't continue with multiple uri's

On 23/08/2012 11:18, Howard Chu wrote:

> Your description of your procedure is so vague and imprecise it's difficult
> for anybody to decipher what you're talking about.
> Reading back thru the several posts in this thread, what I see you saying is
> that you have tested a few different configurations:
>
> 1) target host is up, target LDAP server is down
>      this should fail immediately because the host OS will immediately send a
>      TCP Connection Refused response
>
> 2) target host is initially down
>      this will not fail until the first TCP connect request times out
>
> 3) target host is initially up and connected, but thru your iptables
>    manipulation you sever the link
>      this will not fail until the TCP connection times out, which it won't
>      unless you're using TCP Keepalives, and by default those are only sent
>      once every 2 hours.

Let me make it less vague then.

What I've been trying to simulate are the various modes by which a uri target will become unavailable. What I'm trying to achieve is to have the meta backend point to four domain controllers and cope with one or more DCs being unavailable.

Having gone through this and let the system time out each time, I've found it does fail over under one of the conditions listed below, but it takes about 15 minutes to do so.


1. slapd starts, first target is unreachable;

2. slapd starts, first target is reachable but has no service running;

3. slapd already running, first target up and connected then later becomes unreachable.


a. 'Unreachable' simulated by blocking outbound access with the following iptables rule:

   iptables -A OUTPUT -d host1 -j DROP

b. 'No service' simulated by making the first target a host that is up but has no service running on the target port.
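The difference between simulations (a) and (b) can be checked outside slapd with a single TCP connect attempt. The following is only a rough sketch in Python: `probe` is a hypothetical helper, not part of OpenLDAP, and the 1-second timeout simply mirrors the network-timeout value used in the config further down. A DROP rule as in (a) would be expected to yield "timeout", while a host with no listener as in (b) would be expected to yield "refused".

```python
import socket


def probe(host, port, timeout=1.0):
    """Classify one TCP connect attempt (hypothetical helper, not slapd code)."""
    try:
        # create_connection() applies the timeout to the connect() call itself.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # Host is up but nothing is listening: the peer answers with a RST.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, e.g. packets silently dropped.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, name resolution failure, ...).
        return f"error: {exc}"


if __name__ == "__main__":
    # host1 and port 3268 are the values from this thread; adjust as needed.
    print(probe("host1", 3268))
```

If the probe reports "refused" quickly for (b) while slapd still takes minutes, that would point at the retry logic rather than at the network layer.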

Results (all with 2.4.32):

Case 1a: slapd retries host1 continuously and times out after about 180s. No attempt is made to contact additional targets.

Case 2b: slapd retries host1 continuously and times out after about 180s. No attempt is made to contact additional targets.

Case 3a: slapd retries host1 continuously, doubling an internal timeout value each time, eventually timing out after 19 retries and about 15m. It does then fall through to host2 and subsequent connections don't attempt to contact host1.
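For a sense of scale in Case 3a, here is a small arithmetic sketch in Python. It assumes the doubling is a plain geometric series with no cap and no extra delay between attempts, which the observations above do not confirm; `implied_initial_timeout` is a hypothetical name used only for this calculation. Under that assumption, 19 doubling attempts adding up to roughly 15 minutes would imply a first timeout of under 2 milliseconds.

```python
def implied_initial_timeout(total_seconds, retries):
    """Initial timeout t0 such that t0 * (1 + 2 + 4 + ... ) over `retries`
    attempts adds up to total_seconds.

    Assumes pure doubling with no cap: the sum is t0 * (2**retries - 1).
    """
    return total_seconds / (2 ** retries - 1)


if __name__ == "__main__":
    t0 = implied_initial_timeout(15 * 60, 19)
    print(f"implied first timeout: {t0 * 1000:.2f} ms")
```

If the real initial value is much larger than that, the doubling must be capped or reset somewhere along the way.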

Here's my config. I've also tried setting nretries explicitly to 3, but it makes no difference.

database            meta
suffix              dc=local
rootdn              cn=administrator,dc=local
rootpw              secret

network-timeout     1

uri                 ldap://host1:3268/ou=dc1,dc=local

suffixmassage       "ou=dc1,dc=local" "dc=example,dc=com"

idassert-bind       bindmethod=simple

idassert-authzfrom  "dn.exact:cn=administrator,dc=local"

These results suggest to me that network-timeout and nretries (which should default to 3) don't work as documented.

Having said that, it does seem to at least cope with scenario 3, albeit with a long timeout.

Ideally it'd work in all cases. Pierangelo says the failover works when connect() times out. I'd have expected that to cover scenarios 1 and 2 rather than 3, yet 3 is the only one where I see it fail over.

Liam Gretton                                    liam.gretton@le.ac.uk
HPC Architect                                 http://www.le.ac.uk/its
IT Services                                   Tel: +44 (0)116 2522254
University of Leicester, University Road
Leicestershire LE1 7RH, United Kingdom