
Re: Slapd-meta stop at the first unreachable candidate



Below is a sample configuration that reproduces the problem:

Three OpenLDAP data instances are configured as follows:
include         /opt/openldap/etc/openldap/schema/core.schema
include         /opt/openldap/etc/openldap/schema/cosine.schema
include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
include         /opt/openldap/etc/openldap/schema/nis.schema
include         /opt/openldap/etc/openldap/schema/dyngroup.schema
include         /opt/openldap/etc/openldap/schema/misc.schema
pidfile         /opt/openldap/var/run/server1.pid
argsfile        /opt/openldap/var/run/server1.args
loglevel        stats
database        bdb
suffix          ou=orgunit,o=gouv,c=fr
directory       /opt/openldap/var/server1

Note: for the other two instances, server1 is replaced with server2 and server3 respectively.
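
For illustration, by my reading of that note, only these lines differ in server2.conf (and similarly for server3):

pidfile         /opt/openldap/var/run/server2.pid
argsfile        /opt/openldap/var/run/server2.args
directory       /opt/openldap/var/server2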

Each instance contains the following data (only 4 entries):
dn: ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: organizationalUnit
ou: orgunit

dn: ou=dept1,ou=orgunit,o=gouv,c=fr
ou: dept1
objectClass: top
objectClass: organizationalUnit

dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: user11@server1.com
cn: User 11
uid: user11
givenName: User
sn: 11

dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: user12@server1.com
cn: User 12
uid: user12
givenName: User
sn: 12

Note: in instances 2 and 3, user1x and dept1 are replaced with user2x/dept2 and user3x/dept3 respectively.
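
For example, server2 then holds entries such as the following (attribute values assumed by analogy with server1):

dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: inetOrgPerson
mail: user21@server2.com
cn: User 21
uid: user21
givenName: User
sn: 21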

Data instances are launched using these commands:
/opt/openldap/libexec/slapd -n server1 -f /opt/openldap/etc/openldap/server1.conf -h ldap://0.0.0.0:1001/
/opt/openldap/libexec/slapd -n server2 -f /opt/openldap/etc/openldap/server2.conf -h ldap://0.0.0.0:1002/
/opt/openldap/libexec/slapd -n server3 -f /opt/openldap/etc/openldap/server3.conf -h ldap://0.0.0.0:1003/

The meta instance is configured as follows:
include         /opt/openldap/etc/openldap/schema/core.schema
include         /opt/openldap/etc/openldap/schema/cosine.schema
include         /opt/openldap/etc/openldap/schema/inetorgperson.schema
include         /opt/openldap/etc/openldap/schema/nis.schema
include         /opt/openldap/etc/openldap/schema/dyngroup.schema
include         /opt/openldap/etc/openldap/schema/anais.schema
include         /opt/openldap/etc/openldap/schema/misc.schema
pidfile         /opt/openldap/var/run/meta.pid
argsfile        /opt/openldap/var/run/meta.args
database        meta
suffix          ou=orgunit,o=gouv,c=fr
uri ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 3
uri ldap://localhost:1002/ou=dept2,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4
uri ldap://localhost:1003/ou=dept3,ou=orgunit,o=gouv,c=fr
#network-timeout 5
#timeout 4

and it is launched as follows:
/opt/openldap/libexec/slapd -n meta -f /opt/openldap/etc/openldap/meta.conf -h ldap://0.0.0.0:1000/ -d 256
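
For completeness: the network-timeout and timeout directives commented out in the meta configuration above are standard slapd-meta directives that bound how long the proxy waits when contacting a target. Enabling them for a target would look like this (same values as in the commented lines; they were left disabled in the tests below):

uri             ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
network-timeout 5
timeout         3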

# test with the 3 servers up
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
=> entries from the three servers are returned

# stop server2 (kill -INT ...) and perform a new search:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good: entries from server1 and server3 are returned

Below are the meta instance logs:
conn=1001 fd=9 ACCEPT from IP=172.30.8.13:55048 (IP=0.0.0.0:1000)
conn=1001 op=0 BIND dn="" method=128
conn=1001 op=0 RESULT tag=97 err=0 text=
conn=1001 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 filter="(objectClass=person)"
conn=1001 op=1 SRCH attr=dn
conn=1001 op=1 meta_back_retry[1]: retrying URI="ldap://localhost:1002"; DN="".
conn=1001 op=1 meta_back_retry[1]: meta_back_single_dobind=52
conn=1001 op=1 SEARCH RESULT tag=101 err=0 nentries=4 text=
conn=1001 op=2 UNBIND
conn=1001 fd=9 closed
=> looks good as nentries=4

# perform several new searches without changing anything:
[root@pp-ae2-proxy2 log]# /opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
=> nothing returned

Below are the corresponding logs:
conn=1002 fd=9 ACCEPT from IP=172.30.8.13:55049 (IP=0.0.0.0:1000)
conn=1002 op=0 BIND dn="" method=128
conn=1002 op=0 RESULT tag=97 err=0 text=
conn=1002 op=1 SRCH base="ou=orgunit,o=gouv,c=fr" scope=2 deref=0 filter="(objectClass=person)"
conn=1002 op=1 SRCH attr=dn
conn=1002 op=1 meta_search_dobind_init[1]: retrying URI="ldap://localhost:1002"; DN="".
conn=1002 op=1 SEARCH RESULT tag=101 err=0 nentries=0 text=
conn=1002 op=2 UNBIND
conn=1002 fd=9 closed
=> looks bad as nentries=0
=> Only the first search after stopping server2 is successful.

# new search, but using server1's ou (ou=dept1) as the search base:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=dept1,ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> looks good

# same search as earlier, i.e. using the root node:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
=> Also looks good. It seems everything is OK, but only once a connection to server1 has been opened by some other means.

# new search, but using server3's base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=dept3,ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> looks good

# new search, but using the slapd-meta base object:
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> entries from server1 and server3 are returned
=> this confirms that lookups in server1 and server3 are not performed until a channel is opened to both of them using their respective base objects
=> another strange behavior: if the search using server3's ou is performed before the one using server1's ou, then the next search on the root node retrieves entries from both server1 and server3 ...

# new search after server2 restart: 
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
dn: uid=user11,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user12,ou=dept1,ou=orgunit,o=gouv,c=fr
dn: uid=user21,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user31,ou=dept3,ou=orgunit,o=gouv,c=fr
dn: uid=user22,ou=dept2,ou=orgunit,o=gouv,c=fr
dn: uid=user32,ou=dept3,ou=orgunit,o=gouv,c=fr
=> good, all entries are returned

# new search after meta instance restart while server2 is already stopped 
/opt/openldap/bin/ldapsearch -LLL -x -H ldap://pp-ae2-proxy1.alize:1000 -b ou=orgunit,o=gouv,c=fr objectclass=person dn |grep dn:
=> unlike the previous test case (server2 stopped while the meta instance was already running), we do not see the single successful search.

Then the behaviour is the same, i.e. the search on the root node works again, but only once searches have been performed using ou=dept1 and ou=dept3.

In addition, the behaviour is slightly different when the "conn-ttl" parameter is set to 3 (3 seconds). I can describe that in a separate post.
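
For reference, a minimal sketch of where such a directive sits in the meta configuration above (the onerr directive mentioned in the quoted post below would go in the same place; placing it before the uri lines, so that it applies to all targets, is my assumption):

database        meta
suffix          ou=orgunit,o=gouv,c=fr
conn-ttl        3
#onerr          continue
uri             ldap://localhost:1001/ou=dept1,ou=orgunit,o=gouv,c=fr
...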

Thanks to anyone who can help identify whether this is a misconfiguration or a bug.

Michel Gruau

> Message of 19/08/11 13:13
> From: "Michel Gruau" 
> To: "openldap-technical openldap org" 
> Cc: 
> Subject: Slapd-meta stop at the first unreachable candidate
>
> Hello,

I have a slapd-meta configuration as follows: 

database meta
suffix dc=com
uri ldap://server1:389/dc=suffix1,dc=com
uri ldap://server2:389/dc=suffix2,dc=com
uri ldap://server3:389/dc=suffix3,dc=com

I performed numerous tests using the "dc=com" base and changing the order of the above uri list (in slapd.conf), and I see that as soon as a candidate directory is unreachable, none of the directories listed after the failing one is queried by the proxy. For instance, with the configuration above:
- if server2 is down, then server3 is not requested
- if server1 is down, then none of the directories is requested.

I have the feeling this is a bug ... could you confirm?

FYI, I also tried the "onerr continue" config, but it did not change anything.

Thanks in advance.

Michel

