
(ITS#8044) openldap 2.4.39-8.el6: issue causing server unavailability



Full_Name: Diana Scannicchio
Version: 2.4.39-8.el6
OS: EL6 (SLC6 - Scientific Linux CERN 6)
URL: https://scannicc.web.cern.ch/scannicc/openldap/
Submission from: (NULL) (128.141.46.221)


Dear experts,
I configured 1 LDAP provider and 9 LDAP consumers to serve a large system (~3000
nodes and ~4000 users) and answer thousands of requests.
In November 2014 I upgraded openldap from version 2.4.23-34.el6_5.1 to
2.4.39-8.el6 and we started to experience some issues.
We need to regularly modify the LDAP content (e.g. netgroups, sudo rules) and,
when doing so, we started to randomly get the message

ldapmodify: Server is unavailable (52)

and correspondingly in the log on the consumers we find

conn=22818341 op=1 ldap_back_retry: retrying
URI="ldap://vm-atlas-ldap-1.cern.ch:389"; DN="cn=manager,ou=atlas,o=cern,c=ch" 
slapd[6175]: Error: ldap_back_is_proxy_authz returned 0, misconfigured URI?

After some debugging and tests I think that this is due to the connection being
closed by the provider 

tcp        1      0 consumer:36812 provider:ldap CLOSE_WAIT  3254/slapd

and the consumers not being able to reconnect at the first request.
Indeed when the consumer receives a first request it fails to connect to the
provider and it manages to open the connection only at the second request.
The connection then stays open if requests are sent continuously and then after
6 minutes (the idletimeout) it is closed by the provider.
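
The behaviour described above can be modelled with a toy example. This is only
a sketch of the suspected mechanism, not slapd or back-ldap code: the names
provider, serve, Consumer and run_demo are hypothetical, plain TCP sockets
stand in for LDAP connections, and a 0.2 second timeout stands in for the
6 minute idletimeout. The "provider" closes idle connections, and the
"consumer" keeps a cached connection that it only discovers to be stale when
the next request arrives.

```python
import socket
import threading
import time

IDLE_TIMEOUT = 0.2  # assumption: stands in for the provider's 6-minute idletimeout


def serve(conn):
    """Answer requests on one connection; close it once it has been idle too long."""
    conn.settimeout(IDLE_TIMEOUT)
    try:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(b"ok:" + data)
    except socket.timeout:
        pass  # idle timeout reached
    finally:
        conn.close()  # the peer's cached socket now sits in CLOSE_WAIT


def provider(listener):
    """Toy provider: accept connections and serve each one in its own thread."""
    while True:
        try:
            conn, _ = listener.accept()
        except OSError:
            break
        threading.Thread(target=serve, args=(conn,), daemon=True).start()


class Consumer:
    """Toy consumer that caches one connection to the provider."""

    def __init__(self, addr):
        self.addr = addr
        self.sock = None

    def request(self, payload):
        """Return the reply, or None if the cached connection turned out to be stale."""
        if self.sock is None:
            self.sock = socket.create_connection(self.addr, timeout=2)
        try:
            self.sock.sendall(payload)
            reply = self.sock.recv(1024)
            if not reply:
                raise ConnectionError("provider closed the connection")
            return reply
        except OSError:
            # Drop the stale socket; only the *next* request reconnects.
            self.sock.close()
            self.sock = None
            return None


def run_demo():
    listener = socket.create_server(("127.0.0.1", 0))
    threading.Thread(target=provider, args=(listener,), daemon=True).start()
    consumer = Consumer(listener.getsockname())
    results = [consumer.request(b"a")]        # fresh connection
    time.sleep(IDLE_TIMEOUT * 3)              # let the provider time the connection out
    results.append(consumer.request(b"b"))    # first request after the idle period
    results.append(consumer.request(b"c"))    # second request reconnects
    if consumer.sock is not None:
        consumer.sock.close()
    listener.close()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this model the first request after the idle period is the one that fails,
and the following one succeeds on a new connection, which matches what we see
on the consumers. Whether 2.4.39 actually behaves this way internally is an
assumption on my side.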
 
The fact that the first request fails each time makes the system unusable. I
temporarily downgraded openldap back to version 2.4.23-34.el6_5.1, with which
we do not have this issue.
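
A possible client-side stopgap, short of downgrading, would be to retry once
when ldapmodify reports "Server is unavailable (52)". The following is only a
sketch under assumptions: it assumes that ldapmodify's exit status equals the
LDAP result code, the helper name modify_with_retry and its parameters are
hypothetical, and it assumes the failed attempt never reached the provider, so
that repeating the modification is safe.

```python
import subprocess
import time

LDAP_UNAVAILABLE = 52  # result code shown in "Server is unavailable (52)"


def modify_with_retry(args, ldif, retries=1, delay=0.5, run=subprocess.run):
    """Run ldapmodify with the given arguments, feeding it `ldif` on stdin.

    Retries only when the exit status is 52 (assumed to mirror the LDAP
    result code). `run` is injectable so the logic can be exercised without
    a real server; it must accept the same arguments as subprocess.run.
    """
    cmd = ["ldapmodify", *args]
    result = None
    for attempt in range(retries + 1):
        result = run(cmd, input=ldif, text=True, capture_output=True)
        if result.returncode != LDAP_UNAVAILABLE:
            return result
        if attempt < retries:
            time.sleep(delay)  # give the consumer time to reopen its connection
    return result
```

This only hides the symptom on the client; it does not explain why the
consumer fails to reconnect on the first request.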

Could you please help us understand how to fix the issue, and whether I am
configuring something wrongly?
What has changed between 2.4.23-34.el6_5.1 and 2.4.39-8.el6?
Is there some parameter that could be set more appropriately?

You can find the slapd.conf configuration files used for the provider and for
the consumers at
https://scannicc.web.cern.ch/scannicc/openldap/

Any help or suggestions are very welcome.
Please let me know if you need more information.
Thank you very much and best regards,

Diana

P.S. the logs are also filled with
slapd[1377]: connection_read(36): no connection!
but this was present also with the previous openldap versions...