
openldap stops responding after some time



Hello list,

I hope you can help me with my problem.

My setup:

All servers run OpenLDAP 2.4.42.

I have a master LDAP server, which replicates with standard syncrepl to a consumer LDAP server (the slave).
On this consumer LDAP server I have configured a standalone slapd proxy using slapd-ldap, which pushes changes to more than 6000 consumer LDAP servers.

Several LDAP proxies are running, each with 500 consumers, to reduce startup time.

The master and slave are connected via TCP, and the LDAP proxies run on the slave and connect to it via a Unix socket (ldapi://).

Everything works fine and changes are replicated in real time to the consumers behind the proxy, but after some time (about 20 to 30 minutes) the slave LDAP server hangs and stops responding.
Shortly before it hangs completely, changes are pushed with a long delay.
With debugging on (-d256) everything looks fine and no error is displayed, but it still hangs.

I have tested both standard syncrepl and delta-syncrepl with the same result. When I strace the process, I see only many futex_wait() calls.
As I write this mail the error is not occurring, so I am not able to paste an strace.
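
Until an strace is available, a per-thread state summary taken at the moment of the hang might help narrow things down. The following is only a minimal sketch, assuming Linux and a known slapd pid; thread_states is a hypothetical helper name and not part of OpenLDAP. For demonstration the script inspects itself.

```python
import collections
import os


def thread_states(pid):
    """Count the threads of a process by scheduler state (Linux /proc only)."""
    counts = collections.Counter()
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as fh:
                stat = fh.read()
        except OSError:
            continue  # the thread exited while we were scanning
        # The command name sits in parentheses and may contain spaces,
        # so the state letter is the first field after the last ')'.
        state = stat[stat.rfind(")") + 1:].split()[0]
        counts[state] += 1
    return counts


if __name__ == "__main__":
    # Assumption: replace os.getpid() with the pid of the hanging slapd.
    print(dict(thread_states(os.getpid())))
```

A large number of threads in state S would be consistent with the futex_wait() calls seen in strace, but it does not show which lock they are waiting on; a full stack dump (for example gdb's "thread apply all bt") would still be needed for that.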

So, does anyone have an idea about this problem, or a better solution for my setup?
Any hints on how to debug this, or other tips and tricks, would be much appreciated.



Here are the relevant configuration settings of all servers:

## all ldap servers are started with extended limits in systemd
LimitCORE=0
LimitNPROC=5000000
LimitNOFILE=65535
LimitSTACK=81920
LimitDATA=infinity
LimitMEMLOCK=infinity
LimitRSS=infinity
LimitAS=infinity

and: echo 5000000 > /proc/sys/kernel/threads-max
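
To confirm that the limits above actually reach a running process, the effective values can be read from /proc/<pid>/limits. A minimal sketch, assuming Linux; read_limits is a hypothetical helper name, and the script inspects itself only so that it runs without a slapd pid.

```python
import os
import re

# name, then at least two spaces, then the soft and hard values
LIMIT_LINE = re.compile(r"(.+?)\s{2,}(\S+)\s+(\S+)")


def read_limits(pid):
    """Parse /proc/<pid>/limits into {name: (soft, hard)} (Linux only)."""
    limits = {}
    with open(f"/proc/{pid}/limits") as fh:
        next(fh)  # skip the header line
        for line in fh:
            match = LIMIT_LINE.match(line.strip())
            if match:
                name, soft, hard = match.groups()
                limits[name] = (soft, hard)
    return limits


if __name__ == "__main__":
    # Assumption: replace os.getpid() with the pid of the slapd process.
    for name, (soft, hard) in read_limits(os.getpid()).items():
        print(f"{name:<28} soft={soft:<12} hard={hard}")
```

The "Max open files" row is the one to compare against LimitNOFILE.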

Because of limits in OpenLDAP itself, I have patched it too:

diff -rNu openldap-2.4.42.orig/libraries/libldap_r/tpool.c openldap-2.4.42/libraries/libldap_r/tpool.c
--- openldap-2.4.42.orig/libraries/libldap_r/tpool.c    2015-08-31 08:26:55.000000000 +0200
+++ openldap-2.4.42/libraries/libldap_r/tpool.c 2015-08-31 07:39:25.000000000 +0200
@@ -42,10 +42,10 @@
 /* Max number of thread-specific keys we store per thread.
  * We don't expect to use many...
  */
-#define        MAXKEYS 32
+#define        MAXKEYS 65535

 /* Max number of threads */
-#define        LDAP_MAXTHR     1024    /* must be a power of 2 */
+#define        LDAP_MAXTHR     65535   /* must be a power of 2 */

 /* (Theoretical) max number of pending requests */
 #define MAX_PENDING (INT_MAX/2)        /* INT_MAX - (room to avoid overflow) */
diff -rNu openldap-2.4.42.orig/servers/slapd/daemon.c openldap-2.4.42/servers/slapd/daemon.c
--- openldap-2.4.42.orig/servers/slapd/daemon.c 2015-08-31 08:25:42.000000000 +0200
+++ openldap-2.4.42/servers/slapd/daemon.c      2015-08-31 07:42:02.000000000 +0200
@@ -1635,6 +1635,7 @@
 #else /* ! HAVE_SYSCONF && ! HAVE_GETDTABLESIZE */
        dtblsize = FD_SETSIZE;
 #endif /* ! HAVE_SYSCONF && ! HAVE_GETDTABLESIZE */
+       dtblsize=8192;

        /* open a pipe (or something equivalent connected to itself).
         * we write a byte on this fd whenever we catch a signal. The main
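
One detail in the patch above is worth double-checking: the source comment says LDAP_MAXTHR must be a power of 2, and 65535 is 2^16 - 1, so it is not one (65536 would be). The sketch below only verifies that arithmetic; whether this is related to the hang is not established here.

```python
def is_power_of_two(n):
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    # 1024 is the original LDAP_MAXTHR, 65535 the patched value.
    for value in (1024, 65535, 65536):
        print(value, is_power_of_two(value))
```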



I also raised the maximum integer value allowed for syncrepl's rid=.


### Master ###########

loglevel 0
sizelimit unlimited

database        mdb
suffix          "o=company, c=de"
rootdn          "cn=Manager,o=company,c=de"
rootpw          "xxxxxxxxxxxxxxxxxxxxxxxx"
 
overlay syncprov
syncprov-checkpoint 100 10
syncprov-sessionlog 1000

index DFan,DFname,uid,uidNumber,gidNumber,DFCronjobID eq
index entryUUID,entryCSN eq
index objectClass eq
directory       /var/lib/ldap/openldap-mdb
maxsize 8500000000

#### Slave ##################

loglevel 0
threads 2048

database        mdb
suffix          "o=company, c=de"
rootdn          "cn=Manager,o=company,c=de"
rootpw          "xxxxxxxxxxxxxxxxxxxx"

# here are all consumer ldap servers one by one

access to dn.subtree="sid=240,sec=webhosting,o=company,c=de"
 by dn.exact="cn=replicator,sid=240,sec=webhosting,o=company,c=de" write
 by * auth

access to dn.subtree="sid=241,sec=webhosting,o=company,c=de"
 by dn.exact="cn=replicator,sid=241,sec=webhosting,o=company,c=de" write
 by * auth

access to dn.subtree="sid=242,sec=webhosting,o=company,c=de"
 by dn.exact="cn=replicator,sid=242,sec=webhosting,o=company,c=de" write
 by * auth

...
...

index DFan,DFname,uid,uidNumber,gidNumber,DFCronjobID eq
index entryUUID,entryCSN eq
index objectClass eq
directory       /var/lib/ldap/openldap-mdb

syncrepl rid=001
         provider=ldaps://ldapmaster:636/
         binddn="cn=Manager,o=company,c=de"
         bindmethod=simple
         credentials=xxxxxxxxxxxxxxxxxxxxxx
         searchbase="o=company,c=de"
         type=refreshAndPersist
         retry="5 5 300 5"

overlay syncprov
syncprov-checkpoint 1000 60

maxsize 8500000000
maxreaders 12000


##### SLAPD Proxy #####################

database ldap
hidden on
suffix "sid=240,sec=webhosting,o=company,c=de"
rootdn "cn=replicator,sid=240,sec=webhosting,o=company,c=de"
uri ldaps://sid240.int.webslave.company.de:636
lastmod on
restrict all

acl-bind        bindmethod=simple
                binddn="cn=replicator,sid=240,sec=webhosting,o=company,c=de"
                credentials="xxxxxxxxxxxxxxxxxxxxx"

syncrepl        rid=001
                provider=ldapi://
                binddn="cn=Manager,o=company,c=de"
                bindmethod=simple
                credentials=xxxxxxxxxxxxxxxxxxxxxxxxx
                searchbase="sid=240,sec=webhosting,o=company,c=de"
                type=refreshAndPersist
                retry="5 5 300 5"

overlay syncprov

# next one
database ldap
hidden on
suffix "sid=241,sec=webhosting,o=company,c=de"
rootdn "cn=replicator,sid=241,sec=webhosting,o=company,c=de"
uri ldaps://sid241.int.webslave.company.de:636
lastmod on
restrict all

acl-bind        bindmethod=simple
                binddn="cn=replicator,sid=241,sec=webhosting,o=company,c=de"
                credentials="xxxxxxxxxxxxxxxxxxxxx"

syncrepl        rid=001
                provider=ldapi://
                binddn="cn=Manager,o=company,c=de"
                bindmethod=simple
                credentials=xxxxxxxxxxxxxxxxxxxxxxxxx
                searchbase="sid=241,sec=webhosting,o=company,c=de"
                type=refreshAndPersist
                retry="5 5 300 5"

overlay syncprov
 ...
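
Because the proxy stanzas differ only in the sid, they lend themselves to being generated. A minimal sketch, assuming the layout shown above; proxy_stanza and the placeholder secrets are hypothetical. The rid is passed in as a parameter because the excerpt shows rid=001 for each database and does not show how rids are assigned in the real configuration.

```python
PROXY_TEMPLATE = """\
database ldap
hidden on
suffix "sid={sid},sec=webhosting,o=company,c=de"
rootdn "cn=replicator,sid={sid},sec=webhosting,o=company,c=de"
uri ldaps://sid{sid}.int.webslave.company.de:636
lastmod on
restrict all

acl-bind        bindmethod=simple
                binddn="cn=replicator,sid={sid},sec=webhosting,o=company,c=de"
                credentials="{proxy_secret}"

syncrepl        rid={rid:03d}
                provider=ldapi://
                binddn="cn=Manager,o=company,c=de"
                bindmethod=simple
                credentials={sync_secret}
                searchbase="sid={sid},sec=webhosting,o=company,c=de"
                type=refreshAndPersist
                retry="5 5 300 5"

overlay syncprov
"""


def proxy_stanza(sid, rid, proxy_secret, sync_secret):
    """Render one back-ldap database stanza for the given sid."""
    return PROXY_TEMPLATE.format(
        sid=sid, rid=rid, proxy_secret=proxy_secret, sync_secret=sync_secret
    )


if __name__ == "__main__":
    # Placeholder secrets; the real values are redacted in the post.
    for sid in (240, 241):
        print(proxy_stanza(sid, 1, "xxxxx", "xxxxx"))
        print("# next one")
```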


#### and the 6300 consumers at the end ###############

database        mdb
suffix          "sid=240,sec=webhosting,o=company,c=de"
rootdn          "cn=replicator,sid=240,sec=webhosting,o=company,c=de"
rootpw          {SSHA}xxxxxxxxxxxxxxxx
index DFan,DFname,DFdnumber,sid,uid,uidNumber,gidNumber,DFCronjobID eq
index objectClass eq
index entryUUID,entryCSN eq
directory /var/lib/ldap/openldap-mdb/sid240
updatedn "cn=replicator,sid=240,sec=webhosting,o=company,c=de"
maxsize 1073741824
subordinate

updateref ldaps://ldapmaster:636

database        mdb
suffix          "o=company,c=de"
rootdn "cn=Manager,o=company,c=de"
rootpw {SSHA}xxxxxxxxxxxxxxxx
index objectClass eq
directory /var/lib/ldap/openldap-mdb/rest


Regards,

Daniel Betz
System Design Engineer / Senior Systemadministration 
___________________________________

domainfactory GmbH
Oskar-Messter-Str. 33
85737 Ismaning
Germany

Phone:    +49 (0)89 / 55266-364
Fax:      +49 (0)89 / 55266-222

E-Mail:   dbetz@df.eu
Internet: www.df.eu

Register court: Amtsgericht München
HRB number 150294, Managing directors:
Tobias Mohr, Stephan Wolfram