
frequent slapd freeze with openldap 2.4.13



Hello list.

Since a recent upgrade from 2.4.12 to 2.4.13, I have been facing recurrent slapd hangs.

On the client side, ldapsearch requests fail with this error:
error.c:272: ldap_parse_result: Assertion `r != ((void *)0)' failed


I'd expect an automatic switch to the slave server in this case, but it doesn't happen. Here is my LDAP library configuration:
BASE dc=msr-inria,dc=inria,dc=fr
URI ldap://ldap1.msr-inria.inria.fr ldap://ldap2.msr-inria.inria.fr
TLS_CACERTDIR /etc/pki/tls/certs
TLS_REQCERT demand
NETWORK_TIMEOUT 2
TIMEOUT 2
TIMELIMIT 2


On the server side, slapd is usually seen using 100, 200 or 300% CPU, which makes me think some specific repeated query triggers the issue, and that the problem gets worse as several of them accumulate.

strace on the running slapd process shows it waiting on a futex:
[root@etoile main]# strace -p 2769
Process 2769 attached - interrupt to quit
futex(0xb6bb4bd8, FUTEX_WAIT, 2774, NULL <unfinished ...>

And gdb shows it waiting in __kernel_vsyscall:
(gdb) bt
#0  0xffffe410 in __kernel_vsyscall ()
#1  0xb7d385c6 in pthread_join () from /lib/i686/libpthread.so.0
#2  0xb7f23d3f in ldap_pvt_thread_join () from /usr/lib/libldap_r-2.4.so.2
#3  0x0806e1b4 in slapd_daemon ()
#4  0x0805a507 in main ()

In both cases, I think the lack of relevant information is due to the multithreaded nature of slapd: I don't know how to reach the exact thread where the problem occurs.
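
One possible way to get at the other threads, not tried here, is gdb's "thread apply all bt" command, which prints a backtrace for every thread instead of only the main one. A minimal sketch, assuming the slapd PID 2769 from the strace above; dump_all_threads is just an illustrative helper name, not an existing tool:

```shell
# Sketch only: print a backtrace for every thread of a running process.
# dump_all_threads is a hypothetical helper name chosen for this example.
dump_all_threads() {
    pid="$1"
    # -p attaches to the given PID, -batch exits once the commands are done,
    # -ex runs a single gdb command after attaching.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}

# Usage (as root, while slapd is hung), assuming PID 2769 as above:
# dump_all_threads 2769 > slapd-threads.txt
```

Without debug symbols for slapd and libdb, the frames would probably still show function names only, as in the backtrace above.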

I already tried regenerating the indexes, without result. I dropped the database and rebuilt it from the latest backup, which made the problem disappear temporarily. I didn't find anything in the logs, even with the debug level set to 'trace'.

I'm using a bdb backend, with this configuration in slapd.conf:
database        bdb
suffix          "dc=msr-inria,dc=inria,dc=fr"
rootdn          "cn=root,dc=msr-inria,dc=inria,dc=fr"
#rootpw         root
directory       /var/lib/ldap/main

cachesize 1000
idlcachesize 1000
checkpoint 256 5

And this one in DB_CONFIG:
set_cachesize   0       1048576        0
set_lg_bsize 2097152
set_lg_max 10485760
set_flags DB_LOG_AUTOREMOVE
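
For reference, Berkeley DB reads the three set_cachesize fields as gigabytes, additional bytes, and number of cache regions, so the line above asks for a cache of 1 MB. A small sketch of that arithmetic, where cache_bytes is only an illustrative helper name and not part of Berkeley DB:

```shell
# Illustrative helper (not part of Berkeley DB): total requested cache size
# in bytes for a DB_CONFIG line "set_cachesize <gbytes> <bytes> <ncache>".
cache_bytes() {
    gbytes="$1"
    bytes="$2"
    echo $(( gbytes * 1024 * 1024 * 1024 + bytes ))
}

# Values taken from the DB_CONFIG above.
cache_bytes 0 1048576
```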

The full slapd.conf is accessible at http://pastebin.mandriva.com/5801
db_stat -m output is accessible at http://pastebin.mandriva.com/5799

The main database itself is quite small: the LDIF backup is only 1.4. I also have a log database for syncrepl purposes.

I'm using OpenLDAP 2.4.13 with db 4.6.21 on Mandriva Linux 2008.1, a 32-bit system. I'd be happy to provide additional information if needed.

--
Guillaume Rousse
Service des Moyens Informatiques
INRIA Saclay - Île-de-France
Parc Orsay Université, 4 rue J. Monod
91893 Orsay Cedex France
Tel: 01 69 35 69 62