
Re: (ITS#3464) back-meta and deadlock



sebastien.georget@sophia.inria.fr wrote:

>Full_Name: Sébastien Georget
>Version: 2.2.20
>OS: linux
>URL: ftp://ftp.openldap.org/incoming/
>Submission from: (NULL) (138.96.66.17)
>
>
>Given the following (simplified) configuration :
>
>===
>threads 2
>
>database        bdb
>suffix          "o=A,dc=domain"
>
>database        meta
>suffix          "dc=domain"
>uri             "ldap://127.0.0.1/o=A,dc=domain"
>===
>
>slapd(-meta) hangs when two clients try to (simultaneously) do the following
>sequence :
>- anonymous binding
>- search with base dc=domain
>- unbinding
>
>In debug mode a <ctrl-c> results in :
>slapd shutdown: waiting for 4 threads to terminate
>
>And the slapd daemon must be killed with SIGKILL.
>
>Notes :
>- there is no problem doing a search with base o=A,dc=domain
>- the problem doesn't occur when the bdb and the meta backends are handled by
>two different slapd
>- the search filter doesn't seem to have an impact
>
>
>The deadlock may result from the fact that slapd(-meta) accepts two connections
>from the clients and forwards them to the bdb backend, which cannot answer because
>the max threads limit is reached.
>  
>
This is where slapd hangs. The trace is from HEAD code, after I modified
back-meta to use the asynchronous ldap_sasl_bind(), with a timeout on
ldap_result() and a retry after each timeout.

(gdb) thread apply all bt

Thread 4 (Thread -1229780048 (LWP 10912)):
#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb72a8f81 in ___newselect_nocancel () from /lib/tls/libc.so.6
#2  0x0807e167 in slapd_daemon_task (ptr=0x0) at daemon.c:1691
#3  0xb73acdac in start_thread () from /lib/tls/libpthread.so.0
#4  0xb72afa8a in clone () from /lib/tls/libc.so.6

Thread 3 (Thread -1233978448 (LWP 10914)):
#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb72a6a27 in poll () from /lib/tls/libc.so.6
#2  0x08162848 in ldap_int_select (ld=0x835eac8, timeout=0xb672e618)
    at os-ip.c:909
#3  0x0814e7ad in wait4msg (ld=0x835eac8, msgid=1, all=0, timeout=0xb672e690,
    result=0xb672e698) at result.c:302
#4  0x0814e2c5 in ldap_result (ld=0x835eac8, msgid=1, all=0,
    timeout=0xb672e690, result=0xb672e698) at result.c:122
#5  0x0813153a in meta_back_dobind (lc=0x834d3d0, op=0x834f1d0) at bind.c:354
#6  0x080f8306 in meta_back_search (op=0x834f1d0, rs=0xb672f870) at search.c:80
#7  0x08083579 in fe_op_search (op=0x834f1d0, rs=0xb672f870) at search.c:413
#8  0x08082f35 in do_search (op=0x834f1d0, rs=0xb672f870) at search.c:223
#9  0x08080d6e in connection_operation (ctx=0xb672f900, arg_v=0x834f1d0)
    at connection.c:1038
#10 0x0814cd72 in ldap_int_thread_pool_wrapper (xpool=0x82d7418) at tpool.c:467
#11 0xb73acdac in start_thread () from /lib/tls/libpthread.so.0
#12 0xb72afa8a in clone () from /lib/tls/libc.so.6

Thread 2 (Thread -1247622224 (LWP 10921)):
#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb72a6a27 in poll () from /lib/tls/libc.so.6
#2  0x08162848 in ldap_int_select (ld=0x8373178, timeout=0xb5a2b618)
    at os-ip.c:909
#3  0x0814e7ad in wait4msg (ld=0x8373178, msgid=1, all=0, timeout=0xb5a2b690,
    result=0xb5a2b698) at result.c:302
#4  0x0814e2c5 in ldap_result (ld=0x8373178, msgid=1, all=0,
    timeout=0xb5a2b690, result=0xb5a2b698) at result.c:122
#5  0x0813153a in meta_back_dobind (lc=0x8352e80, op=0x8373760) at bind.c:354
#6  0x080f8306 in meta_back_search (op=0x8373760, rs=0xb5a2c870) at search.c:80
#7  0x08083579 in fe_op_search (op=0x8373760, rs=0xb5a2c870) at search.c:413
#8  0x08082f35 in do_search (op=0x8373760, rs=0xb5a2c870) at search.c:223
#9  0x08080d6e in connection_operation (ctx=0xb5a2c900, arg_v=0x8373760)
    at connection.c:1038
#10 0x0814cd72 in ldap_int_thread_pool_wrapper (xpool=0x82d7418) at tpool.c:467
#11 0xb73acdac in start_thread () from /lib/tls/libpthread.so.0
#12 0xb72afa8a in clone () from /lib/tls/libc.so.6

Thread 1 (Thread -1222863040 (LWP 10910)):
#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0xb73adc1d in pthread_join () from /lib/tls/libpthread.so.0
#2  0x0814d798 in ldap_pvt_thread_join (thread=3065187248, thread_return=0x0)
    at thr_posix.c:165
#3  0x0807ebbb in slapd_daemon () at daemon.c:2035
#4  0x08074405 in main (argc=7, argv=0xbfffd034) at main.c:772
#0  0xb75ebc32 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2

A partial fix is in HEAD: aborting after a few retries frees a thread.
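
For illustration only, the idea of a timed wait that is retried a few times and
then abandoned, so that the worker thread is released instead of blocking
forever, could look roughly like the sketch below. This is not the back-meta
code: a plain file descriptor and poll() stand in for the LDAP connection and
ldap_result() (which likewise returns 0 on timeout and -1 on error), and the
names wait_for_reply() and bounded_wait() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>   /* pipe()/write() for the usage described below */

/* Hypothetical stand-in for a timed wait on a bind response.
 * Returns 1 if fd became readable, 0 on timeout, -1 on error. */
static int wait_for_reply(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    /* Restart the wait if a signal interrupted poll(). */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    return rc > 0 ? 1 : rc;
}

/* Retry the timed wait at most max_retries times, then give up.
 * Returns 0 if a reply arrived, -1 on error or after the last timeout.
 * *timeouts reports how many timeouts were seen (assumed bookkeeping,
 * handy for logging). */
static int bounded_wait(int fd, int timeout_ms, int max_retries, int *timeouts)
{
    assert(max_retries > 0 && timeouts != NULL);
    *timeouts = 0;

    for (;;) {
        int rc = wait_for_reply(fd, timeout_ms);

        if (rc == 1)
            return 0;            /* reply is ready to be read */
        if (rc < 0)
            return -1;           /* hard error: do not retry */
        if (++(*timeouts) >= max_retries)
            return -1;           /* abort: the caller can free its thread */
    }
}
```

Called on the read end of an empty pipe() with max_retries set to 3,
bounded_wait() gives up after three timeouts and returns -1; once a byte has
been written to the other end it returns 0 immediately. The bound only frees
the caller, it does not make the far end able to answer, which is presumably
why the fix is described as partial.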

p.



    SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497