
HEAD is hanging quite frequently...



I'm seeing frequent occurrences of slapd getting stuck during trivial
operations; the example below is test003.  The problem is quite
repeatable, although sometimes it hangs in a slightly different place.
I haven't been able to complete the test in tens of attempts, though.

[masarati@ando tests]$ SLAPD_DEBUG=256 ./run test003
Cleaning up test run directory leftover from previous run.
Running ./scripts/test003-search...
running defines.sh
Running slapadd to build slapd database...
Running slapindex to index slapd database...
Starting slapd on TCP/IP port 9011...
Testing slapd searching...
Testing exact searching...
Testing approximate searching...
Testing OR searching...
Testing AND matching and ends-with searching...
Testing NOT searching...
[hung...]

(gdb) thread apply all bt

Thread 4 (Thread 1082132832 (LWP 31897)):
#0  0x0000002a962e3f1c in epoll_wait () from /lib64/tls/libc.so.6
#1  0x0000000000438383 in slapd_daemon_task (ptr=0x0) at daemon.c:1803
#2  0x0000002a9610f0aa in start_thread ()
from /lib64/tls/libpthread.so.0
#3  0x0000002a962e3b43 in clone () from /lib64/tls/libc.so.6
#4  0x0000000000000000 in ?? ()

Thread 3 (Thread 1090525536 (LWP 31899)):
#0  0x0000002a961118da in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/tls/libpthread.so.0
#1  0x00000000005b0ce8 in ldap_pvt_thread_cond_wait (cond=0x80bf30,
    mutex=0x80bf08) at thr_posix.c:282
#2  0x00000000005afa8f in ldap_int_thread_pool_wrapper (xpool=0x80bf00)
    at tpool.c:607
#3  0x0000002a9610f0aa in start_thread ()
from /lib64/tls/libpthread.so.0
#4  0x0000002a962e3b43 in clone () from /lib64/tls/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 2 (Thread 1098918240 (LWP 31900)):
#0  0x0000002a961118da in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/tls/libpthread.so.0
#1  0x00000000005b0ce8 in ldap_pvt_thread_cond_wait (cond=0x80bf30,
    mutex=0x80bf08) at thr_posix.c:282
#2  0x00000000005afa8f in ldap_int_thread_pool_wrapper (xpool=0x80bf00)
    at tpool.c:607
#3  0x0000002a9610f0aa in start_thread ()
from /lib64/tls/libpthread.so.0
#4  0x0000002a962e3b43 in clone () from /lib64/tls/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 182910803744 (LWP 31895)):
#0  0x0000002a9610ff2b in pthread_join ()
from /lib64/tls/libpthread.so.0
#1  0x00000000005b0c40 in ldap_pvt_thread_join (thread=1082132832,
    thread_return=0x0) at thr_posix.c:186
#2  0x0000000000439086 in slapd_daemon () at daemon.c:2179
#3  0x0000000000423c08 in main (argc=8, argv=0x7fbffff168) at main.c:785

The output is:

@(#) $OpenLDAP: slapd 2.X (Oct 16 2005 09:00:22) $

cat testrun/slapd.1.log
masarati@ando:/home/masarati/Lavoro/sysnet/Ldap/ldap/servers/slapd
bdb_db_open: Warning - No DB_CONFIG file found in
directory ./testrun/db.1.a: (2)
Expect poor performance for suffix dc=example,dc=com.
slapd starting
conn=0 fd=11 ACCEPT from IP=127.0.0.1:33162 (IP=127.0.0.1:9011)
conn=0 op=0 BIND dn="" method=128
conn=0 op=0 RESULT tag=97 err=0 text=
conn=0 op=1 SRCH base="" scope=0 deref=0 filter="(objectClass=*)"
conn=0 op=1 SEARCH RESULT tag=101 err=0 nentries=1 text=
conn=1 fd=12 ACCEPT from IP=127.0.0.1:33163 (IP=127.0.0.1:9011)
conn=0 op=2 UNBIND
conn=0 fd=11 closed
conn=1 op=0 BIND dn="" method=128
conn=1 op=0 RESULT tag=97 err=0 text=
conn=1 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0
filter="(sn=jensen)"
conn=1 op=1 SEARCH RESULT tag=101 err=0 nentries=2 text=
conn=1 op=2 UNBIND
conn=1 fd=12 closed
connection_read(12): no connection!
conn=2 fd=12 ACCEPT from IP=127.0.0.1:33164 (IP=127.0.0.1:9011)
conn=2 op=0 BIND dn="" method=128
conn=2 op=0 RESULT tag=97 err=0 text=
conn=2 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0
filter="(sn~=jensen)"
conn=2 op=1 SRCH attr=name
<= bdb_approx_candidates: (sn) index_param failed (18)
conn=2 op=1 SEARCH RESULT tag=101 err=0 nentries=2 text=
conn=2 op=2 UNBIND
conn=2 fd=12 closed
conn=3 fd=12 ACCEPT from IP=127.0.0.1:33165 (IP=127.0.0.1:9011)
conn=3 op=0 BIND dn="" method=128
conn=3 op=0 RESULT tag=97 err=0 text=
get_filter: conn 3 unknown attribute type=undef (17)
conn=3 op=1 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(|
(givenName=xx*yy*z)(?=undefined)(?
=false)(objectClass=groupOfNames)(sn=jones)(member=cn=manager,dc=example,dc=com)(uniqueMember=cn=manager,dc=example,dc=com))"
<= bdb_substring_candidates: (givenName) index_param failed (18)
<= bdb_equality_candidates: (member) index_param failed (18)
<= bdb_equality_candidates: (uniqueMember) index_param failed (18)
conn=3 op=1 SEARCH RESULT tag=101 err=0 nentries=4 text=
conn=3 op=2 UNBIND
conn=3 fd=12 closed
connection_read(12): no connection!
conn=4 fd=12 ACCEPT from IP=127.0.0.1:33166 (IP=127.0.0.1:9011)
conn=4 op=0 BIND dn="" method=128
conn=4 op=0 RESULT tag=97 err=0 text=
conn=4 op=1 SRCH base="ou=groups,dc=example,dc=com" scope=1 deref=0
filter="(&(objectClass=groupOfNames)(cn=a*)(member=cn=mark
elliot,ou=alumni association,ou=people,dc=example,dc=com))"
<= bdb_equality_candidates: (member) index_param failed (18)
conn=4 op=1 SEARCH RESULT tag=101 err=0 nentries=2 text=
conn=4 op=2 UNBIND
conn=4 fd=12 closed
connection_read(12): no connection!

I note those strange "connection_read(12): no connection!" messages (no
writing seems to be occurring at all during this test).
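
To make those messages easier to spot in longer logs, here is a minimal
sketch that pairs each "no connection!" line with the connection that
most recently closed the same fd.  The function name and the regular
expressions are my own assumptions, derived only from the log lines
quoted above; nothing here comes from slapd itself.

```python
import re

# Patterns are assumptions based on the log excerpt in this message.
CLOSED_RE = re.compile(r"conn=(\d+) fd=(\d+) closed")
STRAY_RE = re.compile(r"connection_read\((\d+)\): no connection!")


def find_stray_reads(lines):
    """Return (fd, last_closed_conn) for each 'no connection!' message.

    last_closed_conn is the connection that most recently closed that fd,
    or None if no close of that fd was seen earlier in the log.
    """
    last_closed = {}  # fd -> id of the connection that last closed it
    stray = []
    for line in lines:
        m = CLOSED_RE.search(line)
        if m:
            last_closed[int(m.group(2))] = int(m.group(1))
            continue
        m = STRAY_RE.search(line)
        if m:
            fd = int(m.group(1))
            stray.append((fd, last_closed.get(fd)))
    return stray


# Four lines copied from the log above.
sample = [
    "conn=1 op=2 UNBIND",
    "conn=1 fd=12 closed",
    "connection_read(12): no connection!",
    "conn=2 fd=12 ACCEPT from IP=127.0.0.1:33164 (IP=127.0.0.1:9011)",
]
print(find_stray_reads(sample))  # [(12, 1)]
```

Run over the whole of testrun/slapd.1.log (for instance by passing the
open file object as `lines`), it should pair the three messages in the
excerpt above with conn=1, conn=3 and conn=4: each one immediately
follows the close of fd 12.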

The system is CentOS 4.2 (a Red Hat AS4 clone) running on AMD64.

Any clues?  I'm not submitting an ITS yet because I suspect something
trivial on my side.

p.



    SysNet - via Dossi,8 27100 Pavia Tel: +390382573859 Fax: +390382476497