[Date Prev][Date Next] [Chronological] [Thread] [Top]

Long query times: OpenLDAP 1.2.7 + BDB 2.7.7 + Solaris 7

I've recently built OpenLDAP 1.2.7 with Berkeley DB 2.7.7 using

env \
                 ac_cv_func_pthread_create=no ol_cv_kthread_flag=no \
                 ol_cv_pthread_flag=no ol_cv_pthreads_flag=no \
                 ol_cv_thread_flag=no \
                 CPPFLAGS="-I/usr/local/BerkeleyDB/include" \
                 LDFLAGS="-L/usr/local/BerkeleyDB/lib" \
         ./configure \

as per http://www.openldap.org/faq/data/cache/160.html

on a Sun E450 / Solaris 7 that with two LDAP servers has a load of around

The server gets about 10queries/sec and is serving a directory around 60,000
entries. The queries typically either return immediately or take a long time,
e.g. sometimes over a minute to return results as though it was blocking
somewhere--the bind and result can also come a long time after the initial
query is issued. Virtual Adrian, Sun's free diagnostic tool, reports a lot of
"Listen queue dropout, speedup accept processing" warnings. I have increased
the listen queue beyond Sun's original amounts as per
http://www.rvs.uni-hannover.de/people/voeckler/tune/EN/tune.html (which is
superb) but I'm not sure the issue is really here. The Innosoft directory
server running on this machine isn't suffering TCP issues.

Has anyone else experienced these kind of problems? Anyone feeling confident
I'm more than happy to start sending more specific, directed diagnostic &
debug output (I am collecting -d 65535 info now).

It's pausing for several seconds or more during which time CPU util drops to
almost nothing on:
(random samples of tail -f pausing):

=> string_expand: expanded: ^1[^2].*
=> regex_matches: string:
=> regex_matches: rc: 0 matches
<= acl_access_allowed: matched by clause #2 access granted

=> id2entry_add( 10588, "cn=Luiz,ou=employee,o=SLB,c=AN" )
=> ldbm_cache_open( "/export/ldap/openldap/dbm/replica/id2entry.dbb", 7,
600 )
<= ldbm_cache_open (cache 0)
<= id2entry_add 0
send_ldap_result 0::

=> id2entry_add( 1219, "cn=Debbie,ou=employee,o=SLB,c=AN" )
=> ldbm_cache_open( "/export/ldap/openldap/dbm/replica/id2entry.dbb", 7,
600 )
<= ldbm_cache_open (cache 0)
<= id2entry_add 0
send_ldap_result 0::

=> id2entry_add( 50436, "cn=Lindsay,ou=employee,o=SLB,c=AN" )
=> ldbm_cache_open( "/export/ldap/openldap/dbm/replica/id2entry.dbb", 7,
600 )
ldbm_cache_open (blksize 8192) (maxids 2046) (maxindirect 2)
<= ldbm_cache_open (opened 0)
<= id2entry_add 0
send_ldap_result 0::

...after which is resumes copious debug output altho response time is still
v. slow.

When I ran it with ulimit -n 64 (default) I very rapidly found:

select activity on 4 descriptors
accept() failed errno 24 (Too many open files)listening for connections on 3,
activity on: 12r 15r 16r 18r 19r 21r 22r 23r 24r 25r 26r 27r 28r 29r 30r 31r
32r 33r 34r 35r 36r 37r 38r 39r 40r 41r 42r 43r 44r 45r 46r 47r 48r 49r 50r
51r 52r 53r 54r 55r 56r 57r 58r 59r 60r 61r 62r 63r
before select active_threads 0

Presumably building up quickly since it wasn't responding to queries as fast
as one'd expect.


Thus spake the Master Programmer:
  "Let the programmers be many and the managers few --
    then all will
      be productive." (http://misspiggy.gsfc.nasa.gov/tao.html)