Re: thread problem OpenLDAP 2.1.8 + Solaris 8

--On Sunday, December 01, 2002 9:34 AM -0800 "Kurt D. Zeilenga" <Kurt@OpenLDAP.org> wrote:

At 09:41 AM 2002-11-26, Quanah Gibson-Mount wrote:

I note that addressing posts to an individual will often result in your post not being responded to by others on this list...

That had been done somewhat intentionally due to the patches. However, everyone please consider this reply open to comments from all. ;)

Having now done much tweaking with openldap, we are now up to a dismal
average of 14 queries/second with 20 hosts over a 74.5 minute period.

I am fairly positive we are indexing everything necessary.  All we are
doing right now is filtering on suseassunetid to find the sumaildrop
attribute.  Both of those attributes are indexed, as are objectClass,
entry, uid, dc, and cn.  Example query: ldapsearch -h <host>
-b"cn=accounts,dc=stanford,dc=edu" suseassunetid=quanah sumaildrop

I note that a query like this is actually quite complex: it includes all of the TCP connection establishment, possibly DNS reverse lookups, SASL authentication and security layer establishment, identity mapping (possibly with search indirection), and so on. That is, a lot more than a single LDAP search operation.

A few things we have noticed:

One:  slapd spends a lot of time determining the ACLs for attributes,
every single time a query is made, instead of caching the ACLs.  I am
interested in ITS #1523, but do not have access to the patches.  Would
it be possible to send them to me, so I can see if that improves our
performance any?

Seems the patch was uploaded under a different name. I've added a symlink to the repository so that URL in the ITS now works.

Thanks. I've updated it to patch against OpenLDAP-2.1.8. Would you like the new patches put anywhere? Unfortunately, this ACL caching is only per connection, rather than persistent across connections, so it doesn't do what I was hoping.

Two:  slapd will be very responsive and quick in its answers for a
period of time, then will just hang for several seconds, or greatly slow
down the rate at which it is returning queries.  It will then pick up
again, being quite responsive.

I would look at three areas where such stalls can easily occur. One, something in the connection accept code (which is single threaded) hanging up on something... like reverse DNS lookups. Two, something in the lower-level authentication framework (SASL, GSSAPI, Kerberos) hanging things up. Three, contention for DB pages (if using back-bdb) or the LDBM giant r/w lock (if using back-ldbm).

We run a local DNS caching system, so reverse lookups should not be a problem. Regardless, I turned off reverse lookups and ran our tests, and I still had delays of up to 5 seconds between starting a search and getting a result (and that was with just a single machine querying). Also, given that we see the same cyclical pausing when we run tests using anonymous binds, I believe that SASL/GSSAPI/Kerberos is not at issue either. I do agree that contention for the DB could be an issue. We are hoping to bring in a consultant in the near future to help with the DB side. Unfortunately, I am not clear on how to read the conflict matrix that Berkeley DB produces. I will take a look at the TCP/IP connections being in TIME_WAIT as well.
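For anyone who wants to reproduce the pauses, here is a rough sketch of how the per-query delays could be recorded from a single client. It is not the harness we actually used; the function names (time_queries, find_stalls, summarize) and the 1-second stall threshold are made up for illustration, and run_query is a placeholder for whatever issues the search.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_queries(run_query: Callable[[], None], count: int,
                 clock: Callable[[], float] = time.monotonic) -> List[float]:
    """Call run_query() `count` times; return each call's latency in seconds."""
    latencies = []
    for _ in range(count):
        start = clock()
        run_query()  # hypothetical: one complete search, connect through result
        latencies.append(clock() - start)
    return latencies


def find_stalls(latencies: List[float],
                threshold: float = 1.0) -> List[Tuple[int, float]]:
    """Return (query index, latency) for every query slower than `threshold`."""
    # The 1.0 s default is an arbitrary assumption, not a tuned value.
    return [(i, lat) for i, lat in enumerate(latencies) if lat > threshold]


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Overall queries/second and worst-case latency for one client."""
    total = sum(latencies)
    return {
        "queries": len(latencies),
        "qps": len(latencies) / total if total > 0 else 0.0,
        "max": max(latencies, default=0.0),
    }


# One possible run_query (an assumption, shown only as a comment so that
# nothing is executed on import): shell out to the example query from
# earlier in this thread, with <host> filled in.
#
#   import subprocess
#   def run_query():
#       subprocess.run(["ldapsearch", "-h", "<host>",
#                       "-b", "cn=accounts,dc=stanford,dc=edu",
#                       "suseassunetid=quanah", "sumaildrop"],
#                      stdout=subprocess.DEVNULL, check=True)
```

Logging the indices returned by find_stalls alongside wall-clock time should show whether the pauses are periodic, which would help separate DB contention from connection-level causes. Note that timing a whole ldapsearch invocation includes process startup and connection setup, not just the search operation.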


Quanah Gibson-Mount
Senior Systems Administrator
ITSS/TSS/Computing Systems
Stanford University
GnuPG Public Key: http://www.stanford.edu/~quanah/pgp.html