[Date Prev][Date Next] [Chronological] [Thread] [Top]

Please comment




I have a few fairly minor modifications I'd like to commit, but first I'd like to get some comments. I've done some profiling on Unix and Windows NT, using the LDBM backend, and have found three changes that significantly improve performance:


1) the ldbm backend is flushing caches too often. Adding a simple "dirty" flag and flushing only when the DB is dirty noticeably improves performance. My guess is that the sleepy backend will be the better option going forward, but many people have ldbm-based servers already, and this simple change gives a benefit to those people.

2) currently, a new thread is created for each operation, which introduces huge overhead. I've implemented simple thread-pooling for the NT thread code, and have seen 5-fold performance improvement. The changes I've made are limited to thr_nt.c, which has the benefit that testing (and integration) is easier, but the drawback that other thread modules still use the same 1 thread per operation model. It's easy to port the thread pooling code to the other thread models, and I haven't looked at how easy it would be to push the thread-pooling code "outside" of the platform-specific thread code. Any comments here? Preferences?

3) Here's a neat test: run a bunch of clients against four LDAP servers that are identical except for one thing. In one, use the ACL:

   access to * by * write

in another, use:

   access to * by dn="cn=YourName, ou=People, dc=..." write

and in another use:

   access to * by dn=".*ou=People, dc=..." write

and in the last use:

   access to * by dn=".*ou=.*, dc=.*, ..." write

Obviously, the DNs above should make sense for the setup. The idea in the last case is to have a "complex" regex.

What you'll see is that performance on the last three cases is about the same, while the first case is *much* faster, relatively speaking.

I've tested some simple changes that special case the two middle examples above to do a simple DN match and a DN suffix match, respectively. In this case, I've seen a factor of 10 improvement in performance (obviously, the improvement depends on what your ACLs look like).

My implementation is sort of a hack, to make it easier to integrate the changes into the official source. If nobody protests, I will add this special case code to the official source, but will clean it up a bit for better maintainability.