[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: Please comment

At 07:32 PM 6/4/00 -0400, Mark Valence wrote:
>I have a few fairly minor modifications I'd like to commit, but first 
>I'd like to get some comments.  I've done some profiling on Unix and 
>Windows NT, using the LDBM backend, and have found three changes that 
>significantly improve performance:
>1) the ldbm backend is flushing caches too often.  Adding a simple 
>"dirty" flag and flushing only when the DB is dirty noticeably 
>improves performance.

If the dirty flag is a true (that is, we sync after each
and very cache modify), I have no objection.  Safety over
speed, please.

>My guess is that the sleepy backend will be 
>the better option going forward, but many people have ldbm-based 
>servers already, and this simple change gives a benefit to those 

Yes, or other reliable backend.

>2) currently, a new thread is created for each operation, which 
>introduces huge overhead.  I've implemented simple thread-pooling for 
>the NT thread code, and have seen 5-fold performance improvement. 
>The changes I've made are limited to thr_nt.c, which has the benefit 
>that testing (and integration) is easier, but the drawback that other 
>thread modules still use the same 1 thread per operation model.  It's 
>easy to port the thread pooling code to the other thread models, and 
>I haven't looked at how easy it would be to push the thread-pooling 
>code "outside" of the platform-specific thread code.  Any comments 
>here?  Preferences?

I rather we implement the ldap_pvt_thread_pool* routines built
upon ldap_pvt_thread* routines and use this in slapd as needed.
One can implement pool management routines using mutexes and
conditions, per Nichols "Pthreads Programming" (O'reilly) or
Butenholf "Programming with POSIX Threads" (Addison-wesley) or ....

>3) Here's a neat test:  run a bunch of clients against four LDAP 
>servers that are identical except for one thing.  In one, use the ACL:
>    access to * by * write
>in another, use:
>    access to * by dn="cn=YourName, ou=People, dc=..." write

regex won't match any dn (due to extra spaces)

>and in another use:
>    access to * by dn=".*ou=People, dc=..." write

regex won't match any dn (due to extra spaces)

>and in the last use:
>    access to * by dn=".*ou=.*, dc=.*, ..." write

regex won't match any dn (due to extra spaces)

>Obviously, the DNs above should make sense for the setup.  The idea 
>in the last case is to have a "complex" regex.

But that complex regex is inefficient.  .* or .+ should
be avoid except at the end of a regex. 

>What you'll see is that performance on the last three cases is about 
>the same, while the first case is *much* faster, relatively speaking.

The first shortcut regex processing, as do "self", "anonymous",
"users", dn="", dn=".+", dn=*, and many varients.  A simple
DN could be shortcutted as well.

>I've tested some simple changes that special case the two middle 
>examples above to do a simple DN match and a DN suffix match, 
>respectively.  In this case, I've seen a factor of 10 improvement in 
>performance (obviously, the improvement depends on what your ACLs 
>look like).

I wouldn't mind special casing a simple DN match (ie: a
regex that is text only).... but not beyond this.

>My implementation is sort of a hack, to make it easier to integrate 
>the changes into the official source.  If nobody protests, I will add 
>this special case code to the official source, but will clean it up a 
>bit for better maintainability.

I don't think your 3 or 4 cases should be shortcutted.

A note: we should be a bit careful of over optimizing DN
processing as we will likely have to rework DNs later to
fully implement RFC 2253.