
Re: Some openldap fixes...



At 08:44 PM 9/17/00 +0200, Marijn Meijles wrote:
>We'll now summarize the changes and fixes:
>
>- Removal of flush after key removal. This is WAY faster and not dangerous.

Syncing needs work.  We have a couple of suggestions in this
area and need to determine the best course of action.  Also note
that I'm working on a replacement backend, so I caution against
massive rework of back-ldbm.  [I will soon post a list of things
folks can do to help implement the replacement backend (actually,
the biggest thing would be to pick up "other tasks").]

>- Because flushing after every key insert was too slow and no write syncing
>resulted in data loss, we introduced 'lazy syncing', which only syncs all
>databases if the last sync was more than a second ago.

See above.
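
For reference, the "lazy syncing" rule described in the quoted text can be sketched as follows. This is only an illustration of the stated rule (sync when the last sync was more than a second ago), not code from the patch or from back-ldbm; struct lazy_sync and lazy_sync_due() are hypothetical names.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-database state; not a back-ldbm structure. */
struct lazy_sync {
    time_t last_sync;   /* time of the last sync */
    bool   dirty;       /* true if writes happened since then */
};

/* Record a write at time `now` and report whether a sync is due.
 * A sync is due only when more than min_interval seconds have
 * passed since the previous one. */
static bool lazy_sync_due(struct lazy_sync *ls, time_t now, time_t min_interval)
{
    ls->dirty = true;
    if (now - ls->last_sync > min_interval) {
        ls->last_sync = now;   /* caller is expected to sync now */
        ls->dirty = false;
        return true;
    }
    return false;
}
```

Note that in this sketch a write that is not followed by a later write stays marked dirty, so something else (a timer or a shutdown hook) would have to flush it.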

>- Add some entries to the connection array. select sometimes gives back fd
>1024, which leads to a crash because the array goes from 0..1023.

select()? or accept()?  2.0 includes code to protect against
overrun of dtblsize-sized arrays (including fd_sets).
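
The kind of guard meant here can be sketched as a bounds check on the descriptor before it is used as an array index. This is an illustrative sketch, not the actual 2.0 code; fd_in_table() is a hypothetical helper and table_size stands in for dtblsize.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if fd can safely index an array of table_size slots,
 * for example a connection array sized to the descriptor table. */
static bool fd_in_table(int fd, size_t table_size)
{
    return fd >= 0 && (size_t)fd < table_size;
}
```

With a 1024-slot table, descriptor 1023 passes and 1024 is rejected, so the caller can close the new connection instead of writing past the end of the array.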


>- Impose an upper limit on cache entries. We'd rather have a lot of 
>small entries in the cache than a few big ones.

>- Reduce priority of the select thread. Otherwise the system will suffer
>from thread saturation. The whole connection part is terribly inefficient
>anyway when used with threads and will be replaced shortly.

2.0 uses thread pools to reduce thread overhead and to avoid
saturation.  Still needs work (to reduce amount of work done
in listener thread).

>- Added a usleep to cache_find_entry_id in order to avoid excessive cache
>mutex locking and run away threads. Right now it's an ugly fix, but the
>idea is ok.
>
>- Added an interface option so you can specify the interface you want to
>bind.

Provided in 2.0... (and, IIRC, there's a contributed patch for
1.2 in the Issue Tracking System).

>- The pagesize for id2entry is now configurable. If you put larger items
>in the ldap db, you'll want to raise this value, but not necessarily
>the pagesize for all the db's.

Knobs are nice... the new backend will be highly configurable...
(if only I had more time to work on it).

>- Not included in the patch, but well worth mentioning is the fact that
>you can put extra yieldpoints into the BDB2 code when using a cooperative
>mt package. This will greatly enhance responsiveness. If you want more 
>info on this, please ask.


>Peter made the rest of the descriptions:
>
>- removed NEXTID file
>        * instead use the value of the last key in the id2entry db

2.0 maintains the next id in a DB file.

>- removed the explicit dn index
>        * couldn't find any use, and the program kept working just fine :)

1.2 (and 2.0) use DN indexing to properly scope searches.  Removing
the DN index has side effects.  2.0 has new DN indexing to improve speed
(1.2 used substring indexing, which was horrible, so it was disabled by
default [which meant that you could not place multiple suffixes in
one database by default]).

>- fixed search scopes
>* modified/removed filter alteration in (onelevel|subtree)_candidates
>          scope enforcement is done ldbm_back_search anyway.

DN indices don't enforce scope, they just limit the number of
candidates you must test.  1.2 indices, especially for subtree
scoping, were not terribly effective (and are turned off by default).
2.0 sports new DN indexing (which is always on).
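
To illustrate the distinction: whatever the index returns, each candidate still has to pass a scope test against the search base.  A minimal sketch of such a test follows, assuming normalized DN strings, a non-empty base, and no escaped commas in RDN values.  The enum and dn_in_scope() are hypothetical names, not the slapd API.

```c
#include <stdbool.h>
#include <string.h>

enum scope { SCOPE_BASE, SCOPE_ONELEVEL, SCOPE_SUBTREE };

/* Test whether dn falls within scope relative to base.
 * Assumes normalized DNs, a non-empty base, and no escaped commas. */
static bool dn_in_scope(const char *dn, const char *base, enum scope scope)
{
    size_t dlen = strlen(dn), blen = strlen(base);

    /* Same length: dn can only match as the base entry itself. */
    if (dlen == blen)
        return scope != SCOPE_ONELEVEL && strcmp(dn, base) == 0;
    if (scope == SCOPE_BASE)
        return false;

    /* Otherwise dn must have the form "<rdns>,<base>". */
    if (dlen < blen + 2 || dn[dlen - blen - 1] != ',' ||
        strcmp(dn + dlen - blen, base) != 0)
        return false;
    if (scope == SCOPE_SUBTREE)
        return true;

    /* One-level: exactly one RDN in front of the base. */
    return memchr(dn, ',', dlen - blen - 1) == NULL;
}
```

An index that returns too many candidates only costs time, since this test filters them; an index that returns too few silently loses results.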

>        * fixed that butt-ugly for-loop in ldbm_back_search

That's not terribly specific.  The only for loop in ldbm_back_search
looks fairly reasonable (at least in HEAD, but I don't recall 1.2
being much different).

>        * replaced all calls to idl_allids with give_children
>
>        id2children.c::ID_BLOCK * give_children( Backend *,
>                                                 Entry * base,
>                                                 int scope )
>
>          uses the id2children db to construct a list of all id's within the
>        specified base/scope pair.

id2children was replaced by additional DN indices in 2.0.  The
new code supports indices for scope base, one-level, and subtree.


>- fixed scopelessness of indices:

attribute assertion indices should be orthogonal to scope.

>TODO:
>
>- remove hard limit on idl block splits, this is VERY annoying when
>the db gets a bit bigger ;(

The IDL code needs work.  A number of optimizations still need
to be done, such as removing for-loop copies (mostly done) and
using qsort()/bsearch() for large blocks.
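
As a rough illustration of the qsort()/bsearch() idea, here is a sketch that sorts a block of IDs once and then does membership tests in logarithmic time.  The ID typedef and the idl_sort()/idl_contains() helpers are stand-ins for illustration, not the existing IDL code.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef unsigned long ID;   /* stand-in for the entry ID type */

/* Three-way comparison of two IDs, avoiding subtraction overflow. */
static int id_cmp(const void *a, const void *b)
{
    ID x = *(const ID *)a, y = *(const ID *)b;
    return (x > y) - (x < y);
}

/* Sort a block of n IDs in ascending order. */
static void idl_sort(ID *ids, size_t n)
{
    qsort(ids, n, sizeof *ids, id_cmp);
}

/* Binary search a sorted block of n IDs for id. */
static bool idl_contains(const ID *ids, size_t n, ID id)
{
    return bsearch(&id, ids, n, sizeof *ids, id_cmp) != NULL;
}
```

The search is only valid on a block that is kept sorted, so inserts would need to preserve the order or re-sort.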

I plan to ditch the IDL code in my replacement backend (if duplicate
keys work well enough).

>- rewrite cache in order to avoid one mutex for the whole cache

I think cache redesign is best left for the replacement backend.
The new cache will be managed by the database.

>- rewrite str2entry and entry2str so that they use a number of memory
>slots. this is to avoid the mutex locking they employ now.

We have a patch submitted against 2.0 to do this... it needs a
little work.

Kurt