[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: (ITS#4943) tpool.c pause vs. finish



h.b.furuseth@usit.uio.no wrote:
> OK, I've fixed some of it in HEAD.  See comments in the commits for
> tpool.c 1.63 and outwards.

Quite a lot of changes, will take some time to evaluate the overall impact.
> 
> RE23 has the same problems.  Nearly identical code sans the new
> semaphore stuff and a few variable names, so fixes should be easy
> to import when they've been tested a bit.

The semaphore stuff probably needs to be axed/unifdef'd. It is unusable, 
it would only result in deadlocks, and I think I already removed the 
slapd code that would have invoked it.

> Remaining problems:
> 
> tpool.c breaks if there are multiple pools.  The simplest
> solution seems to be to remove support for multiple pools.
> The problem is thread_keys[].  It is shared between pools, but:

slapd only uses a single pool, and that's really the only use case we 
care about. In fact it makes no sense to support multiple pools, because 
we have no mechanism to assert scheduling priorities of one pool over 
another.

> Other thread_keys[] strangeness:
> 
> - setkey does not free the old data with ltk_free.  Possibly that's
>   intentional, but if so that requires weird code elsewhere.

Yes, that's intentional. ltk_free is only intended for cleanup if a key 
is still set by the time a thread exits. All of the keys we set are live 
for the entire life of a thread, they don't get re-assigned.

> - maybe related to how getkey can return the ltk_free function - which
>   seems crazy since a caller who uses it must then explicitly reset the
>   key afterwards (or better, before), to be sure purgekey doesn't
>   see the data and free it again.

Nothing actually depends on returning the ltk_free function, that was 
just included for completeness' sake.

> Other notes:
> 
> - I've inserted new, possibly too aggressive, waits for ltp_pause in
>   pool_wrapper.  See the tpool.c 1.65 comment.  Yet another alternative
>   than splitting the ltp_pause state might be to remove the requirement
>   that purgekey() can only be used by a paused thread.  (Could loop
>   until the key is _really_ gone instead, or something.)

Will look at this later.
> 
> - tpool.c uses LDAP_FREE(), can call ldap_pvt_thread_pool_context(),
>   which uses thread_keys[].  If one thread does that when another thread
>   has paused the pool, that can break since thread_keys[] is reserved
>   for ldap_pvt_thread_pool_purgekey() during pauses.

Since there is only one thread pool in use, it is impossible for another 
thread to be active.

> - ltp_pending_count includes threads sleeping in pool_pause() - should
>   it?  In effect, pool_pause() temporarily reduces ltp_max_pending.

I don't see why any special consideration is needed here. Nothing else 
in the thread pool is going to move while a pause is in effect.

> - In ldap_pvt_thread_pool_submit():
> 
>   This comment:
> 			/* there is another open thread, so this
> 			 * context will be handled eventually.
> 			 * continue on and signal that the context
> 			 * is waiting.
> 			 */
>   is wrong if the previous if (pool->ltp_open_count == 0) was
>   taken but the no thread was found inside that.

The comment may be incomplete...

>   Also if ctx is invalid (another thread handled it) and a new
>   pending request has gotten the now-unused ctx, then the
>   'if (pool->ltp_open_count == 0)' code will free that new
>   pending request instead of the old one.

Most likely this is due to unlocking and relocking the mutex around the 
semaphore code. Axe that, remove the gratuitous unlock/relock, and then 
that problem goes away.

> - Note, I haven't looked at the new semaphore stuff yet.
> 
> Will get back to this later, but could use review so far.  In particular
> I'd like to know if we'll keep or kill multi-pool support.

Thus far we've only had to worry about a single pool. If anyone can come 
up with a reason to need multiple pools, I guess we need to hear it. 
Given the lack of scheduling control, I don't see how it would be of any 
benefit. At the same time, I don't see a compelling reason to perturb 
the code at the moment.
-- 
   -- Howard Chu
   Chief Architect, Symas Corp.  http://www.symas.com
   Director, Highland Sun        http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP     http://www.openldap.org/project/