
Re: (ITS#4943) tpool.c pause vs. finish



OK, here's where we stand now, with my TODOs, questions and some
new issues.  3+ pages.  I thought this would be a brief message :-(
Hopefully I'll get to some of it this weekend, but no promises.


TODOs (unless someone protests) and questions:

* When a thread is finishing, make it go active for context_reset().
  Should make slapi module destructors safer, and I don't see that it
  can hurt.

* I'd like to rename thread contexts to tasks(?) and user contexts back
  to contexts (as in RE23).  The current terminology is confusing.

* Save and restore the old data for key slap_sasl_bind, and just in case
  (I can't tell if it is needed) smbk5pwd.

  Might as well do it everywhere we have a setkey / do something /
  delete-key sequence within one function call.  Since setkey(non-null)
  isn't supposed to overwrite existing keys, the change should not be
  able to hurt.  It could protect something I didn't think of, or a
  future slapd change.

  API: the smallest change would be to call pool_getkey() first,
  but I prefer a pool_swapkey(ctx, key, &data, &kfree) function.
  Then in RE24, maybe remove the &kfree parameter to pool_getkey().

  Could omit the &kfree parameter to swapkey() if new data was
  guaranteed non-NULL, but that'd require a malloc in bdb (silly says
  Howard) or passing data in a &union{integer, ptr} (cumbersome say I).
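
  To make the idea concrete, here is a rough standalone sketch of the
  swapkey semantics I have in mind - a toy model, not the actual tpool
  code, and the names (toy_swapkey, keyfree_t, slots[]) are made up:

      /* One (key, data, kfree) slot per key, as in ltu_key[]; swapkey
       * exchanges the stored (data, kfree) pair with the caller's, so
       * the caller can restore the old pair when it is done.  Unlike
       * setkey(non-null), overwriting an existing key is the point. */
      typedef void (keyfree_t)( void *key, void *data );

      struct slot { void *key, *data; keyfree_t *kfree; };
      static struct slot slots[32];        /* stands in for ltu_key[] */

      static int
      toy_swapkey( void *key, void **data, keyfree_t **kfree )
      {
          int i, empty = -1;
          void *olddata;
          keyfree_t *oldkfree;

          for ( i = 0; i < 32; i++ ) {
              if ( slots[i].key == key ) break;
              if ( slots[i].key == NULL && empty < 0 ) empty = i;
          }
          if ( i == 32 ) {                 /* key not stored yet */
              if ( empty < 0 ) return -1;
              i = empty;
              slots[i].key = key;
          }
          olddata = slots[i].data;  oldkfree = slots[i].kfree;
          slots[i].data = *data;    slots[i].kfree = *kfree;
          *data = olddata;          *kfree = oldkfree;
          return 0;
      }

  A caller would swapkey() in its new (data, kfree) pair, do its work,
  then swapkey() the returned pair back in to restore whatever was
  there before.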

* Axe semaphore code in tpool.c, the ldap_lazy_sem_init() call in
  slapd/daemon.c, and with them LDAP_PVT_THREAD_POOL_SEM_LOAD_CONTROL
  in ldap_pvt_thread.h and SLAP_SEM_LOAD_CONTROL in slap.h.

* Is the ldap_pvt_thread_pool_context_reset() call in slapd/init.c
  really safe - i.e. its slap_sl_mem_destroy() call?  It happens not
  during startup but in slap_destroy(), and slapd does various things
  after that.

* API changes:

  Do we include the assert(no multiple pools) in the import to RE23?

  Remove keys from ltu_key[] _before_ calling ltk_free() with them?
  That prevents functions called from ltk_free from seeing freed thread
  data, but it can also break ltk_free functions that call such
  functions _before_ freeing their data.  Probably not for RE23.

* Scheduling: If several threads call pool_pause(), tpool does not
  schedule all of them once a pause happens.  They could all be handled
  during that pause, or another thread could undo the pause, so tpool
  would have to wait to pause again.  Is that deliberate?


Dubious code, new issues, old ones I'm not sure we finished with:

* thread_keys[] still has problems.

  - pool_context() breaks if several ldap_pvt_thread_t bit patterns can
    represent the same thread ID: TID_HASH() would give different hashes
    for the same thread, and pool_context() stops searching when it hits
    a slot with ctx == NULL.  (My previous bug report was a bit wrong.)

    The best fix would be to use real thread-specific data instead.
    Just one key holding the ctx for now; that minimizes the changes.
    OTOH it also means we'll do thread-specific key lookup twice - first
    in pthread to get the ctx, and then in ltu_key[] to find our data.
    Anyway, I said I'd do that later but might as well get on with it,
    at least for pthreads.  Except I don't know if that's OS-dependent
    enough that it should wait for RE24?
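
    For pthreads, a minimal sketch of what I mean (plain pthread calls
    just for illustration; the real thing would sit behind the libldap_r
    wrappers and still need a fallback for the other thread packages):

        #include <pthread.h>

        static pthread_key_t  ctx_key;
        static pthread_once_t ctx_key_once = PTHREAD_ONCE_INIT;

        static void ctx_key_make( void )
        {
            pthread_key_create( &ctx_key, NULL );
        }

        /* pool_wrapper() would call this instead of thread_keys[] */
        static void set_pool_context( void *ctx )
        {
            pthread_once( &ctx_key_once, ctx_key_make );
            pthread_setspecific( ctx_key, ctx );
        }

        /* and ldap_pvt_thread_pool_context() would reduce to this */
        static void *get_pool_context( void )
        {
            pthread_once( &ctx_key_once, ctx_key_make );
            return pthread_getspecific( ctx_key );
        }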

    When that's not implemented, I'd prefer to cast the thread ID to an
    integer when that is safe, and hash that instead of its bytes.  That
    won't work for structs/unions, and is not formally guaranteed for
    pointers or, in the case of pthreads, even for integers.  BTW, when
    it doesn't work, we've got a problem in ITS#4983 (tls.c id_callback)
    as well.

    When that's not possible, we could search the entire table before
    giving up.  Could make thread_keys[] smaller and reduce the maximum
    allowed #threads on such hosts.  (What should the limit be?)

  - thread_keys is a poor hash table implementation with a poor hash
    function.

    For the hash function, I'll insert a variant of Paul Hsieh's
    SuperFastHash() instead, or something simpler if we hash an integer.
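
    For the integer case, something like Knuth's multiplicative hashing
    should do - just a sketch, assuming the table size stays a power of
    two (LDAP_MAXTHR is 1024 now):

        /* hash an integer thread ID into a power-of-two sized table */
        static unsigned
        tid_ihash( unsigned long tid, unsigned table_size )
        {
            /* golden-ratio multiplier; fold the high bits back in
             * before masking so they contribute to the slot index */
            unsigned long h = tid * 2654435761u;
            return (unsigned) ( h ^ ( h >> 16 ) ) & ( table_size - 1 );
        }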

    For the hash table, the real fix would be to grab a proper hash
    table implementation and use that.  We might find use for it
    elsewhere too.  Though simple fixes will handle all but some rare
    cases, and I don't know how much we care:

    The table is never resized, and lookup slows down as it gets full.
    A quick fix would be to allow at most (70% of table size) threads or
    something, and to double the size of the table if we still want to
    allow 1024 threads.

    However, if the admin experiments with ltp_max_count for a while so
    that threads repeatedly stop and start, the table slowly fills up
    with DELETED_THREAD_CTX entries: longer search chains, longer thread
    ID lookup time.  That should happen rarely, but we don't advise
    against it in the docs either.  Do we care, beyond documenting it?
    If yes, we need a real hash table implementation, or at least to
    re-hash the table when there are too many DELETED_THREAD_CTX
    entries.  Or use a chained hash table - no malloc needed, use the
    thread's stack space.

    For open addressing, I was thinking quadratic probing would be
    better than linear, but maybe not with a proper hash function.  I
    don't know what that does to cache behavior.

    Oops - for caching, maybe I shouldn't have removed the thread id
    from thread_keys[].  Or maybe the hash value should be stored there
    instead, so we can compare with == instead of ldap_pvt_thread_equal.
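
    To sketch what I mean (illustration only - the struct and names are
    not the real tpool code, and on a hash match we would still verify
    the thread ID itself unless collisions can be ruled out):

        typedef struct tkey_slot {
            unsigned   hash;  /* cached hash of the thread ID */
            void      *ctx;   /* NULL, DELETED_CTX, or a live context */
        } tkey_slot;

        #define DELETED_CTX ((void *) -1)   /* tombstone */

        /* Open-addressing lookup: compare cached hashes with ==, and
         * probe past DELETED_CTX tombstones.  Linear probing here;
         * quadratic would step with i = ( i + ++step ) & mask. */
        static void *
        tkey_lookup( tkey_slot *tab, unsigned mask, unsigned h )
        {
            unsigned i = h & mask, n;

            for ( n = 0; n <= mask && tab[i].ctx != NULL;
                  n++, i = ( i + 1 ) & mask )
            {
                if ( tab[i].ctx != DELETED_CTX && tab[i].hash == h )
                    return tab[i].ctx;   /* candidate found */
            }
            return NULL;
        }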

* I wonder if there are long-lived ldap_pvt_thread_cond_wait(),
  [r]mutex_lock() or rdwr_[rw]lock() calls which should make the thread
  go inactive first (after waiting for any pause to finish), so that
  (a) a pool_pause() will not block waiting for them, and
  (b) all threads in the pool can't be blocked in long waits without
      tpool knowing about it.

* back-bdb/tools.c uses pool_submit() to start long-lived threads:
  bdb_tool_trickle_task() and bdb_tool_index_task().  They don't exit
  until slapd is stopping, so they'll block slapd if a thread requests
  a pause.  Also the pool thinks it has more available threads than it
  does, so scheduled tasks might never be executed.

  Possibly that's not a problem with slaptools.  As complex as slapd is
  now with overlays and plugins and whatnot, I have no idea.

  If it's a problem, it is not enough for them to regularly reschedule
  themselves and exit: They have long-lived waits.  See previous issue.

* slapd/bconfig.c maintains its own idea of the max number of threads
  in connection_pool_max and slap_tool_thread_max, which does not match
  tpool's idea if the limit is <=0 or larger than LDAP_MAXTHR.

  Since slapd acts on its own count, I imagine that can cause breakage.
  The simplest fix might be to set the max #threads and then read it
  back, but I haven't checked.  That would need to return LDAP_MAXTHR
  instead of 0 for "as many as allowed".  Note however also the issue
  above: tpool's idea of actually available threads can be too high too.
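
  In case it helps, the clamping rule itself is trivial - a sketch of
  what "read it back" would have to report, assuming LDAP_MAXTHR is
  still 1024 (effective_maxthreads() is a made-up name):

      #define LDAP_MAXTHR 1024

      /* tpool's effective limit: <=0 or >LDAP_MAXTHR means LDAP_MAXTHR.
       * Whatever getter we add should report this value, never 0, so
       * bconfig.c stores a count it is safe to act on. */
      static int
      effective_maxthreads( int want )
      {
          return ( want <= 0 || want > LDAP_MAXTHR ) ? LDAP_MAXTHR : want;
      }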

* I said my new ltp_pause waits might be too aggressive.  Maybe I should
  clarify: I don't see any instances where they can be dropped, but it's
  possible that slapd expects pauses to be less aggressive somewhere.


About pool_submit() and the ltp_mutex unlock-relock:

  Some history and notes, just to finish up previous discussion.
  Some possible TODOs, but they are not needed since it's not broken:

  Originally, pool_submit() unlocked ltp_mutex before thread_create()
  and re-locked it afterwards for further maintenance.  That was not
  because of the semaphore code.  It looks like it could work even in
  non-threaded mode where thread_create(,,fn,arg) just called fn(arg),
  though tpool.c was always wrapped in an #ifndef <--without-threads>.

  I don't think non-threaded mode would have worked anyway since rev
  1.24 (Apr 2003), which introduced thread_keys[].  Caller and callee
  would have gotten the same thread ID, so tpool would have confused
  their contexts.  Previously the new context was pushed onto
  ltp_active_list and would have temporarily hidden the old context.

  That's also when thread_create() was moved after the ltp_mutex
  re-lock, which in non-threaded mode would mean the mutex was still
  held while pool_wrapper() ran and tried to re-lock it.  And now, my
  rev.171 has removed the unlock-relock.

  With that I can also remove the search for ctx in ltp_pending_list
  when backing out of pool_submit(); it will be at the head of that
  list.

  With tpool only supporting threaded mode, another cleanup could be
  for pool_submit() to start a thread before inserting ctx in
  ltp_pending_list instead of after.  That done, pool_submit() can
  simply fail if ltp_open_count is still 0, and we won't need the code
  for backing out of pool_submit().

  Oh, the comment /* no open threads at all?!? */:  That's not strange.
  It happens in the first pool_submit() call(s) if thread_create()
  fails.  Though in that case there was no danger of removing the wrong
  ctx from ltp_pending_list after all (fixed in rev 1.63).

-- 
Regards,
Hallvard