
Re: prioritize operations; stop & defer running threads?



Pierangelo Masarati wrote:
I'm facing a problem, where drawbacks of heterogeneous heavy load on a
server could be alleviated, after inspecting the content, by determining
what requests can be served at a lower priority and what must have the
highest possible priority.  Here "priority" might not be the right term;
the point is that some of the operations, although not being that
important, might (and will) take long to complete, while others would
complete immediately, and are very important.  However, many "slow"
operations are likely to saturate the available threads, so "fast"
important operations would have to wait until a thread is free.  Whether
an operation is going to be slow/not important or fast/important is
determined by an overlay that first processes the request, but anyway
well past the frontend processing.

I was thinking of introducing the possibility to stop a running thread
and re-queue its request, so that operations whose priority is
considered lower don't eat up more than some fraction of the available
threads, and the system remains responsive for the fast operations.
This could be done, e.g., by saving the BER of what has been read and
the o_time, and later restarting execution by re-decoding it, and so on;
smarter ways to resume from exactly where it was suspended could be
investigated.

Whenever the number of threads busy with low-priority operations drops
below a threshold, a pending operation could be resumed.
If the number of queued slow operations grows past some limit, further
slow operations could be rejected with LDAP_BUSY, instead of pending
forever.
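
The policy in the paragraph above can be sketched roughly as follows. This is only an illustration under stated assumptions: `slow_gate`, `slow_gate_admit`, `slow_gate_done` and the `ADMIT_*` values are hypothetical names, not slapd API, and a real overlay would have to protect the counters with a mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is slapd API. */
enum admit { ADMIT_RUN, ADMIT_DEFER, ADMIT_BUSY };

struct slow_gate {
    size_t running;      /* slow ops currently holding a thread */
    size_t max_running;  /* cap, e.g. some fraction of the thread pool */
    size_t queued;       /* slow ops suspended and waiting to resume */
    size_t max_queued;   /* past this limit, new slow ops are rejected */
};

/* Decide what to do with an operation just classified as slow.
 * Not thread-safe: the caller is assumed to hold a lock. */
enum admit slow_gate_admit(struct slow_gate *g)
{
    if (g->running < g->max_running) {
        g->running++;
        return ADMIT_RUN;
    }
    if (g->queued < g->max_queued) {
        g->queued++;
        return ADMIT_DEFER;
    }
    return ADMIT_BUSY;   /* caller would answer with LDAP_BUSY */
}

/* Called when a slow op completes. Returns 1 if one deferred op
 * should now be resumed (it is counted as running from here on). */
int slow_gate_done(struct slow_gate *g)
{
    assert(g->running > 0);
    g->running--;
    if (g->queued > 0 && g->running < g->max_running) {
        g->queued--;
        g->running++;
        return 1;
    }
    return 0;
}
```

With `max_running` set to 2 and `max_queued` set to 1, the first two slow operations run, the third is deferred, and the fourth is rejected as busy until one of the running ones completes.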

In our case, just increasing the number of threads is not an option,
because the flow of requests is such that slow ones would eventually
saturate slapd no matter how many threads are available. It may look
like the system is not correctly sized (and it might be) but this is
totally out of our control. Actually, the system is distributed; the
local portion is sized correctly, but there are remote portions that are
connected by slow, low-bandwidth connections, which bottleneck some
distributed operations.

Suspending in the overlay before any other processing begins shouldn't be too hard. Suspending a long-running search and resuming it would be tricky, but I suppose paged-results already addresses that.

Your overlay will have to maintain its own high/low priority queues, and keep track of operations that get deferred multiple times (to prevent starvation). Rather than pushing operations back onto the frontend connection's op queue you might maintain them totally privately in your overlay, with an Abandon handler to remove any that get abandoned, etc. Your overlay's response handler can then move ops off your priority queues back to the connection handler when you decide to run them.
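
A minimal sketch of such private queues, assuming each deferred operation can be identified by its message ID. All names here (`pending_op`, `prio_queues`, `pq_next`, `pq_abandon`, `MAX_DEFERS`) are hypothetical and not part of slapd; a real overlay would keep this in its private data, lock around it, and store whatever it needs to restart the operation.

```c
#include <stddef.h>

#define MAX_DEFERS 3  /* hypothetical limit before a low-priority op must run */

struct pending_op {
    int msgid;                /* assumed key for matching an Abandon */
    unsigned defers;          /* times this op was passed over */
    struct pending_op *next;
};

struct op_queue { struct pending_op *head, *tail; };
struct prio_queues { struct op_queue high, low; };

/* Append to the tail of a FIFO queue. */
void opq_push(struct op_queue *q, struct pending_op *op)
{
    op->next = NULL;
    if (q->tail) q->tail->next = op; else q->head = op;
    q->tail = op;
}

/* Detach and return the head, or NULL if the queue is empty. */
struct pending_op *opq_pop(struct op_queue *q)
{
    struct pending_op *op = q->head;
    if (op) {
        q->head = op->next;
        if (!q->head) q->tail = NULL;
        op->next = NULL;
    }
    return op;
}

/* Unlink the op with the given msgid, or return NULL if absent. */
struct pending_op *opq_remove(struct op_queue *q, int msgid)
{
    struct pending_op *prev = NULL, *op;
    for (op = q->head; op; prev = op, op = op->next) {
        if (op->msgid != msgid) continue;
        if (prev) prev->next = op->next; else q->head = op->next;
        if (q->tail == op) q->tail = prev;
        op->next = NULL;
        return op;
    }
    return NULL;
}

/* Pick the next op to run: high priority first, unless the oldest
 * low-priority op has already been passed over MAX_DEFERS times. */
struct pending_op *pq_next(struct prio_queues *pq)
{
    struct pending_op *low = pq->low.head;
    if (low && (low->defers >= MAX_DEFERS || !pq->high.head))
        return opq_pop(&pq->low);
    if (low) low->defers++;   /* passed over once more */
    return opq_pop(&pq->high);
}

/* What an Abandon handler might call: drop the op from either queue. */
struct pending_op *pq_abandon(struct prio_queues *pq, int msgid)
{
    struct pending_op *op = opq_remove(&pq->high, msgid);
    return op ? op : opq_remove(&pq->low, msgid);
}
```

Only the head of the low queue has its counter bumped; since the queue is FIFO, that is the op that has waited longest, so it is the one that needs the starvation guard.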

Or just push deferred ops to the tail of the connection's op queue; that may be simpler in general.

--
 -- Howard Chu
 Chief Architect, Symas Corp.  http://www.symas.com
 Director, Highland Sun        http://highlandsun.com/hyc
 OpenLDAP Core Team            http://www.openldap.org/project/