[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: SLAP_LIGHTWEIGHT_LISTENER, using lazy_sem is a bad design

I think the connection manager needs more work.
What I would prefer is a very lightweight listener, no I/O.
There are a number of ways to achieve this, here's one:

For each selected fd, the listener would clear the fd
and queue a worker, passing it a set of flags indicating
what events to process (read, write, exception).  The worker
then performs the necessary reads or writes, notifies
listener when done (so it can again listen), and does
whatever work.

For a connection-accept worker, the listener can be
notified immediately after the accept.

For a connection worker, it would first do writes and
if it reaches a would-block, notify the listener of
that.  Then do a read of, at most, one operation,
then notify the listener that that the fd is available
for read, then do the operation.

The current code is close to this, but is not quite


At 09:48 PM 10/13/2005, Howard Chu wrote:
>Kurt D. Zeilenga wrote:
>>As I noted in my prior comments, I think the writes need
>>to be moved out of the listener thread.  The easiest
>>way to do that would be to the worker thread just
>>block.  This, of course, can lead to all workers getting
>>blocked, preventing work on other operations.
>Not necessarily, or not any worse than now. We can still mark the connection as "writer active" and test for that in the dispatcher, the way we test for c_writewaiters now, and add the op to the pending queue. That is, the writer increments c_writewaiters before it actually writes, and decrements it when it completes the write. If c_writewaiters is still non-zero at that time, it can do a cond_signal() to get the next waiting writer started. There is no need to involve yet another thread to manage this.
>The thread pool init function should take another argument: a callback to invoke when the thread pool frees up again. In connection.c, this callback should just set a flag "thread_pool_available" and then do a WAKE_LISTENER(). When the listener thread sees that, it can re-add all the readfd's to the event set and resume normal operation. You can still implement a counter/hysteresis like in the lazy-semaphore approach, just by waiting to invoke the callback.
>This is another case where epoll() is vastly inferior to select(), as manipulating entire descriptor sets is very expensive. Perhaps we should just use two different event sets, one for normal use and one for throttled use. Although, if we eliminate the writefd's, then we can just sit in a blocking read on wake_sds[0] when we're throttling.
>>At 08:50 PM 10/13/2005, Howard Chu wrote:
>>>This solution will deadlock if all worker threads are stuck in a write wait. Since the semaphore completely blocks the listener thread, there will be no way to wake up the waiting writers and free up more threads. Also, blocking the listener thread like this prevents the idletimeout checker from working. I.e., you have managed to disable two key mechanisms for returning server resources to the pool, precisely when they are needed the most.
>>>The listener thread must never block, period.
>>>It would be better to simply have ldap_pvt_thread_pool_submit return a result code (e.g. LDAP_BUSY if the submitted op will be queued because there are no available workers, LDAP_SUCCESS otherwise) that is passed back to the listener thread. When the listener thread gets this result it should drop all read descriptors from the event set, but keep monitoring the wake_sds and the write events.
>>>Alternatively we should stop monitoring write events and just let the writers unblock themselves. Is there a particular reason why we monitor write events? I don't see any benefit. Eliminating one set of event sources would reduce our kernel load by half.
>>>Aside from this issue there is definitely a bug in the current implementation; I see the same event being submitted multiple times in rapid succession. The CPU usage goes to 100% and there does not appear to be any end condition that disables the event. This occurs most often in test033, but should occur in any test that uses syncrepl (or listener-managed client tasks like syncrepl). After the syncrepl task has sent a search request to the provider and the first reply arrives, marking the socket readable. It appears this readable state is not getting reset.
> -- Howard Chu
> Chief Architect, Symas Corp.  http://www.symas.com
> Director, Highland Sun        http://highlandsun.com/hyc
> OpenLDAP Core Team            http://www.openldap.org/project/