
Re: -DNO_THREADS, select(), signals



At 05:36 PM 7/21/99 -0600, Richard Krukar wrote:
>Is anyone familiar with how apache manages message
>queues?

I believe Apache uses a classic multiple process accept()
approach.  However, I believe they are considering
moving to a thread-based architecture.

The "classic" accept() approach is to fork() N processes
after binding the listener socket.  Then each daemon
independently select()s upon the listener.  When a new
connection is made, all processes currently select()ing
trigger and they race to accept() (can you say crunch).
However, the operating system *should* ensure that only
one of the processes can actually accept() the connection.
This process adds the newly accepted socket to its
fd_sets.  The other processes get an error and return to
their select.  When input is received on the socket,
only the process which accepted the socket awakes to
handle the request.  At this point, you basically have
N independent servers.  Balancing the number of connections
handled by each can be tricky... balancing actual load
is extremely hard.
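
The pre-fork pattern described above can be sketched roughly as
follows.  This is a minimal illustration under stated assumptions,
not Apache's actual code: make_listener(), accept_one(), prefork()
and the serve callback are hypothetical names, the listener is
assumed to be non-blocking so that a process losing the race gets
EAGAIN/EWOULDBLOCK instead of blocking in accept(), and each child
serves one connection at a time rather than adding it to its fd_sets.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Bind a non-blocking TCP listener on the loopback address.
 * Port 0 lets the kernel choose a free port. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block in select() on the shared listener, then try to accept().
 * Returns the client descriptor, or -1 with errno set (EAGAIN or
 * EWOULDBLOCK when another process accepted the connection first). */
int accept_one(int listener)
{
    fd_set readfds;
    assert(listener >= 0 && listener < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(listener, &readfds);
    if (select(listener + 1, &readfds, NULL, NULL, NULL) < 0)
        return -1;
    return accept(listener, NULL, NULL);
}

/* Fork nchildren processes that all select() on the same listener.
 * Children never return; the parent gets 0, or -1 if fork() fails.
 * Reaping the children is left out of this sketch. */
int prefork(int listener, int nchildren, void (*serve)(int client))
{
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid > 0)
            continue;                 /* parent: fork the next child */
        for (;;) {                    /* child: race for connections */
            int client = accept_one(listener);
            if (client < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK ||
                    errno == EINTR)
                    continue;         /* lost the race: select() again */
                _exit(1);
            }
            serve(client);
            close(client);
        }
    }
    return 0;
}
```

Nothing here balances connections between the children; whichever
process wins the race gets the connection.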

This might work okay with single request HTTP, but IMO
wouldn't work very well with multiple request protocols such
as LDAP.  The long term nature of some LDAP sessions
makes it hard to balance sessions between processes.
Also, LDAP allows one session to have multiple outstanding
requests.  However, without some means for sharing
request queues, only the accepting process can execute
requests for a given session.  Without some means for
sharing the client descriptor, only the accepting process
can write responses to the client.

Though low level mechanisms do exist to share such
information, they are very problematic and much less
portable than threads.

For the most part, threads work just fine if you don't
attempt to mix in oddities whose behavior is not
well defined (such as select()) or not well implemented
(such as signal()).  In a pure threaded architecture,
you can completely avoid using select() and signal().

However, the main problem with a pure threaded architecture
(as I was recently reminded) is scaling.  Many thread
implementations do not handle 1000s of threads well.

On thread pools...  I can actually implement (given time)
a thread pool API which supports -DNO_THREADS.  (Each
pool request is just immediately executed).
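
As a rough sketch of what such an API could look like (not an
existing implementation): pool_t, pool_init(), pool_submit(),
pool_wait(), mark_done() and POOL_MAX are hypothetical names.  With
-DNO_THREADS each request runs immediately in the caller.  The
threaded branch is deliberately naive, starting one thread per
request and joining them in pool_wait(); a real pool would keep a
fixed set of workers and a request queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#ifndef NO_THREADS
#include <pthread.h>
#endif

#define POOL_MAX 64              /* hypothetical cap for this sketch */

typedef void *(*pool_fn)(void *arg);

typedef struct {
#ifndef NO_THREADS
    pthread_t tids[POOL_MAX];    /* threads started but not yet joined */
#endif
    int pending;                 /* number of entries used in tids[] */
} pool_t;

void pool_init(pool_t *p)
{
    assert(p != NULL);
    p->pending = 0;
}

/* Submit one request.  Returns 0 on success, -1 on failure. */
int pool_submit(pool_t *p, pool_fn fn, void *arg)
{
    assert(p != NULL && fn != NULL);
#ifdef NO_THREADS
    fn(arg);                     /* no threads: execute immediately */
    return 0;
#else
    if (p->pending >= POOL_MAX)
        return -1;
    if (pthread_create(&p->tids[p->pending], NULL, fn, arg) != 0)
        return -1;
    p->pending++;
    return 0;
#endif
}

/* Wait for all submitted requests (a no-op with -DNO_THREADS,
 * since every request has already finished). */
void pool_wait(pool_t *p)
{
    assert(p != NULL);
#ifndef NO_THREADS
    while (p->pending > 0)
        pthread_join(p->tids[--p->pending], NULL);
#endif
}

/* Example request: set the int that arg points to. */
void *mark_done(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}
```

Callers use the same three functions either way; only the build
flag decides whether a request runs in the caller or in a thread.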

My primary reason for moving to a pure threaded environment
is to remove dependencies upon select() and signal(), which
have been causing lots of headaches.

>	Anyway, I don't see why the -DNO_THREADS option
>can't be retained within a threaded I/O management system.

It's not possible to have multiple threads managing concurrent
I/O and support -DNO_THREADS.   To support concurrent I/O
with -DNO_THREADS you must place all I/O management into a
single thread and use a conventional multiplexing mechanism
(i.e., select()) within that thread.  This is what we have today.
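
For illustration, a single-threaded multiplexing loop of this kind
might look like the sketch below.  It is not the actual connection
management code: wait_readable() and io_loop() are hypothetical
names, and the loop only counts bytes where a real server would
parse and dispatch requests.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the first descriptor in fds[] that becomes readable within
 * timeout_ms milliseconds, or -1 on timeout or error. */
int wait_readable(const int *fds, int nfds, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;

    assert(fds != NULL && nfds > 0 && timeout_ms >= 0);
    FD_ZERO(&readfds);
    for (int i = 0; i < nfds; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) <= 0)
        return -1;
    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readfds))
            return fds[i];
    return -1;
}

/* One thread services every descriptor in turn until nothing is
 * readable for timeout_ms.  Returns the total number of bytes read. */
long io_loop(const int *fds, int nfds, int timeout_ms)
{
    char buf[512];
    long total = 0;
    int fd;

    while ((fd = wait_readable(fds, nfds, timeout_ms)) >= 0) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;               /* EOF or error ends the sketch */
        total += n;
    }
    return total;
}
```

Every descriptor goes through this one loop, so a slow request
delays all the others.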

The problem with this architecture is that, with threads
enabled, the conventional mechanism might not be particularly
thread friendly.  In particular, the select() might block all
threads.   Then you need to coordinate fd_sets... which
means you need a reliable way of interrupting the select().
We use signal() today.  But signal() is terribly brain
damaged on many (threading) environments. 
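
To make the interruption step concrete, here is a minimal
single-threaded sketch of a signal breaking a blocked select() so
the caller can rebuild its fd_sets.  It is an illustration only,
not the existing code: install_wakeup_handler(), wait_or_wakeup()
and the wakeups counter are hypothetical names, and it uses
sigaction() rather than signal() because signal()'s restart
behavior differs between systems.  It does not address which thread
receives the signal in a threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Number of wakeup signals seen so far. */
volatile sig_atomic_t wakeups = 0;

static void on_wakeup(int signo)
{
    (void)signo;
    wakeups++;
}

/* Install a handler for signo without SA_RESTART, so a blocked
 * select() fails with EINTR when the signal arrives. */
int install_wakeup_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Block until fd is readable or a handled signal arrives.
 * Returns 1 if fd is readable, 0 if interrupted (the caller should
 * rebuild its fd_sets and call again), -1 on any other error.
 * A signal delivered just before select() is entered does not
 * interrupt it; this sketch does nothing about that window. */
int wait_or_wakeup(int fd)
{
    fd_set readfds;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    if (select(fd + 1, &readfds, NULL, NULL, NULL) < 0)
        return errno == EINTR ? 0 : -1;
    return 1;
}
```

Another thread or process would send the signal with kill() after
changing the set of descriptors to watch.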

Regardless of this discussion, I don't actually expect to
rework our current connection management code anytime
soon.  However, I believe a rework will eventually be
needed.  Having one thread managing I/O will be a
bottleneck long term.  

Kurt