[OT] This feels wrong (pthreads question)

Bryan Sant bryan.sant at gmail.com
Mon Jan 29 16:09:30 MST 2007


On 1/29/07, Levi Pearson <levi at cold.org> wrote:
> A MUD, on the other hand, has long-lasting persistent connections.  If
> there are 1000 users connected and each has a thread, there will be a
> lot of threads!  The threaded *style* of programming has its
> advantages (as well as a lot of disadvantages), but the

The single advantage of threads over other parallel processing options
is a shared heap.  If you don't need to share data frequently between
different lanes of execution, then don't bother with threads.  I think
you could make a valid performance case for wanting to share the
R-tree between multiple threads for this app.

> *implementation* that maps each thread to a kernel-level process
> structure with a unique data stack does not mix well with massively
> multithreaded applications due to the heavyweight nature of those
> threads.

I'm sure this happens a lot with novice thread developers, but an
experienced thread developer knows that A) thread creation is
heavyweight, and B) threads should be pooled and reused.  Allowing a
threaded app to spawn threads at will is a recipe for disappointment
in both performance and resource usage.
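
For anyone who hasn't written one, a minimal pool with pthreads might
look something like the sketch below.  All of the names (pool_t,
pool_init, pool_submit, pool_shutdown, demo_pool) are made up for
illustration, and error handling is thin.  The workers are created
once up front and then pull tasks off a shared queue, so no thread is
spawned per client.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* All names here (pool_t, pool_init, ...) are hypothetical. */
typedef void (*task_fn)(void *);
typedef struct task { task_fn fn; void *arg; struct task *next; } task_t;

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    task_t *head, *tail;    /* FIFO of pending tasks */
    int shutting_down;
    int nthreads;
    pthread_t *threads;
} pool_t;

/* Each worker is created once and then reused for many tasks. */
static void *worker(void *p) {
    pool_t *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->head == NULL && !pool->shutting_down)
            pthread_cond_wait(&pool->nonempty, &pool->lock);
        if (pool->head == NULL) {           /* shutting down and queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        task_t *t = pool->head;
        pool->head = t->next;
        if (pool->head == NULL) pool->tail = NULL;
        pthread_mutex_unlock(&pool->lock);
        t->fn(t->arg);                      /* run the task outside the lock */
        free(t);
    }
}

/* Pays the thread-creation cost up front.
 * Returns 0 if at least one worker started, -1 otherwise. */
int pool_init(pool_t *pool, int nthreads) {
    if (nthreads < 1) return -1;
    pool->threads = malloc(sizeof(pthread_t) * (size_t)nthreads);
    if (pool->threads == NULL) return -1;
    pool->head = pool->tail = NULL;
    pool->shutting_down = 0;
    pool->nthreads = 0;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->nonempty, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker, pool) != 0) break;
        pool->nthreads++;
    }
    if (pool->nthreads == 0) {
        pthread_cond_destroy(&pool->nonempty);
        pthread_mutex_destroy(&pool->lock);
        free(pool->threads);
        return -1;
    }
    return 0;
}

/* Queues one task; returns 0 on success, -1 if allocation fails. */
int pool_submit(pool_t *pool, task_fn fn, void *arg) {
    task_t *t = malloc(sizeof *t);
    if (t == NULL) return -1;
    t->fn = fn; t->arg = arg; t->next = NULL;
    pthread_mutex_lock(&pool->lock);
    if (pool->tail) pool->tail->next = t; else pool->head = t;
    pool->tail = t;
    pthread_cond_signal(&pool->nonempty);
    pthread_mutex_unlock(&pool->lock);
    return 0;
}

/* Lets the workers drain the queue, then joins them and frees the pool. */
void pool_shutdown(pool_t *pool) {
    pthread_mutex_lock(&pool->lock);
    pool->shutting_down = 1;
    pthread_cond_broadcast(&pool->nonempty);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    free(pool->threads);
    pthread_cond_destroy(&pool->nonempty);
    pthread_mutex_destroy(&pool->lock);
}

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int handled = 0;

static void handle_client(void *arg) {      /* placeholder for real per-client work */
    (void)arg;
    pthread_mutex_lock(&count_lock);
    handled++;
    pthread_mutex_unlock(&count_lock);
}

/* Serves nclients requests with only nthreads threads; returns the number handled. */
int demo_pool(int nthreads, int nclients) {
    assert(nclients >= 0);
    pool_t pool;
    handled = 0;
    if (pool_init(&pool, nthreads) != 0) return -1;
    for (int i = 0; i < nclients; i++)
        if (pool_submit(&pool, handle_client, NULL) != 0) break;
    pool_shutdown(&pool);
    return handled;
}
```

Something like demo_pool(4, 1000) pushes 1000 requests through 4
threads, which is the whole point: the thread count is bounded by the
pool, not by the number of connected users.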

> For an example of what I'm talking about, see this page:
> http://www.sics.se/~joe/apachevsyaws.html in which apache 2 and yaws
> (a webserver written in Erlang, which implements its threads based on
> a select()-style event loop) face off against what amounts to a DOS
> attack.  You'll see that apache falls over and dies completely LONG
> before yaws does.  I would expect something like lighttpd to perform
> similarly to yaws in a comparison like this, since it also uses an
> event-driven model.
>
>                 --Levi

That link is broken, but it's a straw man anyway.  I'd like to see the
apache config.  If apache was configured to use a thread *pool*,
there'd be no increased DOS risk (and throughput would be much
better).

It's ironic that threads are essentially an event loop handled by the
kernel's scheduler rather than your userland process.  So in both
cases, you're really just comparing two event loop models.  In the
case of threads there is the potential bonus of true parallel
execution and thus improved performance and CPU utilization.  On the
downside, you have potential context switches and a comparatively
high creation time.

If you create your thread pool up front and don't use many more
threads than you have CPU cores, then you'll see a performance champ
(good overall performance, but more importantly, good /throughput/).
If you use too many threads, then you may get context-switch
thrashing and might actually see performance go down.
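
One way to pick the pool size, sketched in C.  The helper name
pool_size_for_host() is made up, and sysconf(_SC_NPROCESSORS_ONLN) is
a common extension (glibc, the BSDs, Mac OS X) rather than something
POSIX guarantees, so treat this as an assumption about your platform.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: size the pool near the core count, capped at max_threads. */
int pool_size_for_host(int max_threads) {
    assert(max_threads >= 1);
    long cores = sysconf(_SC_NPROCESSORS_ONLN);   /* -1 if the value is unavailable */
    if (cores < 1) cores = 1;                     /* fall back to a single worker */
    return cores > max_threads ? max_threads : (int)cores;
}
```

If your workers block on I/O a lot, you may want somewhat more
threads than cores; that's something to measure rather than guess.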

So remember, use a thread pool in your app like me and you'll have a
superior product and increased sex appeal.

-Bryan


