[OT] This feels wrong (pthreads question)

Levi Pearson levi at cold.org
Sun Jan 28 20:01:44 MST 2007


On Jan 28, 2007, at 2:22 PM, Steve wrote:

> Hi everyone,
> As a coding exercise just to "see if I could do it" I decided to make
> a chat server using UDP.

Up to this point, you're doing pretty well!  Coding things just to
see if you can do it is excellent practice and lots of fun to boot.

> A major part of my design is the ability to scale up without slowing
> down much, as such I decided to break my server design into 3 major
> component objects.
> Listener, Sender, Core.

This isn't a bad goal, but it is often wise to make a fairly
simple prototype that performs the core features as simply as  
possible and see if it works well enough for you.  If it slows down  
too much as you perform scale testing, you can see exactly why it  
happens and know precisely what needs to change.

>
> The listener is pretty simple we just create a non blocking listener
> on a port and poll it periodically.
>

Now we're really off in the weeds.  Periodically polling a non-blocking
socket is almost never what you want, at least not polling by hand.
If the listener is in its own thread, just block on your read
call.  If you need to do other things in the thread while waiting for  
input, there's always the select() or poll() system calls, which will  
block until input arrives on any of the file descriptors you tell  
them to watch or a timeout of your choice occurs.
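As a rough sketch of that blocking wait (not Steve's code; the function name recv_or_timeout and its return convention are made up for illustration), poll() can sit in front of recv() so the thread sleeps in the kernel until a datagram shows up or the timeout runs out:

```cpp
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: wait up to timeout_ms for a datagram on fd, then read
// it into buf. A negative timeout_ms blocks indefinitely.
// Returns the number of bytes read (>= 0), -2 on timeout, or -1 on error.
ssize_t recv_or_timeout(int fd, char* buf, std::size_t len, int timeout_ms) {
    assert(buf != nullptr);

    pollfd p{};
    p.fd = fd;
    p.events = POLLIN;              // only interested in "data to read"

    // The thread sleeps here; no busy loop, no hand-rolled sleep().
    int ready = poll(&p, 1, timeout_ms);
    if (ready < 0) return -1;       // poll() itself failed (e.g. EINTR)
    if (ready == 0) return -2;      // timeout expired, nothing arrived
    if (!(p.revents & POLLIN)) return -1;  // error/hangup condition on fd

    return recv(fd, buf, len, 0);
}
```

The timeout is only there for housekeeping such as expiring idle clients; between datagrams the thread uses no CPU at all.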

In fact, by building your application around an event dispatch loop
centered on a select() call, you can avoid dealing with pthreads
altogether.  As far as I'm concerned, users of C and C++ should avoid
threading whenever it is feasible, because threading introduces
nondeterminism into your code and opens the door to all sorts of
hard-to-find errors, many of which won't appear until you load the
application really heavily.
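To give an idea of what such a loop might look like, here is a minimal single-threaded sketch.  The EventLoop class, its watch()/run_once() methods, and the count_datagram handler are all hypothetical names invented for this example; only select() and recv() are real system calls.

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <cassert>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical minimal event loop: maps descriptors to "readable" handlers.
class EventLoop {
public:
    using Handler = std::function<void(int)>;

    void watch(int fd, Handler h) {
        assert(fd >= 0 && fd < FD_SETSIZE);  // select() cannot watch larger fds
        handlers_[fd] = std::move(h);
    }
    void unwatch(int fd) { handlers_.erase(fd); }

    // One pass: block in select() until a watched fd is readable or
    // timeout_ms (>= 0) elapses. Returns the number of handlers invoked,
    // 0 on timeout, or -1 on error.
    int run_once(int timeout_ms) {
        assert(timeout_ms >= 0);
        fd_set readable;
        FD_ZERO(&readable);
        int maxfd = -1;
        for (const auto& entry : handlers_) {
            FD_SET(entry.first, &readable);
            if (entry.first > maxfd) maxfd = entry.first;
        }
        timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ready = select(maxfd + 1, &readable, nullptr, nullptr, &tv);
        if (ready <= 0) return ready;

        // Collect ready fds first so handlers may call watch()/unwatch().
        std::vector<int> fired;
        for (const auto& entry : handlers_)
            if (FD_ISSET(entry.first, &readable)) fired.push_back(entry.first);

        int dispatched = 0;
        for (int fd : fired) {
            auto it = handlers_.find(fd);
            if (it == handlers_.end()) continue;  // removed by an earlier handler
            Handler h = it->second;               // copy: handler may unwatch itself
            h(fd);
            ++dispatched;
        }
        return dispatched;
    }

private:
    std::map<int, Handler> handlers_;
};

// Example handler: read one datagram and count it.
int datagrams_seen = 0;
void count_datagram(int fd) {
    char buf[2048];
    if (recv(fd, buf, sizeof buf, 0) >= 0) ++datagrams_seen;
}
```

A server would call watch() once for its UDP socket and then call run_once() in a loop forever; everything happens on one thread, so there is nothing to lock.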

> The Core server design handles processing of information coming in
> from the listener, i.e. reading the buffer, and creating new sender
> objects if the client has never been seen before, as well as cleaning
> up sender objects if the client has gone too long without a response.
>
> The Sender(s) are where I'm having difficulty here, but it seems to me
> this shouldn't be so hard.  Basically a sender is a self contained
> "machine", it needs its own thread because it runs in an infinite loop
> checking the main chat buffer in the Core, if anything has changed it
> sends those changes to the client, and then sleeps for 250ms.
>

Let me get this straight here.  You want to write a scalable  
application, and you are assigning a thread to each client?  Those
are seriously conflicting design goals.  Each new thread (assuming
you're using Linux) allocates a new process structure in the kernel  
that has a new chunk of memory for stack space allocated to it and a  
pointer to the same heap as the process it belongs to.  This is not a  
particularly cheap data structure when compared to non-threaded  
alternatives.  Start getting into the hundreds or thousands of
concurrent connections, or get hit by a DoS attack of hundreds of
thousands of incoming 'new users', and your server will fall right over.

> Now I know in a typical implementation, that all clients are contained
> in a list and when the buffer has changed then the server iterates
> through all the clients and sends out the changes.   But I don't
> really like that design, the whole point of my design is to do it
> without iterating through a list.
>

That implementation is typical because 1) it is easy, and 2) it is  
efficient.  What's not to like about it?  If you want to dress it up,
call it the Observer (a.k.a. Listener) pattern and create the
appropriate objects.
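For what it's worth, the "iterate and send" approach is only a few lines.  This is a sketch under assumptions: the Client struct and the broadcast() function are hypothetical, and it presumes a single UDP socket shared by all clients, with each client's address saved from an earlier recvfrom().

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical client record: just the address its last datagram came from.
struct Client {
    sockaddr_storage addr;
    socklen_t addr_len;
};

// Send msg to every known client through one shared UDP socket.
// Returns how many sendto() calls sent the whole message.
std::size_t broadcast(int sock, const std::vector<Client>& clients,
                      const std::string& msg) {
    std::size_t ok = 0;
    for (const Client& c : clients) {
        ssize_t n = sendto(sock, msg.data(), msg.size(), 0,
                           reinterpret_cast<const sockaddr*>(&c.addr),
                           c.addr_len);
        if (n == static_cast<ssize_t>(msg.size())) ++ok;
    }
    return ok;
}
```

One system call per client per message, no threads, and no sleeping.  A failed send just isn't counted here; a real server would probably use it as a hint to drop that client.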

> So as I was saying basically the sender class has a public method
> called "void run()", this method is the function that wakes up, checks
> the buffer, sends if needed and then goes back to sleep.
>

... [ pthread and C++ stuff snipped ] ...
>
> And it works, but it feels very wrong to me.  Having to cast the
> object to void, then recast back to its original form, seems like a
> lot of overhead as well as being dangerous.  And it has to occur every
> 250 ms, which seems like a lot of recasting to me.
>

Well, it feels pretty wrong to me, too, but just about every
combination of C/C++ and pthreads feels wrong to me.  Casting to and
from void * isn't particularly dangerous if you always know exactly
what you're casting, and it certainly doesn't add any overhead.  It
looks ugly, but considering the lousy type system C has, it's
sometimes necessary.  It's just subverting the type system, after
all, not actually *doing* anything.  What does carry a lot of overhead
is waking up every 250ms (causing a context switch and interrupting
something else) whether there's any reason to or not.  Worker threads
like that really should stay blocked waiting for an event rather than
polling.
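Here is a sketch of a worker that blocks instead of polling, assuming you stay with pthreads.  It is not a reconstruction of the snipped code: the Sender class, post(), stop(), and delivered() are hypothetical, and the "send" is replaced by appending to a vector so the example stays self-contained.  Note that the void * cast happens exactly once, when the thread starts.

```cpp
#include <pthread.h>
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical sender: one worker thread that sleeps until a message is queued.
class Sender {
public:
    Sender() {
        pthread_mutex_init(&mu_, nullptr);
        pthread_cond_init(&cv_, nullptr);
        int rc = pthread_create(&thread_, nullptr, &Sender::trampoline, this);
        assert(rc == 0);
        (void)rc;
    }
    ~Sender() {
        stop();
        pthread_cond_destroy(&cv_);
        pthread_mutex_destroy(&mu_);
    }

    // Queue a message and wake the worker.
    void post(const std::string& msg) {
        pthread_mutex_lock(&mu_);
        queue_.push_back(msg);
        pthread_cond_signal(&cv_);
        pthread_mutex_unlock(&mu_);
    }

    // Ask the worker to finish what is queued, then join it. Idempotent.
    void stop() {
        if (joined_) return;
        pthread_mutex_lock(&mu_);
        stopping_ = true;
        pthread_cond_signal(&cv_);
        pthread_mutex_unlock(&mu_);
        pthread_join(thread_, nullptr);
        joined_ = true;
    }

    // Messages handled so far (a stand-in for real network sends).
    std::vector<std::string> delivered() {
        pthread_mutex_lock(&mu_);
        std::vector<std::string> copy = delivered_;
        pthread_mutex_unlock(&mu_);
        return copy;
    }

private:
    // pthreads hands back the void * we passed in; cast once and run.
    static void* trampoline(void* arg) {
        static_cast<Sender*>(arg)->run();
        return nullptr;
    }

    void run() {
        pthread_mutex_lock(&mu_);
        for (;;) {
            // Blocked here, using no CPU, until post() or stop() signals.
            while (queue_.empty() && !stopping_)
                pthread_cond_wait(&cv_, &mu_);
            while (!queue_.empty()) {
                delivered_.push_back(queue_.front());
                queue_.pop_front();
            }
            if (stopping_) break;
        }
        pthread_mutex_unlock(&mu_);
    }

    pthread_t thread_;
    pthread_mutex_t mu_;
    pthread_cond_t cv_;
    std::deque<std::string> queue_;
    std::vector<std::string> delivered_;
    bool stopping_ = false;
    bool joined_ = false;
};
```

The worker only wakes when there is something to do, so an idle client costs no context switches.  It still costs a thread and a stack per client, though, which is the bigger problem.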

>
> Thoughts?  Ideas? Concerns?
> Thanks in advance!

Well, it's cool that you're trying to build a scalable system as a  
learning project, and I think this is a reasonable sort of project to  
start with.  I think you're getting a bit ahead of yourself, though,  
and that you ought to take a couple of steps back and start with  
something simpler.  If you *really* want to use threads, I would  
suggest reading about them in a bit more depth before trying another  
design, because your current one is fundamentally broken.  If you  
just want to build a scalable system, I suggest avoiding threads  
altogether and building on top of a select()-based event loop.

			--Levi


