[OT] This feels wrong (pthreads question)
levi at cold.org
Sun Jan 28 20:01:44 MST 2007
On Jan 28, 2007, at 2:22 PM, Steve wrote:
> Hi everyone,
> As a coding exercise just to "see if I could do it" I decided to make
> a chat server using UDP.
Up to this point, you're doing pretty well! Coding things just to
see if you can do it is excellent practice and lots of fun, to boot.
> A major part of my design is the ability to scale up without slowing
> down much, as such I decided to break my server design into 3 major
> component objects.
> Listener, Sender, Core.
This isn't too bad of a goal, but it is often wise to make a fairly
simple prototype that performs the core features as simply as
possible and see if it works well enough for you. If it slows down
too much as you perform scale testing, you can see exactly why it
happens and know precisely what needs to change.
> The listener is pretty simple we just create a non blocking listener
> on a port and poll it periodically.
Now we're really off in the weeds. Periodic polling of a non-
blocking port is almost never what you want; at least not polling by
hand. If the listener is in its own thread, just block on your read
call. If you need to do other things in the thread while waiting for
input, there's always the select() or poll() system calls, which will
block until input arrives on any of the file descriptors you tell
them to watch or a timeout of your choice occurs.
In fact, by building your application out of an event dispatch loop
centered on a select() call, you can avoid dealing with pthreads
altogether. As far as I'm concerned, users of C and C++ should avoid
threading as often as it is feasible to, because threading introduces
nondeterminism to your code and opens the door to all sorts of hard-
to-find errors, many of which won't appear until you really heavily
load the application.
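To make the event-loop suggestion concrete, here is a rough single-threaded sketch. The EventLoop class, its watch()/unwatch()/run_once() methods, and the handler signature are all hypothetical names of mine, and it assumes a C++11 compiler; it is one way to arrange things around select(), not the only one.

```cpp
#include <sys/select.h>
#include <sys/socket.h>   // recvfrom() and friends, for use inside handlers
#include <sys/time.h>
#include <sys/types.h>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical dispatcher: one select() call watches every registered
// descriptor and runs that descriptor's handler when it becomes readable.
class EventLoop {
public:
    using Handler = std::function<void(int fd)>;

    void watch(int fd, Handler h) { handlers_[fd] = std::move(h); }
    void unwatch(int fd) { handlers_.erase(fd); }

    // One pass of the loop. Returns the number of handlers run,
    // 0 on timeout, or -1 if select() failed.
    int run_once(int timeout_ms)
    {
        fd_set readfds;
        FD_ZERO(&readfds);
        int maxfd = -1;
        for (const auto &entry : handlers_) {
            FD_SET(entry.first, &readfds);
            if (entry.first > maxfd)
                maxfd = entry.first;
        }

        struct timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ready = select(maxfd + 1, &readfds, nullptr, nullptr, &tv);
        if (ready <= 0)
            return ready;

        // Collect ready descriptors first so a handler may safely call
        // watch() or unwatch() while we dispatch.
        std::vector<int> ready_fds;
        for (const auto &entry : handlers_)
            if (FD_ISSET(entry.first, &readfds))
                ready_fds.push_back(entry.first);

        int dispatched = 0;
        for (int fd : ready_fds) {
            auto it = handlers_.find(fd);
            if (it == handlers_.end())
                continue;               // removed by an earlier handler
            Handler h = it->second;     // copy: handler may unwatch itself
            h(fd);
            ++dispatched;
        }
        return dispatched;
    }

private:
    std::map<int, Handler> handlers_;
};
```

A chat server built this way would watch() its UDP socket once and call run_once() in a loop; every client is then handled by the same thread, so there is no locking to get wrong.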
> The Core server design handles processing of information coming in
> from the listener, i.e. reading the buffer, and creating new sender
> objects if the client has never been seen before, as well as cleaning
> up sender objects if the client has gone too long without a response.
> The Sender(s) are where I'm having difficulty here, but it seems to me
> this shouldn't be so hard. Basically a sender is a self contained
> "machine", it needs its own thread because it runs in an infinite loop
> checking the main chat buffer in the Core, if anything has changed it
> sends those changes to the client, and then sleeps for 250ms.
Let me get this straight here. You want to write a scalable
application, and you are assigning a thread to each client? Those
are seriously conflicting design features. Each new thread (assuming
you're using Linux) allocates a new task structure in the kernel,
gets its own chunk of memory for stack space, and shares the address
space (heap included) of the process it belongs to. This is not a
particularly cheap data structure when compared to non-threaded
alternatives. Start getting into the hundreds or thousands of
concurrent connections, or get a DoS attack of hundreds of thousands
of incoming 'new users', and your server will fall right over.
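If you want to put a number on the per-thread stack cost, a small sketch like this asks pthreads what stack size a default-attribute thread would get. The two function names are hypothetical; pthread_attr_init(), pthread_attr_getstacksize() and pthread_attr_destroy() are the standard calls. The answer depends on the system, and on Linux with glibc it is often several megabytes of reserved address space per thread, which is not the same as memory actually touched.

```cpp
#include <pthread.h>
#include <cstddef>

// Hypothetical helper: stack size (in bytes) that a thread created with
// default attributes would be given. Returns 0 if the query fails.
std::size_t default_thread_stack_size()
{
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return 0;

    std::size_t size = 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;

    pthread_attr_destroy(&attr);
    return size;
}

// Address space reserved for stacks alone if every client gets a thread.
// This ignores the kernel-side structure and any guard pages.
std::size_t stack_bytes_for(std::size_t clients)
{
    return clients * default_thread_stack_size();
}
```

Multiplying that out for a few thousand clients gives a feel for why thread-per-client stops scaling well before the network does.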
> Now I know in a typical implementation, that all clients are contained
> in a list and when the buffer has changed then the server iterates
> through all the clients and sends out the changes. But I don't
> really like that design, the whole point of my design is to do it
> without iterating through a list.
That implementation is typical because 1) it is easy, and 2) it is
efficient. What's not to like about it? If you want to dress it up,
call it the Observer (or Listener) Pattern and create the appropriate objects.
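The "iterate through the list" approach really is only a few lines. Here is a sketch assuming IPv4 clients and one shared UDP socket; ClientList, add() and broadcast() are names I made up, and a real server would also want timeouts and removal.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <cstddef>
#include <vector>

// Hypothetical client registry: the Core keeps one address per known client.
struct ClientList {
    std::vector<sockaddr_in> clients;

    void add(const sockaddr_in &addr) { clients.push_back(addr); }

    // Send one message to every known client over a single UDP socket.
    // Returns how many sendto() calls sent the whole message.
    std::size_t broadcast(int sock, const char *msg, std::size_t len) const
    {
        std::size_t sent = 0;
        for (const sockaddr_in &c : clients) {
            ssize_t n = sendto(sock, msg, len, 0,
                               reinterpret_cast<const sockaddr *>(&c),
                               sizeof c);
            if (n == static_cast<ssize_t>(len))
                ++sent;
        }
        return sent;
    }
};
```

The loop costs one system call per client and no threads, no sleeping, and no shared buffer to lock.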
> So as I was saying basically the sender class has a public method
> called "void run()", this method is the function that wakes up, checks
> the buffer, sends if needed and then goes back to sleep.
... [ pthread and C++ stuff snipped ] ...
> And it works, but it feels very wrong to me. Having to cast the
> object to void, then recast back to its original form, seems like a
> lot of overhead as well as being dangerous. And it has to occur every
> 250 ms, which seems like a lot of recasting to me.
Well, it feels pretty wrong to me, too, but just about every
combination of C/C++ and pthreads feels wrong to me. Casting to and
from void * isn't particularly dangerous if you always know exactly
what you're casting, and it certainly doesn't add any overhead. It
looks ugly, but considering the lousy type system C has, it's
sometimes necessary. It's just subverting the type system, after
all, not actually *doing* anything. What's got a lot of overhead is
waking up every 250ms (causing context switching and interrupting
something else) whether there's any reason to or not. Worker threads
like that really should stay blocked waiting for an event rather than polling.
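Since the original pthread code was snipped, the following is only my guess at a cleaner shape, not a fix of what was posted. It sketches a hypothetical Sender whose thread sleeps in pthread_cond_wait() until something is posted, and where the void * cast happens exactly once, at thread start. The class and method names are invented, and the delivered vector is a stand-in for the real sendto() call.

```cpp
#include <pthread.h>
#include <deque>
#include <string>
#include <vector>

// Hypothetical Sender: blocks until a message arrives instead of
// waking every 250 ms to check a shared buffer.
class Sender {
public:
    Sender() : stop_(false)
    {
        pthread_mutex_init(&mu_, nullptr);
        pthread_cond_init(&cv_, nullptr);
    }
    ~Sender()
    {
        pthread_cond_destroy(&cv_);
        pthread_mutex_destroy(&mu_);
    }

    bool start()
    {
        return pthread_create(&tid_, nullptr, &Sender::entry, this) == 0;
    }

    // Called by the Core when there is something new to send.
    void post(const std::string &msg)
    {
        pthread_mutex_lock(&mu_);
        queue_.push_back(msg);
        pthread_cond_signal(&cv_);      // wake the sleeping thread
        pthread_mutex_unlock(&mu_);
    }

    // Only valid after start() returned true.
    void stop_and_join()
    {
        pthread_mutex_lock(&mu_);
        stop_ = true;
        pthread_cond_signal(&cv_);
        pthread_mutex_unlock(&mu_);
        pthread_join(tid_, nullptr);
    }

    std::vector<std::string> delivered; // stand-in for sendto(); read after join

private:
    // pthread_create() wants void *(*)(void *), so the object pointer
    // passes through void * once here and never again.
    static void *entry(void *self)
    {
        static_cast<Sender *>(self)->run();
        return nullptr;
    }

    void run()
    {
        pthread_mutex_lock(&mu_);
        for (;;) {
            // Sleeps in the kernel; the loop guards against spurious wakeups.
            while (queue_.empty() && !stop_)
                pthread_cond_wait(&cv_, &mu_);
            if (queue_.empty())
                break;                  // stop requested and queue drained
            std::string msg = queue_.front();
            queue_.pop_front();
            pthread_mutex_unlock(&mu_);
            delivered.push_back(msg);   // real code would sendto() here
            pthread_mutex_lock(&mu_);
        }
        pthread_mutex_unlock(&mu_);
    }

    pthread_mutex_t mu_;
    pthread_cond_t cv_;
    pthread_t tid_;
    std::deque<std::string> queue_;
    bool stop_;
};
```

The cast itself compiles to nothing; what this version saves is the four wakeups per second per client when there is nothing to send.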
> Thoughts? Ideas? Concerns?
> Thanks in advance!
Well, it's cool that you're trying to build a scalable system as a
learning project, and I think this is a reasonable sort of project to
start with. I think you're getting a bit ahead of yourself, though,
and that you ought to take a couple of steps back and start with
something simpler. If you *really* want to use threads, I would
suggest reading about them in a bit more depth before trying another
design, because your current one is fundamentally broken. If you
just want to build a scalable system, I suggest avoiding threads
altogether and building on top of a select()-based event loop.