[OT] This feels wrong (pthreads question)

Steve smorrey at gmail.com
Mon Jan 29 01:36:41 MST 2007


Ok, I'm sorry this is getting really complicated.
Let me try to elaborate a bit on the overall design so I can clarify
why I'm making the design decisions I am.

It's a chat server, yes, but it's not just a chat server.
In fact it's my second attempt at a MUD from scratch.  I didn't explain
that sooner since I felt it was irrelevant to the question I was
asking about pthreads.

Anyway, most MUDs, or at least my last one, tend to look at the world
much like IRC, where you move from room to room and everything in a
room is "in scope" of everything else in that room.  Therefore an
update to any object in the room is propagated to all clients actually
present in the room.  But this does not take physical distance into
account at all.
It is, however, a much simpler design, since a client enters a room and
subscribes to the activity messages for the room, then leaves and
unsubscribes.
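
Roughly, that room model looks like this (just a sketch; none of these
names are from my actual code):

// Sketch of the classic room-subscription model (illustrative only).
#include <set>
#include <string>
#include <iostream>

struct Client {
    std::string name;
    void send(const std::string& msg) {
        std::cout << "[to " << name << "] " << msg << std::endl;
    }
};

struct Room {
    std::set<Client*> subscribers;

    void enter(Client* c) { subscribers.insert(c); }  // subscribe on enter
    void leave(Client* c) { subscribers.erase(c); }   // unsubscribe on leave

    // Any update in the room is propagated to every client present in it.
    void broadcast(const std::string& msg) {
        for (std::set<Client*>::iterator it = subscribers.begin();
             it != subscribers.end(); ++it)
            (*it)->send(msg);
    }
};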

In the MUD I'm writing, there is only one "room": the entire game world.
I would like for a player standing in a doorway to be able to look in
and see the activity, or to walk by a wall where a battle is happening
on the other side, and hear that activity.

For that to happen I had to rethink the MUD design.
Instead of using a room subscription system, or rooms at all, I'm
using a type of scenegraph.  The MUD is actually one flat world sorted
by x,y,z.
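
To give a feel for that, a flat world keyed on (x, y, z) with an "n
meters around the player" query could look something like this (again
just a sketch, not my real code):

// Sketch: one flat world, objects kept sorted by (x, y, z).
#include <climits>
#include <map>
#include <string>
#include <vector>

struct GameObject {
    std::string description;
};

struct Pos {
    int x, y, z;
    bool operator<(const Pos& o) const {            // sort by x, then y, then z
        if (x != o.x) return x < o.x;
        if (y != o.y) return y < o.y;
        return z < o.z;
    }
};

typedef std::map<Pos, GameObject*> World;

// Everything within n meters (a cube, to keep the sketch simple) of the player.
std::vector<GameObject*> queryNearby(const World& world, Pos player, int n)
{
    std::vector<GameObject*> hits;
    Pos from = { player.x - n, INT_MIN, INT_MIN };
    for (World::const_iterator it = world.lower_bound(from);
         it != world.end() && it->first.x <= player.x + n; ++it) {
        if (it->first.y >= player.y - n && it->first.y <= player.y + n &&
            it->first.z >= player.z - n && it->first.z <= player.z + n)
            hits.push_back(it->second);
    }
    return hits;
}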

Now, under my new design, the "Sender" sleeps for a minimum of 250ms
and a maximum of 1s, then wakes and queries the scenegraph for n
meters in all directions from the player.
The result is then compared to the scenegraph copy held by the Sender;
any differences are propagated to the client, and the Sender then
sleeps again.

If no changes have occurred to the scenegraph since the last wake
cycle, the sleep time is increased so we don't waste CPU time on
someone who is just sitting there doing nothing.
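
Put together, the Sender's loop is roughly this (a sketch; queryNearby,
sendToClient and clientConnected here just stand in for the real
pieces):

// Sketch of one client's Sender: wake, query, diff against the cached
// copy, push only the differences, then adjust the sleep time.
#include <algorithm>
#include <map>
#include <string>
#include <time.h>

typedef std::map<std::string, std::string> Snapshot;   // object id -> state

// Stand-ins for the real pieces (assumed to exist elsewhere).
Snapshot queryNearby();                                 // scenegraph query around the player
void sendToClient(const std::string& id, const std::string& state);
bool clientConnected();

void senderLoop()
{
    unsigned sleepMs = 250;                // fast rate while things are happening
    const unsigned maxSleepMs = 1000;      // cap for an idle player

    Snapshot lastSeen;
    while (clientConnected()) {
        struct timespec ts;
        ts.tv_sec  = sleepMs / 1000;
        ts.tv_nsec = (long)(sleepMs % 1000) * 1000000L;
        nanosleep(&ts, NULL);

        Snapshot current = queryNearby();
        bool changed = false;
        for (Snapshot::const_iterator it = current.begin(); it != current.end(); ++it) {
            Snapshot::const_iterator old = lastSeen.find(it->first);
            if (old == lastSeen.end() || old->second != it->second) {
                sendToClient(it->first, it->second);   // propagate only the difference
                changed = true;
            }
        }
        lastSeen = current;

        // No changes since the last wake cycle: back off toward the 1s cap
        // so an idle player doesn't burn CPU.
        sleepMs = changed ? 250u : std::min(sleepMs + 250u, maxSleepMs);
    }
}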

Now, as you can guess, most of what I'm describing is a lot more
compute intensive than just a chat server, but it still boils down to
a chat server where, instead of sorting discussions and actions by
room, we have "discussions and actions within earshot".

Sorry for the confusion, and again thanks for the help.

By the way, how much overhead does threading actually introduce,
especially in a situation where there may be hundreds or thousands of
threads in an application?
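
(I'm mostly wondering about memory here; from what I can tell the big
per-thread cost is the stack each thread reserves, and something like
this shows the default on a given system, just as an illustration:)

// Illustration: the default per-thread stack size, which dominates the
// memory footprint of running hundreds or thousands of kernel threads.
#include <pthread.h>
#include <cstdio>

int main()
{
    pthread_attr_t attr;
    size_t stackSize = 0;

    pthread_attr_init(&attr);
    pthread_attr_getstacksize(&attr, &stackSize);   // default stack reservation
    std::printf("default stack per thread: %lu bytes\n",
                (unsigned long)stackSize);

    // 1000 threads at the default size reserve roughly this much address
    // space for stacks alone (resident use is usually much less).
    std::printf("1000 threads -> ~%lu MB reserved for stacks\n",
                (unsigned long)(stackSize * 1000 / (1024 * 1024)));

    pthread_attr_destroy(&attr);
    return 0;
}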

On 1/29/07, Levi Pearson <levi at cold.org> wrote:
> On Jan 28, 2007, at 9:58 PM, Steve wrote:
>
> > Well I appreciate the feedback on the overall design model, and I
> > agree, polling a listener instead of setting the listener to blocking
> > and running it in its own thread is probably a bad idea; it's on my
> > list to fix here soon.
> >
> > The rest of the arguments against threading it this way don't
> > particularly apply as I see it, since the server app is supposed to be
> > running on a multicore or SMP setup.
> > I'd really like to maximize CPU utilization, as well as minimize
> > the amount of bandwidth.
>
> The memory overhead of kernel threads applies no matter how many CPUs
> you have in your system.  Furthermore, internet chat is not a
> fundamentally compute-intensive problem.  There's nothing to compute
> at all until data arrives, and the only processing involved is to do
> some minimal interpretation of the incoming data and send it right
> back out again.  Worrying about multiple cores is serious premature
> optimization, and premature optimization almost never optimizes the
> real bottlenecks.
>
> If you continue in your quest to create a number of kernel threads
> that scales linearly with the number of active chat sessions, your
> architecture WILL NOT SCALE past a certain level, and that level will
> be far lower than if you used a single thread on a single CPU.
>
> If you absolutely MUST utilize multiple processors (and I sincerely
> doubt it would ever become necessary unless you completely blundered
> the design or somehow convinced a remarkable number of people to chat
> on your server) it would be more appropriate to have approximately
> the same number of kernel threads as CPU cores.  You could easily
> extend an event-driven select()-style server to this kind of
> architecture by having the events dispatched to worker threads, each
> of which would handle multiple communication channels.  You might
> also consider an IRC-like model, where you can scale through the use
> of multiple distinct servers.
>
> >
> > The traditional method of the server iterating through a large list of
> > clients just doesn't seem to me to be particularly efficient.
> > Especially if you have different clients interested in different data.
>
> Perhaps you're thinking of something different than what I thought
> you were.  I thought you meant that you would create lists of clients
> who were interested in certain kinds of messages, so that when one of
> those came in, that list would have the event dispatched to them.
>
> If you instead meant, as you seem to now, that there is a single
> global list of clients and you must iterate through them all to
> discover where to send messages... well, yes, that's not very
> optimal.  Don't do that, but don't do what you're planning on, either.
>
> Don't poll.  Register interested clients with the message dispatcher,
> so a message can be dispatched precisely where it is supposed to go as
> soon as it comes in.  You should do it this way whether you use a lot
> of threads or not.
>
> > Or, as in this case, you pay once for a server upgrade but continuously
> > for bandwidth.  The 250ms sleep time is meant for times when lots of
> > activity needs to be reported to the client, the default could be much
> > longer.
>
> You're talking about these time intervals and hardware and such like
> you have an idea how your proposed system will actually perform vs.
> other architectures as it scales.  Unless you have done some sort of
> simulation, you really don't, and you're prematurely complicating
> things.
>
> >
> > But I want to thank everyone for the feedback on the pthreads question
> > which was the crux of my issue.  I believe I've found a more compact
> > way of doing what I was doing now.
> >
> > Basically we have the runObject function, but instead of sleeping and
> > calling runObject(data) again, we call a member function run() of the
> > object handed to runObject, and the run() function is self-scheduling.
> >  So no more thunking after the first time :D
> >
> > Here's a question though, could runObject be handled as a template
> > instead?  Seems to me a template would allow it to run pretty much
> > anything handed to it, without the need to change the recast for each
> > object type.
> >
> > Thanks again for the advice!
> >
> > p.s. Is there a better threading library than pthreads?
>
> Pthreads is actually very good, for what it is, which is a fairly
> low-level standardized interface to basic multithreading primitives.
> It's just that multithreaded programming with only the basic
> multithreading primitives is very hard, and I wouldn't recommend that
> someone who doesn't have a lot of experience with it try to use it
> to build software that is supposed to be robust and scalable.
>
> There may be higher-level threading libraries and common practices
> for multithreaded C++ that alleviate some of the difficulties, but a
> C++ expert would have to fill you in on those.  For my part, I
> recommend you stay away from threads or make do with a very small
> number of threads with very tightly controlled points of
> communication.  Such programs are usually faster, more reliable, and
> scale better anyway.
>
>                 --Levi
>
> /*
> PLUG: http://plug.org, #utah on irc.freenode.net
> Unsubscribe: http://plug.org/mailman/options/plug
> Don't fear the penguin.
> */
>


