Cluster computing with Linux & Beowulf

Joshua Marsh joshua at themarshians.com
Mon Feb 24 09:43:29 MST 2014


On Sat, Feb 22, 2014 at 2:42 AM, Dan Egli <ddavidegli at gmail.com> wrote:
>
> Program X is intended to handle quite literally up to hundreds of millions
> of simultaneous TCP connections across a number of network interfaces (the
> exact number is unimportant).



Like others have said, this isn't necessarily a traditional Beowulf cluster
use case. You are only an order of magnitude off Google's scale at this
point. The implementation would be similar though. With this number of TCP
connections you can't expect a single cluster to solve this problem. It
usually involves distributing the load over multiple data centers. You'd
use higher level tools like DNS to route traffic based on location, load,
or availability. Your data centers would then communicate with each other
to service requests. It would normally involve some form of data
replication.
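
To make the routing idea concrete, here is a rough sketch of the kind of decision a location- and load-aware DNS layer might make. Everything in it is hypothetical (the DataCenter record, the site names, the 0.8 load cutoff); real DNS-based traffic managers have their own policies, so treat this only as an illustration of "location, load, or availability".

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str      # hypothetical site label
    region: str    # coarse client/site location
    load: float    # fraction of capacity in use, 0.0 to 1.0
    healthy: bool  # result of some availability check


def pick_datacenter(client_region, datacenters, max_load=0.8):
    """Return the site a resolver might hand back to this client."""
    # Availability and load: skip sites that are down or nearly saturated.
    candidates = [dc for dc in datacenters if dc.healthy and dc.load < max_load]
    if not candidates:
        # Everything is busy: fall back to any site that is still up.
        candidates = [dc for dc in datacenters if dc.healthy]
    if not candidates:
        raise RuntimeError("no healthy data center available")
    # Location first (same region sorts ahead), then the least loaded site.
    return min(candidates, key=lambda dc: (dc.region != client_region, dc.load))


SITES = [
    DataCenter("slc-1", "us-west", 0.92, True),
    DataCenter("nyc-1", "us-east", 0.40, True),
    DataCenter("ams-1", "eu-west", 0.10, False),
]
```

With these made-up numbers a us-west client is sent to nyc-1, because slc-1 is over the load cutoff and ams-1 is down.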



> So as program X runs, the overall load begins
> to rise as X has to do more and more work for each of the connections. That
> much is a given. Take, for instance, an MMORPG. The more people log in to
> the game and interact with the world, the more work is involved for the
> actual server. Same idea here.


I suppose it depends on what sort of interactions your customers have with
the system, but if the state changes are akin to an MMORPG, UDP will
probably be the preferred protocol.



> My thought was that if X was built to
> support clustering and it was run on a Beowulf cluster, that would slow the
> initial growth of the load. Then, supposing that the dynamic addition is
> possible, I can watch the system and if the system load average on the
> cluster gets too large (due to overwhelming the CPU, not the internet
> pipe), someone could reduce it without the end user noticing anything simply
> by adding new machines. Is this right? Or do I have a major flaw in my
> thinking?
>
>
Like others have said, the usefulness of Beowulf clusters has become
extremely limited. Where they are still used, they are priceless. In most
cases, though, using tools already available to handle these issues is
probably your best bet. AWS can do most of this for you. Google just
released its Compute Engine to the public, and it can do many of the things
you want as well.
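
For what it's worth, the "watch the load average and add machines" rule from the question boils down to something like the sketch below, which is roughly the decision hosted auto-scaling automates for you. The function name, the per-node load list, and the 0.7 target load per core are all my own assumptions, not anything AWS or Google defines.

```python
import math


def nodes_to_add(node_loads, cores_per_node, target_per_core=0.7):
    """Return how many machines to add so the average load per core
    drops to roughly target_per_core (a hypothetical tuning knob)."""
    total_load = sum(node_loads)  # sum of 1-minute load averages
    capacity_per_node = cores_per_node * target_per_core
    needed = math.ceil(total_load / capacity_per_node)
    # Never suggest removing nodes here; only growth is modelled.
    return max(0, needed - len(node_loads))
```

For three 8-core nodes reporting load averages of 7.5, 8.0 and 6.5, this suggests adding one machine. On a Unix node, os.getloadavg()[0] would supply the 1-minute value for that host.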


More information about the PLUG mailing list