Cluster computing with Linux & Beowulf
ddavidegli at gmail.com
Sat Feb 22 02:42:11 MST 2014
I have a couple of questions about the Linux cluster-computing software Beowulf (I think that's what it's called). Does anyone know if there's a maximum number of nodes a cluster can support? And can the cluster master add new nodes on the fly, so to speak?
My understanding of cluster computing is that you send a job to the master server, the master server breaks the job up among all the nodes in the cluster, and then it combines the results. I think it's not all that dissimilar to how a multi-processor machine divides work among its CPUs. Is that correct?
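That scatter/gather pattern can be sketched on a single machine with Python's multiprocessing module standing in for real cluster middleware (on an actual Beowulf cluster this role is usually played by MPI or similar, not multiprocessing, but the shape of the computation is the same; the `work` function here is just a made-up placeholder):

```python
# Sketch: the "master" splits a job into chunks, "nodes" (here, local
# worker processes) handle the chunks in parallel, and the master
# combines the partial results.
from multiprocessing import Pool

def work(chunk):
    # Placeholder for the per-node share of the job.
    return sum(x * x for x in chunk)

def run_job(data, nodes=4):
    # Master: break the job between the nodes, then combine the results.
    size = (len(data) + nodes - 1) // nodes
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(nodes) as pool:
        partials = pool.map(work, chunks)   # scatter to the workers
    return sum(partials)                    # gather and combine

if __name__ == "__main__":
    # Same answer as the serial version, just computed in parallel.
    print(run_job(list(range(1000))))
```

Note the catch this illustrates: the job has to be *splittable* into mostly independent chunks for the cluster to help, which is why a program has to be built with clustering in mind rather than just dropped onto a cluster.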
Here's a scenario I could envision. Please tell me if there's a serious
flaw in the idea. I hope it explains what I'm thinking of.
Program X is intended to handle quite literally up to hundreds of millions
of simultaneous TCP connections across a number of network interfaces (the
exact number is unimportant). So as program X runs, the overall load begins
to rise as X has to do more and more work for each of the connections. That
much is a given. Take, for instance, an MMORPG. The more people log in to
the game and interact with the world, the more work is involved for the
actual server. Same idea here. My thought was that if X was built to
support clustering and it was run on a Beowulf cluster, that would slow the
initial growth of the load. Then, supposing that dynamic addition is possible, I could watch the system, and if the load average on the cluster gets too large (from overwhelming the CPUs, not the internet pipe), someone could bring it back down, without the end users noticing anything, simply by adding new machines. Is this right? Or is there a major flaw in my thinking?
If Beowulf isn't the software I'm thinking of, what is? And what other
issues, other than perhaps memory usage, would be involved? And speaking of
memory, is there a way to distribute the memory at all? I know that 8 GB DIMMs are easily available these days, and a high-end motherboard can usually support 8 DIMMs. So that's 64 GB of RAM. But for hundreds of millions of connections, I could see exceeding memory capacity being a potential issue. I suppose I could find bigger DIMMs (I've heard rumors of 16 GB and even 32 GB DIMMs, but never seen them), but that's going to seriously jack up the server costs if I do it that way. So if there's a way to distribute the memory usage along with distributing the CPU work, that would be of great assistance.
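One common answer to distributing memory is sharding: partition the per-connection state across nodes by hashing the connection's identity, so each machine only holds its slice. A minimal sketch, with made-up node names and a made-up connection key:

```python
import hashlib

# Sketch: every node computes the same hash, so they all agree on which
# node owns a given connection's state without a central lookup table.
def owner_node(conn_id, nodes):
    """Pick which node stores the state for one connection."""
    h = int(hashlib.sha256(conn_id.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

nodes = ["node01", "node02", "node03", "node04"]
print(owner_node("198.51.100.7:443", nodes))  # always the same node
```

One caveat worth knowing before relying on this: simple modulo hashing remaps most keys whenever the node count changes, which interacts badly with the add-machines-on-the-fly idea; real systems typically use consistent hashing so that adding a node only moves a small fraction of the state.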
Thanks in advance, folks!