plug at ryansimpkins.com
Thu Oct 2 16:42:28 MDT 2008
On Tue, September 30, 2008 16:43, Andrew Jorgensen wrote:
> Okay folks, I'm going out on a limb and admitting to some ignorance
> here. Suppose I have a high load average on a server, let's say 20,
> how do I tell what's really going on? I understand that load means
> that there are processes waiting for some resource but how do I see
> what resources they are waiting for? We don't want to go buy more RAM
> and then find out that we had plenty of RAM, for instance.
First, understand how loadavg is calculated:
"The load average numbers give the number of jobs in the run queue (state R)
or waiting for disk I/O (state D) averaged over 1, 5, and 15 minutes."
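For what it's worth, here is a minimal Python sketch of reading those numbers yourself. It assumes a Linux box with /proc/loadavg in the usual five-field layout; the function name parse_loadavg and the dict keys are just names I made up for illustration.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dict (hypothetical helper)."""
    fields = text.split()
    # Fourth field looks like "3/412": runnable scheduling entities right now,
    # then the total number of scheduling entities that exist.
    running, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),    # 1 minute average
        "load5": float(fields[1]),    # 5 minute average
        "load15": float(fields[2]),   # 15 minute average
        "runnable_now": int(running),
        "total_tasks": int(total),
    }


if __name__ == "__main__":
    path = "/proc/loadavg"  # assumption: Linux only
    if os.path.exists(path):
        with open(path) as f:
            info = parse_loadavg(f.read())
        # Comparing the load to the CPU count gives a rough sense of scale.
        print(info, "cpus:", os.cpu_count() or 1)
```

A load of 20 on a 2-CPU box means something quite different from a load of 20 on a 32-CPU box, which is why the sketch prints the CPU count next to the averages.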
In a nutshell, a high load average usually comes down to either CPU contention or IO waits.
First, open up 'top' and look for a large number of jobs eating up CPU. That
can indicate the issue.
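If you would rather count than eyeball 'top', something like the following rough sketch tallies processes by state, so you can see whether the load is mostly R (wanting CPU) or D (waiting on IO). It assumes Linux and the /proc/[pid]/stat layout of "pid (comm) state ..."; the helper names are mine, and it counts processes only, not individual threads.

```python
import glob
from collections import Counter


def state_from_stat(text):
    """Return the one-letter state from the contents of /proc/[pid]/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so take whatever follows the LAST ')'.
    rest = text[text.rfind(")") + 1:].split()
    return rest[0]


def count_states(stat_texts):
    """Tally states (R, D, S, ...) over an iterable of stat file contents."""
    return Counter(state_from_stat(t) for t in stat_texts)


def read_all_stats():
    """Read every /proc/[pid]/stat that is still around (Linux only)."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                texts.append(f.read())
        except OSError:
            pass  # the process exited while we were looking
    return texts


if __name__ == "__main__":
    counts = count_states(read_all_stats())
    print("runnable (R):", counts["R"], "uninterruptible (D):", counts["D"])
```

This is a single snapshot, so run it a few times; a steady pile of D-state processes points at IO rather than CPU.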
If there aren't 20+ jobs all trying to eat up the CPU (as is often the case)
then you are most likely having IO issues. Keep in mind that IO issues can be
caused by LOTS of things. Here is what to ask:
1) Do I have any network attached storage? If so, are there a lot of processes
trying to access these shares?
2) Is my disk subsystem behaving correctly (use 'iostat -x 10' to find out)?
3) Do I have enough IOPS performance in my disk subsystem to satisfy demand?
A good way to test this is using iostat, and also looking at the iowait CPU
time (the 'wa' value in top).
Depending on your answers, the solutions vary.
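As a rough illustration of the iowait check from question 3, this Python sketch samples /proc/stat twice and reports what share of CPU time went to iowait in between. It assumes Linux and the usual field order on the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the 1 second interval are arbitrary choices of mine, not anything official.

```python
import os
import time


def parse_cpu_line(text):
    """Return the first 8 counters from the aggregate 'cpu' line of /proc/stat."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            # Stop at steal: guest time is already included in user time,
            # so adding the guest fields would count it twice.
            return [int(v) for v in parts[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


if __name__ == "__main__":
    path = "/proc/stat"  # assumption: Linux only
    if os.path.exists(path):
        with open(path) as f:
            first = parse_cpu_line(f.read())
        time.sleep(1.0)
        with open(path) as f:
            second = parse_cpu_line(f.read())
        print("iowait: %.1f%%" % iowait_percent(first, second))
```

Treat the number as a hint only. A high iowait share alongside busy devices in 'iostat -x 10' suggests the disks are the bottleneck, but a low one does not by itself rule out IO trouble.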