UNIX Load Averages Explained

If you’ve spent much time working in a UNIX environment, you’ve probably seen the load averages more than a few times.

load averages: 2.43, 2.96, 3.41

I have to admit that even in my sysadmin days I didn’t fully understand what these numbers were, but Zach did some digging a while ago to understand where they come from.

In his blog entry from late last year, Zach sums it up quite nicely:

In short it is the average sum of the number of processes waiting in the run-queue plus the number currently executing over 1, 5, and 15 minute time periods.

The formula is a bit more complicated than that, but this serves well as a functional definition. Zach provides a bit more detail in his article and also points to Dr. Neil Gunther’s article, which covers the topic in as much depth as anyone could ever ask for.
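As a rough illustration of why the real formula is more involved than a plain average: Linux kernels, for one, keep each figure as an exponentially damped moving average, folding in the current count of active processes every five seconds. The sketch below assumes that scheme. The function names and the sample workload are hypothetical, floating point stands in for the kernel's fixed-point arithmetic, and other UNIX flavors may differ in the details.

```python
import math

# Assumption: Linux-style sampling, one sample every 5 seconds.
SAMPLE_INTERVAL = 5.0


def update_load(previous, active, window_seconds):
    """Fold one sample of the active process count into a damped average."""
    decay = math.exp(-SAMPLE_INTERVAL / window_seconds)
    return previous * decay + active * (1.0 - decay)


def simulate(samples, window_seconds, start=0.0):
    """Run a series of process-count samples through the damped average."""
    load = start
    for active in samples:
        load = update_load(load, active, window_seconds)
    return load


if __name__ == "__main__":
    # Hypothetical workload: 3 active processes at every sample for one minute
    # (12 samples at 5 second intervals), starting from an idle system.
    one_minute_of_samples = [3] * 12
    for label, window in (("1 min", 60), ("5 min", 300), ("15 min", 900)):
        print(label, round(simulate(one_minute_of_samples, window), 2))
```

Under these assumptions, a full minute of three busy processes only brings the one minute figure to roughly 1.9, and the five and fifteen minute figures lag much further behind. That lag is why the three numbers read as a trend rather than three independent measurements.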

So what does this mean about your system?

Well, for a quick example let’s consider the output below. The load average of a system can typically be found by running top or uptime and users typically don’t need any special privileges for these commands.

load averages: 2.43, 2.96, 3.41

Here we see the one minute load average is 2.43, five minute is 2.96, and fifteen minute load average is 3.41.

Here are some conclusions we can draw from this.

  • On average, over the past minute there have been 2.43 processes running or waiting for a resource.
  • Overall, the load is trending down: the average number of processes running or waiting over the past minute (2.43) is lower than over the past 5 minutes (2.96) and 15 minutes (3.41).
  • This system is busy, but we cannot conclude how busy solely from load averages.
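The trend reading above is easy to script. Here is a minimal sketch in Python: os.getloadavg() is a real standard library call that returns the same three numbers uptime prints on UNIX systems, while describe_trend and its three labels are hypothetical names chosen for this example.

```python
import os


def describe_trend(one, five, fifteen):
    """Classify the load trend from the 1, 5, and 15 minute averages."""
    if one < five < fifteen:
        return "falling"
    if one > five > fifteen:
        return "rising"
    return "mixed"


if __name__ == "__main__":
    try:
        one, five, fifteen = os.getloadavg()
    except (AttributeError, OSError):
        # Not available on this platform; fall back to the sample values above.
        one, five, fifteen = 2.43, 2.96, 3.41
    print(f"load averages: {one:.2f}, {five:.2f}, {fifteen:.2f}")
    print("trend:", describe_trend(one, five, fifteen))
```

For the sample values 2.43, 2.96, 3.41, describe_trend reports "falling", which matches the down-trend noted in the list.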

It is important to mention here that the load average does not take into account the number of processors. Another critical detail is that processes could be waiting on any number of things, including CPU, disk, or network.

What we do know is that a system with a load average significantly higher than its number of CPUs is probably pretty busy, or bogged down by some bottleneck. Conversely, a system with a load average significantly lower than its number of CPUs is probably doing just fine.
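To make that rule of thumb concrete, here is a small sketch that divides a load average by the CPU count. The function names and the margin of 0.5 are arbitrary choices for illustration (the rule only says "significantly" higher or lower), so treat the verdicts as a starting point rather than a diagnosis.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def assess(load_average, cpu_count, margin=0.5):
    """Give a rough verdict; 'margin' is an assumed, illustrative threshold."""
    ratio = load_per_cpu(load_average, cpu_count)
    if ratio >= 1.0 + margin:
        return "probably busy or bottlenecked"
    if ratio <= 1.0 - margin:
        return "probably fine"
    return "worth a closer look"


if __name__ == "__main__":
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    sample_load = 2.43          # one minute figure from the example output
    print(f"{sample_load} on {cpus} CPU(s): {assess(sample_load, cpus)}")
```

With these thresholds, a load of 2.43 looks busy on a single CPU and comfortable on eight. Either way, the load average alone cannot say whether the waiting is on CPU, disk, or network.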



19 thoughts on “UNIX Load Averages Explained”

  1. Jon,

    Thanks for this one. It’s very timely from a personal point of view because I’ve spent the past couple of months looking at Run Queue stats for a paper I’ve been working on. Now I have some place else to send people when they ask why these numbers are significant.

    Cheers,

    Doug

  2. Okay, so I am looking at top and it tells me that my CPU is 75% user, 5% system, and 20% idle, and the load averages are all in the neighborhood of 25. I believe that this means that my system is not yet CPU bound. I believe that there are 25 processes which each are satisfied with about 3% of the CPU. This makes sense to me because we’re streaming content across the web and most of the time is spent waiting for bits to fly across the planet and back.

    Jeff

  3. Jeff,

    Needless to say there are a ton of factors that play into this, but from what you’ve mentioned here I would agree. Sounds like your CPU is fine (if it is consistently around 20% idle) and things are backing up waiting for network or other I/O.

    Disk statistics, network statistics, and memory statistics would fill out the picture, but the real question is, are users complaining?

    You may want to do the math on how many bits per second you’re pushing and how big the network pipe you’re pushing them through is. If you’re already pushing 10 megabytes/second (80 megabits/second) through 100 megabit Ethernet, you’re probably close to the maximum for the network.

  4. Thanks for the explanation – I always wondered exactly what those numbers could tell me.

    I’m also happy to say that my Linux machines appear to be doing fine. 🙂

  5. hi all,

    I have one doubt about the uptime output. The load average shows something like 3.29. What should I understand by “3.29”? Does it mean 3.29 processes? If so, how can a number of processes be written with a decimal point? Please be specific about the 3 and the 29.

    Thanks in advance
    Aravind

  6. Well, it’s a start. Many claim to explain the LALALA, but most will just say what it is. This note tries to explain what it means: the 1 minute average may tell you what the current load is and combined with the 5 or 15 minute load, you see a trend. But why would you need all three?

  7. If the load averages are greater than the number of CPUs,
    you can check CPU, memory, and I/O from top. Whichever is busiest is likely the bottleneck.

  8. Hi,

    I have a Linux server and a few remote PCs that display data using a browser. There are times (too many) when the browser displays the message “Server took too long to respond, timed out”.

    So I have started to look at how the server is performing using the “top” command.

    The load average states: 0.97, 0.45, 0.32. Would this in any way suggest that the server is having issues?

    Other stats if helpful:

    CPU 0.7%us, 0.5%sy, 0.0%ni, 98.3%id, 0.2%wa, 0.8%hi, 0.2%si, 0.0%st

    Any advice you could provide would be great.

    Cheers,

    Dereck
