When you run uptime or top, three load average numbers are displayed: the 1-, 5-, and 15-minute load averages. In the following example, my computer's 1-minute load average is 1.11.
▶ uptime
22:31:24 up 13 days, 14:32, 1 user, load average: 1.11, 0.56, 0.31
For years, I interpreted these numbers in relative terms. Whenever a high number showed up and the system was still responsive, I filed that value away as "normal". But that can't be right. There has to be a more scientific way of explaining the numbers. It turns out the answer is right in the man page of uptime.
System load averages is the average number of processes that are either in a runnable or uninterruptable state. A process in a runnable state is either using the CPU or waiting to use the CPU. A process in uninterruptable state is waiting for some I/O access, eg waiting for disk. The averages are taken over the three time intervals. Load averages are not normalized for the number of CPUs in a system, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.
man uptime
For a generic workload, let's simplify the calculation and assume the load average is computed solely from CPU usage. On a single-core system, a load average of 1 means the CPU was completely busy for that period of time. On a 16-core system, a load average of 16 carries the same performance expectation. When interpreting the numbers on a monitoring system or setting up thresholds, we can divide the load average by the number of cores and multiply by 100%.
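As a rough sketch of that normalization, here is how it could look in Python, using os.getloadavg() and os.cpu_count() (available on Linux and macOS):

import os

load1, load5, load15 = os.getloadavg()
cores = os.cpu_count()
# Express the 1-minute load as a percentage of total CPU capacity.
print(f"1-min load: {load1:.2f} on {cores} cores = {load1 / cores * 100:.0f}% busy")

On a 16-core machine, a load average of 1.11 works out to roughly 7%, while the same number on a single-core machine means the CPU is more than fully occupied.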
The load average is actually influenced by more than just CPU. I/O, network activity, interrupts, or any other busy resource that prevents processes from completing all contribute. A process can be stuck waiting for some other resource to respond. On some VMs, even a depleted random number pool can leave processes waiting.
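You can get a feel for this by looking at which processes are currently runnable ("R") or in uninterruptible sleep ("D"), since both count toward the load average. Here is a rough sketch that scans /proc on Linux, relying on the /proc/<pid>/stat layout documented in proc(5):

import os

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except FileNotFoundError:
        continue  # the process exited while we were scanning
    # The command name is wrapped in parentheses and may contain spaces,
    # so locate the single-character state field relative to the closing paren.
    state = data[data.rindex(")") + 2]
    if state in ("R", "D"):
        comm = data[data.index("(") + 1 : data.rindex(")")]
        print(pid, comm, state)

If a system feels sluggish with a high load average but low CPU usage, the "D" entries in a listing like this usually point at whatever resource the processes are stuck waiting on.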
That said, the load average is a fairly holistic assessment of how busy a system is.