Linux "top" command: What are us, sy, ni, id, wa, hi, si and st (for CPU usage)?
hi is the time spent processing hardware interrupts. Hardware interrupts are generated by hardware devices (network cards, keyboard controller, external timer, hardware sensors, ...) when they need to signal something to the CPU (data has arrived, for example).
Since these can happen very frequently, and since they essentially block the current CPU while they are running, kernel hardware interrupt handlers are written to be as fast and simple as possible.
If long or complex processing needs to be done, these tasks are deferred using a mechanism called softirqs. These are scheduled independently, can run on any CPU, and can even run concurrently (none of that is true of hardware interrupt handlers).
The part about hard IRQs blocking the current CPU and the part about softirqs being able to run anywhere are not exactly correct: there can be limitations, and some hard IRQs can interrupt others.
As an example, a "data received" hardware interrupt from a network card could simply store the information "card ethX needs to be serviced" somewhere and schedule a softirq. The softirq would be the thing that triggers the actual packet routing.
si represents the time spent in these softirqs.
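To see which softirqs are actually firing on a given machine, the kernel exposes per-CPU counters in /proc/softirqs. The following is a minimal sketch, assuming the usual layout of that file (a header row of CPU names, then one `NAME: count count ...` row per softirq type). The function names `parse_softirqs` and `read_softirq_totals` are made up for this illustration. Note that these are event counts, not time, so they only hint at where the si percentage comes from.

```python
def parse_softirqs(text):
    """Return {softirq_name: total count across all CPUs}.

    Assumes the /proc/softirqs layout: one header line listing the CPUs,
    followed by rows such as "NET_RX:  100  250".
    """
    lines = text.strip().splitlines()
    totals = {}
    for line in lines[1:]:  # skip the "CPU0 CPU1 ..." header
        name, _, counts = line.partition(":")
        totals[name.strip()] = sum(int(c) for c in counts.split())
    return totals


def read_softirq_totals(path="/proc/softirqs"):
    """Convenience wrapper that reads the live file (Linux only)."""
    with open(path) as f:
        return parse_softirqs(f.read())
```

Calling `read_softirq_totals()` twice a few seconds apart and comparing the results shows which softirq types (NET_RX, TIMER, and so on) are growing fastest.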
A good read about the softirq mechanism (with a bit of history too) is Matthew Wilcox's "I'll Do It Later: Softirqs, Tasklets, Bottom Halves, Task Queues, Work Queues and Timers" (PDF, 64k).
st, "steal time", is only relevant in virtualized environments. It represents time when the real CPU was not available to the current virtual machine: it was "stolen" from that VM by the hypervisor (either to run another VM, or for its own needs).
The CPU time accounting document from IBM has more information about steal time, and CPU accounting in virtualized environments. (It's aimed at zSeries type hardware, but the general idea is the same for most platforms.)
- us - Time spent in user space
- sy - Time spent in kernel space
- ni - Time spent running niced user processes (user-defined priority)
- id - Time spent idle
- wa - Time spent waiting on I/O peripherals (e.g. disk)
- hi - Time spent handling hardware interrupt routines. (Whenever a peripheral unit wants attention from the CPU, it literally pulls a line to signal the CPU to service it)
- si - Time spent handling software interrupt routines. (a piece of code calls an interrupt routine...)
- st - Time spent in involuntary waits by a virtual CPU while the hypervisor is servicing another processor (stolen from a virtual machine)
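These eight values can be reproduced from the aggregate `cpu` line of /proc/stat, which holds cumulative tick counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The sketch below is only an illustration of that arithmetic, not how top itself is implemented; the helper names (`parse_cpu_line`, `percentages`, `sample`) are invented here, and it assumes a kernel recent enough to report all eight counters.

```python
# top's labels, in the order the counters appear on the "cpu" line
# of /proc/stat: user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into {label: ticks}."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Only the first eight counters are used; the guest counters that
    # follow are already included in user/nice time.
    return dict(zip(FIELDS, map(int, parts[1:9])))


def percentages(before, after):
    """Share of each state between two samples, in percent."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


def sample(path="/proc/stat"):
    """Read the current counters from the live file (Linux only)."""
    with open(path) as f:
        return parse_cpu_line(f.readline())
```

Taking `a = sample()`, sleeping for a second, taking `b = sample()` and calling `percentages(a, b)` gives numbers comparable to the `%Cpu(s)` line in top.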
The "st" value can be explained simply by using a t2.micro EC2 instance from AWS.
In the AWS documentation you can read that you get only a 10% baseline performance per vCPU. This means that if you have a process which would consume a lot of CPU time, the "st" value will stay around 90, since you are only allowed to use 10% of the vCPU. The sum of the other values will stay around 10.
So AWS is using the hypervisor to only allow you access to a certain amount of computing power. It slows you down intentionally, since you are using a low-tier instance type.
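The arithmetic behind those numbers can be written down as a deliberately simplified model. This is an assumption-laden sketch, not AWS's actual scheduling logic: it treats the baseline as a hard cap on one vCPU, ignores any burst allowance, and the function name `expected_split` is made up for the illustration.

```python
def expected_split(demand_pct, cap_pct):
    """Simplified model of one capped vCPU.

    demand_pct: how much CPU the guest wants to use (0-100).
    cap_pct:    how much the hypervisor lets it use (assumed hard cap).
    Returns (used, steal, idle) in percent.
    """
    used = min(demand_pct, cap_pct)   # what the guest actually gets
    steal = demand_pct - used         # wanted to run, but was held back
    idle = 100.0 - demand_pct         # the guest had nothing to run
    return used, steal, idle
```

With a fully busy process and a 10% cap, `expected_split(100.0, 10.0)` gives 10% used and 90% steal, matching the description above. A mostly idle guest, such as `expected_split(5.0, 10.0)`, shows no steal at all, because steal only accrues while the VM has work it is not allowed to run.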
I hope this makes things a little bit easier to understand.