========
CPU load
========

Linux exports various bits of information via ``/proc/stat`` and
``/proc/uptime`` that userland tools, such as top(1), use to calculate
the average time the system spent in a particular state, for example::

    $ iostat
    Linux 2.6.18.3-exp (linmac)     02/20/2007

    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
              10.01    0.00    2.92    5.44    0.00   81.63

    ...

Here the system thinks that, over the default sampling period, it
spent 10.01% of the time doing work in user space and 2.92% in the
kernel, and was idle 81.63% of the time.
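As a rough illustration of how such percentages can be derived, the
sketch below takes two snapshots of the aggregate ``cpu`` line of
``/proc/stat`` and computes the share of one counter in the ticks
accumulated between them. The helper names ``parse_cpu_line`` and
``share`` and the sample numbers in the note that follows are made up
for this example. Only the first eight fields (user, nice, system,
idle, iowait, irq, softirq, steal) are considered, and real tools such
as iostat may differ in detail::

    #include <assert.h>
    #include <stdio.h>

    /* user nice system idle iowait irq softirq steal */
    #define NFIELDS 8

    /* Parse the aggregate "cpu" line of /proc/stat; 0 on success. */
    static int parse_cpu_line(const char *line,
                              unsigned long long v[NFIELDS])
    {
        int n = sscanf(line,
                       "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
                       &v[0], &v[1], &v[2], &v[3],
                       &v[4], &v[5], &v[6], &v[7]);

        return n == NFIELDS ? 0 : -1;
    }

    /*
     * Percentage of the ticks accumulated between snapshots a and b
     * that were charged to field idx. Assumes b was taken after a,
     * so every counter in b is at least as large as in a.
     */
    static double share(const unsigned long long a[NFIELDS],
                        const unsigned long long b[NFIELDS], int idx)
    {
        unsigned long long total = 0;
        int i;

        for (i = 0; i < NFIELDS; i++)
            total += b[i] - a[i];
        assert(total > 0);

        return 100.0 * (double)(b[idx] - a[idx]) / (double)total;
    }

For two hypothetical snapshots ``cpu  100 0 50 800 50 0 0 0`` and
``cpu  200 0 80 1600 120 0 0 0`` the deltas add up to 1000 ticks, so
``share(a, b, 0)`` (user) gives 10.0 and ``share(a, b, 3)`` (idle)
gives 80.0.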

In most cases the ``/proc/stat`` information reflects reality quite
closely. However, because of how and when the kernel collects this
data, sometimes it cannot be trusted at all.

So how is this information collected? Whenever a timer interrupt is
signalled, the kernel looks at what kind of task was running at that
moment and increments the counter that corresponds to this task's
kind/state. The problem is that the system could have switched between
various states multiple times between two timer interrupts, yet the
counter is incremented only for the last state.
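The effect can be reproduced with a toy model in user space. The
sketch below is not kernel code: ``simulate``, ``struct load`` and the
1000 microsecond ``TICK_US`` are assumptions made for illustration. A
periodic task first runs ``phase_us`` after time zero and stays busy
for ``busy_us`` out of every ``period_us``, while the accounting
charges each whole tick to whatever state it finds at the instant the
tick fires::

    #include <assert.h>

    /* Assumed tick period in microseconds (illustration only). */
    #define TICK_US 1000UL

    struct load {
        double sampled;  /* busy share seen by tick sampling, percent */
        double actual;   /* busy share really consumed, percent */
    };

    static struct load simulate(unsigned long phase_us,
                                unsigned long busy_us,
                                unsigned long period_us,
                                unsigned long nticks)
    {
        unsigned long hits = 0, k;
        struct load l;

        assert(nticks > 0 && period_us > 0 && busy_us <= period_us);

        for (k = 0; k < nticks; k++) {
            unsigned long now = k * TICK_US;  /* time of the k-th tick */

            /* Charge the whole tick as busy if the task runs right now. */
            if (now >= phase_us &&
                (now - phase_us) % period_us < busy_us)
                hits++;
        }

        l.sampled = 100.0 * hits / nticks;
        l.actual = 100.0 * busy_us / period_us;
        return l;
    }

With these assumptions, ``simulate(1, 990, 1000, 1000)`` reports a
sampled load of 0% for a task that actually uses 99% of the CPU, while
``simulate(0, 10, 1000, 1000)`` reports 100% for a task that uses only
1%.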


Example
-------

If we imagine the system with one task that periodically burns cycles
in the following manner::

     time line between two timer interrupts
    |--------------------------------------|
     ^                                    ^
     |_ something begins working          |
                                          |_ something goes to sleep
                                         (only to be awakened quite soon)

In the above situation the system will be 0% loaded according to
``/proc/stat`` (since the timer interrupt will always happen when the
system is executing the idle handler), but in reality the load is
closer to 99%.

One can imagine many more situations where this behavior of the kernel
will lead to quite erratic information inside ``/proc/stat``::


    /* gcc -o hog smallhog.c */
    #include <time.h>
    #include <limits.h>
    #include <signal.h>
    #include <sys/time.h>
    #define HIST 10

    static volatile sig_atomic_t stop;

    /* SIGALRM handler: tell hog() to stop spinning. */
    static void sighandler(int signr)
    {
        (void) signr;
        stop = 1;
    }

    /*
     * Spin until niters reaches zero or a SIGALRM arrives;
     * return the number of iterations left.
     */
    static unsigned long hog (unsigned long niters)
    {
        stop = 0;
        while (!stop && --niters);
        return niters;
    }

    int main (void)
    {
        int i;
        /* Ask for SIGALRM every microsecond, i.e. as often as the
         * timer resolution permits. */
        struct itimerval it = {
            .it_interval = { .tv_sec = 0, .tv_usec = 1 },
            .it_value    = { .tv_sec = 0, .tv_usec = 1 } };
        sigset_t set;
        unsigned long v[HIST];
        double tmp = 0.0;
        unsigned long n;

        signal(SIGALRM, &sighandler);
        setitimer(ITIMER_REAL, &it, NULL);

        /* Synchronise with the first signal, then measure how many
         * iterations fit between two consecutive signals. */
        hog (ULONG_MAX);
        for (i = 0; i < HIST; ++i) v[i] = ULONG_MAX - hog(ULONG_MAX);
        for (i = 0; i < HIST; ++i) tmp += v[i];
        tmp /= HIST;
        /* Burn only two thirds of that interval. */
        n = tmp - (tmp / 3.0);

        sigemptyset(&set);
        sigaddset(&set, SIGALRM);

        /* Work right after each signal, then sleep until the next one. */
        for (;;) {
            hog(n);
            sigwait(&set, &i);
        }
        return 0;
    }


References
----------

- http://lkml.org/lkml/2007/2/12/6
- Documentation/filesystems/proc.rst (1.8)


Thanks
------

Con Kolivas, Pavel Machek