^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Scheduler Statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) Version 15 of schedstats dropped three sched_yield() counters:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) identical to version 14.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) Version 14 of schedstats includes support for sched_domains, which hit the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) mainline kernel in 2.6.20, although it is identical to the stats from version
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) 12, which was in the kernel from 2.6.13 to 2.6.19 (version 13 never saw a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) kernel release). Some counters make more sense per-runqueue; others
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) per-domain. Note that domains (and their associated information) are only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) pertinent and available on machines built with CONFIG_SMP.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) In version 14 of schedstat, there is at least one level of domain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) statistics for each cpu listed, and there may well be more than one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) domain. Domains have no particular names in this implementation, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) the highest numbered one typically arbitrates balancing across all the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) cpus on the machine, while domain0 is the most tightly focused domain,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) sometimes balancing only between pairs of cpus. At this time, there
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) are no architectures which need more than three domain levels. The first
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) field in the domain stats is a bit mask indicating which cpus are affected
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) by that domain.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) These fields are counters, and only increment. Programs which make use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) of these will need to start with a baseline observation and then calculate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) the change in the counters at each subsequent observation. A Perl script
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) which does this for many of the fields is available at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) http://eaglet.pdxhosts.com/rick/linux/schedstat/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) Note that any such script will necessarily be version-specific, as the main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) reason to change versions is changes in the output format. For those wishing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) to write their own scripts, the fields are described here.
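
The baseline-and-delta approach described above can be sketched in Python
as follows. This is only a minimal illustration, not the Perl script
referenced above. The helper names parse_snapshot and delta are hypothetical,
and the sketch assumes the whitespace-separated cpu<N> and domain<N> line
layout described in the sections below.

```python
def parse_snapshot(text):
    """Map each cpu<N> line, and each domain<N> line under it, to its counters.

    Keys look like "cpu0" or "cpu0/domain1". Any other line is skipped.
    """
    counters = {}
    current_cpu = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        if name.startswith("cpu"):
            current_cpu = name
            counters[name] = [int(v) for v in parts[1:]]
        elif name.startswith("domain") and current_cpu is not None:
            # parts[1] is the cpumask, which is not a counter.
            counters[current_cpu + "/" + name] = [int(v) for v in parts[2:]]
    return counters


def delta(baseline, current):
    """Per-counter change between two snapshots, for keys present in both."""
    return {
        key: [now - before for before, now in zip(baseline[key], current[key])]
        for key in current
        if key in baseline
    }
```

A caller would read /proc/schedstat once to obtain the baseline, read it
again at each later observation, and pass both parsed snapshots to delta().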
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) CPU statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) cpu<N> 1 2 3 4 5 6 7 8 9
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) First field is a sched_yield() statistic:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) 1) # of times sched_yield() was called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) Next three are schedule() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) 2) This field is a legacy array expiration count field used in the O(1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) scheduler. We kept it for ABI compatibility, but it is always set to zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) 3) # of times schedule() was called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) 4) # of times schedule() left the processor idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) Next two are try_to_wake_up() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) 5) # of times try_to_wake_up() was called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) 6) # of times try_to_wake_up() was called to wake up the local cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) Next three are statistics describing scheduling latency:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) 7) sum of all time spent running by tasks on this processor (in nanoseconds)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) 8) sum of all time spent waiting to run by tasks on this processor (in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) nanoseconds)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) 9) # of timeslices run on this cpu
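
As an illustration of the nine-field layout above, a cpu<N> line could be
split into named fields roughly like this. The tuple name CpuStats and its
field names are chosen for this sketch only and are not part of the
schedstats interface.

```python
from collections import namedtuple

# Field names are illustrative; the order follows fields 1-9 above.
CpuStats = namedtuple("CpuStats", [
    "yld_count",      # 1) sched_yield() calls
    "legacy_unused",  # 2) legacy O(1) scheduler field, always zero
    "sched_count",    # 3) schedule() calls
    "sched_goidle",   # 4) schedule() left the processor idle
    "ttwu_count",     # 5) try_to_wake_up() calls
    "ttwu_local",     # 6) try_to_wake_up() calls that woke the local cpu
    "run_time",       # 7) total time tasks spent running
    "wait_time",      # 8) total time tasks spent waiting to run
    "timeslices",     # 9) timeslices run on this cpu
])


def parse_cpu_line(line):
    """Return (cpu name, CpuStats) for one cpu<N> line."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu<N> line: %r" % line)
    values = [int(v) for v in parts[1:]]
    if len(values) != len(CpuStats._fields):
        raise ValueError("expected %d fields, got %d"
                         % (len(CpuStats._fields), len(values)))
    return parts[0], CpuStats(*values)
```

Rejecting lines with an unexpected field count is a deliberate choice here:
it makes a format change between schedstats versions fail loudly rather than
silently shifting the fields.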
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) Domain statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) One of these is produced per domain for each cpu described. (Note that if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) CONFIG_SMP is not defined, *no* domains are utilized and these lines
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) will not appear in the output.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) The first field is a bit mask indicating what cpus this domain operates over.
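
A sketch of decoding that bit mask into a list of cpu numbers is shown
below. It assumes the mask is printed in hexadecimal, possibly as
comma-separated groups on machines with many cpus; that formatting is an
assumption of this example, and cpus_in_mask is a hypothetical helper name.

```python
def cpus_in_mask(mask):
    """Return the cpu numbers whose bits are set in a hex cpumask string."""
    # Assumption: commas only separate groups of hex digits.
    value = int(mask.replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus
```

Under these assumptions, a mask of "3" would describe a domain spanning
cpu0 and cpu1.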
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) The next 24 are a variety of load_balance() statistics grouped into types
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) of idleness (idle, busy, and newly idle):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) 1) # of times in this domain load_balance() was called when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) cpu was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) 2) # of times in this domain load_balance() checked but found
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) the load did not require balancing when the cpu was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) 3) # of times in this domain load_balance() tried to move one or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) more tasks and failed, when the cpu was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) 4) sum of imbalances discovered (if any) with each call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) load_balance() in this domain when the cpu was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) 5) # of times in this domain pull_task() was called when the cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) 6) # of times in this domain pull_task() was called even though
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) the target task was cache-hot when idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) 7) # of times in this domain load_balance() was called but did
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) not find a busier queue while the cpu was idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) 8) # of times in this domain a busier queue was found while the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) cpu was idle but no busier group was found
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) 9) # of times in this domain load_balance() was called when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) cpu was busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) 10) # of times in this domain load_balance() checked but found the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) load did not require balancing when busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) 11) # of times in this domain load_balance() tried to move one or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) more tasks and failed, when the cpu was busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 12) sum of imbalances discovered (if any) with each call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) load_balance() in this domain when the cpu was busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 13) # of times in this domain pull_task() was called when busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 14) # of times in this domain pull_task() was called even though the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) target task was cache-hot when busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 15) # of times in this domain load_balance() was called but did not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) find a busier queue while the cpu was busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 16) # of times in this domain a busier queue was found while the cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) was busy but no busier group was found
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 17) # of times in this domain load_balance() was called when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) cpu was just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) 18) # of times in this domain load_balance() checked but found the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) load did not require balancing when the cpu was just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) 19) # of times in this domain load_balance() tried to move one or more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) tasks and failed, when the cpu was just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 20) sum of imbalances discovered (if any) with each call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) load_balance() in this domain when the cpu was just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) 21) # of times in this domain pull_task() was called when newly idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) 22) # of times in this domain pull_task() was called even though the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) target task was cache-hot when just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 23) # of times in this domain load_balance() was called but did not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) find a busier queue while the cpu was just becoming idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 24) # of times in this domain a busier queue was found while the cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) was just becoming idle but no busier group was found
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) Next three are active_load_balance() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 25) # of times active_load_balance() was called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 26) # of times active_load_balance() tried to move a task and failed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 27) # of times active_load_balance() successfully moved a task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) Next three are sched_balance_exec() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) 28) sbe_cnt is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 29) sbe_balanced is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 30) sbe_pushed is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) Next three are sched_balance_fork() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 31) sbf_cnt is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) 32) sbf_balanced is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) 33) sbf_pushed is not used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) Next three are try_to_wake_up() statistics:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) 34) # of times in this domain try_to_wake_up() awoke a task that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) last ran on a different cpu in this domain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 35) # of times in this domain try_to_wake_up() moved a task to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) waking cpu because it was cache-cold on its own cpu anyway
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 36) # of times in this domain try_to_wake_up() started passive balancing
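
The 36-field layout above (three groups of eight load_balance() counters,
then the active_load_balance(), unused, and try_to_wake_up() fields) could be
unpacked along these lines. All dictionary keys and the function name
parse_domain_line are illustrative names invented for this sketch.

```python
# Order of the eight counters within each idleness group (fields 1-8,
# 9-16 and 17-24 above). Names are illustrative only.
LB_FIELDS = [
    "calls", "balanced", "failed", "imbalance",
    "pulled", "pulled_hot", "no_busier_queue", "no_busier_group",
]
IDLE_TYPES = ["idle", "busy", "newly_idle"]


def parse_domain_line(line):
    """Return (domain name, nested dict of counters) for one domain<N> line."""
    parts = line.split()
    # domain<N>, the cpumask, then 36 counters.
    if len(parts) != 38 or not parts[0].startswith("domain"):
        raise ValueError("not a 36-field domain<N> line: %r" % line)
    values = [int(v) for v in parts[2:]]
    stats = {"cpumask": parts[1]}
    for i, idle_type in enumerate(IDLE_TYPES):
        stats[idle_type] = dict(zip(LB_FIELDS, values[i * 8:(i + 1) * 8]))
    # Fields 25-27: active_load_balance().
    stats["active_load_balance"] = dict(
        zip(["calls", "failed", "moved"], values[24:27]))
    # Fields 28-33 (sbe_* and sbf_*) are documented as unused and skipped.
    # Fields 34-36: try_to_wake_up().
    stats["try_to_wake_up"] = dict(
        zip(["remote_wakeups", "moved_cache_cold", "passive_balance"],
            values[33:36]))
    return parts[0], stats
```

Grouping by idleness type first keeps the three sets of otherwise identical
counters directly comparable, for example stats["idle"]["failed"] against
stats["busy"]["failed"].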
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) /proc/<pid>/schedstat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) schedstats also adds a new /proc/<pid>/schedstat file to include some of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) the same information on a per-process level. This file has three fields
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) which, for that process, correspond to:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) 1) time spent on the cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) 2) time spent waiting on a runqueue
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) 3) # of timeslices run on this cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) A program could easily be written to make use of these extra fields to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) report on how well a particular process or set of processes is faring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) under the scheduler's policies. A simple version of such a program is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) available at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) http://eaglet.pdxhosts.com/rick/linux/schedstat/v12/latency.c
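
For readers who want a starting point, the three per-process fields could be
read and compared between two observations roughly as below. This is a
hedged sketch and not the latency.c program referenced above; the function
names are hypothetical.

```python
def parse_pid_schedstat(text):
    """Split the file contents into (on_cpu, waiting, timeslices)."""
    on_cpu, waiting, timeslices = (int(v) for v in text.split())
    return on_cpu, waiting, timeslices


def read_pid_schedstat(pid):
    """Read and parse /proc/<pid>/schedstat for the given process id."""
    with open("/proc/%d/schedstat" % pid) as f:
        return parse_pid_schedstat(f.read())


def avg_wait_per_timeslice(before, after):
    """Average runqueue wait per timeslice between two observations."""
    d_wait = after[1] - before[1]
    d_slices = after[2] - before[2]
    # No timeslices ran in the interval, so there is nothing to average.
    return d_wait / d_slices if d_slices else 0.0
```

Calling read_pid_schedstat() twice with a pause in between, then passing
both results to avg_wait_per_timeslice(), gives a rough indication of how
long the process waits on a runqueue each time it is scheduled.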