Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) Capacity Aware Scheduling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) 1. CPU Capacity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 1.1 Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) Conventional, homogeneous SMP platforms are composed of purely identical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) CPUs. Heterogeneous platforms on the other hand are composed of CPUs with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) different performance characteristics - on such platforms, not all CPUs can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) considered equal.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) CPU capacity is a measure of the performance a CPU can reach, normalized against
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) the most performant CPU in the system. Heterogeneous systems are also called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) asymmetric CPU capacity systems, as they contain CPUs of different capacities.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) Disparity in maximum attainable performance (IOW in maximum CPU capacity) stems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) from two factors:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) - not all CPUs may have the same microarchitecture (µarch).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) - with Dynamic Voltage and Frequency Scaling (DVFS), not all CPUs may be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25)   physically able to attain the higher Operating Performance Points (OPP).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) Arm big.LITTLE systems are an example of both. The big CPUs are more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) performance-oriented than the LITTLE ones (more pipeline stages, bigger caches,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) smarter predictors, etc), and can usually reach higher OPPs than the LITTLE ones
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) can.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) CPU performance is usually expressed in Millions of Instructions Per Second
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) (MIPS), which can also be expressed as a given amount of instructions attainable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) per Hz, leading to::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36)   capacity(cpu) = work_per_hz(cpu) * max_freq(cpu)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 1.2 Scheduler terms
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) Two different capacity values are used within the scheduler. A CPU's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) ``capacity_orig`` is its maximum attainable capacity, i.e. its maximum
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) attainable performance level. A CPU's ``capacity`` is its ``capacity_orig`` from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) which some loss of available performance (e.g. time spent handling IRQs) is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) subtracted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) Note that a CPU's ``capacity`` is solely intended to be used by the CFS class,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) while ``capacity_orig`` is class-agnostic. The rest of this document will use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) the term ``capacity`` interchangeably with ``capacity_orig`` for the sake of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) brevity.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) 1.3 Platform examples
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 1.3.1 Identical OPPs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) ~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) Consider a hypothetical dual-core asymmetric CPU capacity system where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) - work_per_hz(CPU0) = W
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) - work_per_hz(CPU1) = W/2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) - all CPUs are running at the same fixed frequency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) By the above definition of capacity:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) - capacity(CPU0) = C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) - capacity(CPU1) = C/2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) To draw the parallel with Arm big.LITTLE, CPU0 would be a big while CPU1 would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) be a LITTLE.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) With a workload that periodically does a fixed amount of work, you will get an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) execution trace like so::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75)  CPU0 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76)            |     ____                ____                ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77)            |    |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80)  CPU1 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81)            |     _________           _________           ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82)            |    |         |         |         |         |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) CPU0 has the highest capacity in the system (C), and completes a fixed amount of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) work W in T units of time. On the other hand, CPU1 has half the capacity of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) CPU0, and thus only completes W/2 in T.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) 1.3.2 Different max OPPs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) ~~~~~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) Usually, CPUs of different capacity values also have different maximum
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) OPPs. Consider the same CPUs as above (i.e. same work_per_hz()) with:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) - max_freq(CPU0) = F
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) - max_freq(CPU1) = 2/3 * F
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) This yields:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) - capacity(CPU0) = C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) - capacity(CPU1) = C/3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 
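The capacity values in 1.3.1 and 1.3.2 can be checked with a few lines of arithmetic. The following is a minimal sketch, not kernel code: the function name ``relative_capacities`` and the unit values standing in for W and F are assumptions made for illustration only.

```python
from fractions import Fraction


def relative_capacities(work_per_hz, max_freq):
    """Return each CPU's capacity normalized against the most capable CPU."""
    raw = {
        cpu: Fraction(work_per_hz[cpu]) * Fraction(max_freq[cpu])
        for cpu in work_per_hz
    }
    top = max(raw.values())
    return {cpu: value / top for cpu, value in raw.items()}


# Hypothetical units: W = 1 unit of work per Hz, F = 1 unit of frequency.
WORK_PER_HZ = {"CPU0": Fraction(1), "CPU1": Fraction(1, 2)}

# 1.3.1: both CPUs run at the same fixed frequency.
identical_opps = relative_capacities(
    WORK_PER_HZ, {"CPU0": Fraction(1), "CPU1": Fraction(1)}
)

# 1.3.2: CPU1 tops out at 2/3 of CPU0's maximum frequency.
different_opps = relative_capacities(
    WORK_PER_HZ, {"CPU0": Fraction(1), "CPU1": Fraction(2, 3)}
)

print(identical_opps["CPU1"], different_opps["CPU1"])  # 1/2 1/3
```

Exact fractions are used instead of floats so that the C/2 and C/3 results come out without rounding noise.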
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) Executing the same workload as described in 1.3.1, with each CPU running at its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) maximum frequency results in::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)  CPU0 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)            |     ____                ____                ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)            |    |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)                             workload on CPU1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)  CPU1 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)            |     ______________      ______________      ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)            |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) 1.4 Representation caveat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) -------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) It should be noted that having a *single* value to represent differences in CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) performance is somewhat of a contentious point. The relative performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) difference between two different µarchs could be X% on integer operations, Y% on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) floating point operations, Z% on branches, and so on. Still, results using this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) simple approach have been satisfactory for now.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 2. Task utilization
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 2.1 Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) Capacity aware scheduling requires an expression of a task's requirements with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) regard to CPU capacity. Each scheduler class can express this differently, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) while task utilization is specific to CFS, it is convenient to describe it here
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) in order to introduce more generic concepts.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) Task utilization is a percentage meant to represent the throughput requirements
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) of a task. A simple approximation of it is the task's duty cycle, i.e.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)   task_util(p) = duty_cycle(p)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) On an SMP system with fixed frequencies, 100% utilization suggests the task is a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) busy loop. Conversely, 10% utilization hints it is a small periodic task that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) spends more time sleeping than executing. Variable CPU frequencies and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) asymmetric CPU capacities complicate this somewhat; the following sections will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) expand on these.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 2.2 Frequency invariance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) One issue that needs to be taken into account is that a workload's duty cycle is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) directly impacted by the current OPP the CPU is running at. Consider running a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) periodic workload at a given frequency F::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)   CPU work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)            |     ____                ____                ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)            |    |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) This yields duty_cycle(p) == 25%.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) Now, consider running the *same* workload at frequency F/2::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)   CPU work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)            |     _________           _________           ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)            |    |         |         |         |         |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) This yields duty_cycle(p) == 50%, despite the task having the exact same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) behaviour (i.e. executing the same amount of work) in both executions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) The task utilization signal can be made frequency invariant using the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) formula::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)   task_util_freq_inv(p) = duty_cycle(p) * (curr_frequency(cpu) / max_frequency(cpu))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) Applying this formula to the two examples above (with F being the maximum
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) frequency) yields a frequency invariant task utilization of 25% in both cases.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
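As a quick check of the frequency invariance formula, the sketch below applies it to the two traces from this section. It assumes that F is the CPU's maximum frequency; the value 2000 (MHz) is a hypothetical stand-in, and the helper is illustrative rather than the kernel's implementation.

```python
from fractions import Fraction


def task_util_freq_inv(duty_cycle, curr_freq, max_freq):
    """Scale a duty cycle by the ratio of current to maximum frequency."""
    return Fraction(duty_cycle) * Fraction(curr_freq, max_freq)


MAX_FREQ = 2000  # hypothetical value for F, in MHz

# Running at F: the task is busy 25% of the time.
util_at_f = task_util_freq_inv(Fraction(25, 100), MAX_FREQ, MAX_FREQ)

# Running at F/2: the same work takes twice as long, so 50% duty cycle.
util_at_half_f = task_util_freq_inv(Fraction(50, 100), MAX_FREQ // 2, MAX_FREQ)

print(util_at_f, util_at_half_f)  # 1/4 1/4
```

Both runs collapse to the same value, which is the point of the scaling: the signal describes the task, not the OPP it happened to run at.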
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) 2.3 CPU invariance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) ------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) CPU capacity has a similar effect on task utilization in that running an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) identical workload on CPUs of different capacity values will yield different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) duty cycles.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) Consider the system described in 1.3.2, i.e.:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) - capacity(CPU0) = C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) - capacity(CPU1) = C/3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) Executing a given periodic workload on each CPU at their maximum frequency would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) result in::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)  CPU0 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196)            |     ____                ____                ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197)            |    |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)  CPU1 work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201)            |     ______________      ______________      ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202)            |    |              |    |              |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) IOW,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) - duty_cycle(p) == 25% if p runs on CPU0 at its maximum frequency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) - duty_cycle(p) == 75% if p runs on CPU1 at its maximum frequency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) The task utilization signal can be made CPU invariant using the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) formula::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213)   task_util_cpu_inv(p) = duty_cycle(p) * (capacity(cpu) / max_capacity)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) with ``max_capacity`` being the highest CPU capacity value in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) system. Applying this formula to the example above yields a CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) invariant task utilization of 25%.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) 2.4 Invariant task utilization
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) ------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) Both frequency and CPU invariance need to be applied to task utilization in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) order to obtain a truly invariant signal. The pseudo-formula for a task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) utilization that is both CPU and frequency invariant is thus, for a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) task p::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)                                      curr_frequency(cpu)   capacity(cpu)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228)   task_util_inv(p) = duty_cycle(p) * ------------------- * -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)                                      max_frequency(cpu)    max_capacity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) In other words, invariant task utilization describes the behaviour of a task as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) if it were running on the highest-capacity CPU in the system, running at its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) maximum frequency.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) 
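The combined pseudo-formula can be sketched as a single function. This is an illustration under stated assumptions, not the PELT implementation: the capacity values 900 and 300 are hypothetical numbers chosen so that C/3 is an integer, and the frequency of 1000 is arbitrary since both CPUs run at their own maximum.

```python
from fractions import Fraction


def task_util_inv(duty_cycle, curr_freq, max_freq, capacity, max_capacity):
    """Apply both frequency and CPU invariance to a raw duty cycle."""
    freq_ratio = Fraction(curr_freq, max_freq)
    capacity_ratio = Fraction(capacity, max_capacity)
    return Fraction(duty_cycle) * freq_ratio * capacity_ratio


MAX_CAPACITY = 900  # hypothetical value for C
LITTLE_CAPACITY = 300  # C/3, as in 1.3.2

# The task from 2.3, observed on each CPU at that CPU's maximum frequency.
util_on_cpu0 = task_util_inv(Fraction(25, 100), 1000, 1000, MAX_CAPACITY, MAX_CAPACITY)
util_on_cpu1 = task_util_inv(Fraction(75, 100), 1000, 1000, LITTLE_CAPACITY, MAX_CAPACITY)

print(util_on_cpu0, util_on_cpu1)  # 1/4 1/4
```

Whichever CPU the task is observed on, the result is the utilization it would show on the highest-capacity CPU at its maximum frequency.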
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) Any mention of task utilization in the following sections will imply its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) invariant form.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) 2.5 Utilization estimation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) Without a crystal ball, task behaviour (and thus task utilization) cannot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) accurately be predicted the moment a task first becomes runnable. The CFS class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) maintains a handful of CPU and task signals based on the Per-Entity Load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) Tracking (PELT) mechanism, one of those yielding an *average* utilization (as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) opposed to instantaneous).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) This means that while the capacity aware scheduling criteria will be written
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) considering a "true" task utilization (using a crystal ball), the implementation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) will only ever be able to use an estimator thereof.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) 3. Capacity aware scheduling requirements
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) =========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 3.1 CPU capacity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) Linux cannot currently figure out CPU capacity on its own; this information thus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) needs to be handed to it. Architectures must define arch_scale_cpu_capacity()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) for that purpose.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) The arm and arm64 architectures directly map this to the arch_topology driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) CPU scaling data, which is derived from the capacity-dmips-mhz CPU binding; see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) Documentation/devicetree/bindings/arm/cpu-capacity.txt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) 3.2 Frequency invariance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) As stated in 2.2, capacity-aware scheduling requires a frequency-invariant task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) utilization. Architectures must define arch_scale_freq_capacity(cpu) for that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) purpose.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) Implementing this function requires figuring out which frequency each CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) has been running at. One way to implement this is to leverage hardware counters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) whose increment rate scales with a CPU's current frequency (APERF/MPERF on x86,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) AMU on arm64). Another is to directly hook into cpufreq frequency transitions,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) when the kernel is aware of the switched-to frequency (also employed by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) arm/arm64).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) 4. Scheduler topology
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) During the construction of the sched domains, the scheduler will figure out
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) whether the system exhibits asymmetric CPU capacities. Should that be the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) case:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) - The sched_asym_cpucapacity static key will be enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) - The SD_ASYM_CPUCAPACITY flag will be set at the lowest sched_domain level that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)   spans all unique CPU capacity values.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) The sched_asym_cpucapacity static key is intended to guard sections of code that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) cater to asymmetric CPU capacity systems. Do note however that said key is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) *system-wide*. Imagine the following setup using cpusets::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)   capacity    C/2          C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)             ________    ________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296)            /        \  /        \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)   CPUs     0  1  2  3  4  5  6  7
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298)            \__/  \______________/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299)   cpusets   cs0         cs1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) Which could be created via:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) .. code-block:: sh
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305)   mkdir /sys/fs/cgroup/cpuset/cs0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)   echo 0-1 > /sys/fs/cgroup/cpuset/cs0/cpuset.cpus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)   echo 0 > /sys/fs/cgroup/cpuset/cs0/cpuset.mems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309)   mkdir /sys/fs/cgroup/cpuset/cs1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)   echo 2-7 > /sys/fs/cgroup/cpuset/cs1/cpuset.cpus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311)   echo 0 > /sys/fs/cgroup/cpuset/cs1/cpuset.mems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)   echo 0 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) Since there *is* CPU capacity asymmetry in the system, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) sched_asym_cpucapacity static key will be enabled. However, the sched_domain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) hierarchy of CPUs 0-1 spans a single capacity value: SD_ASYM_CPUCAPACITY isn't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) set in that hierarchy; it describes an SMP island and should be treated as such.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) Therefore, the 'canonical' pattern for protecting codepaths that cater to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) asymmetric CPU capacities is to:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) - Check the sched_asym_cpucapacity static key
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) - If it is enabled, then also check for the presence of SD_ASYM_CPUCAPACITY in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)   the sched_domain hierarchy (if relevant, i.e. the codepath targets a specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)   CPU or group thereof)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) 
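The two-step check can be modelled outside the kernel to see why both steps matter. The sketch below is a simplified model of the cpuset example, not scheduler code: the capacity numbers (512 for C/2, 1024 for C) and all function names are assumptions, and "more than one distinct capacity value in the span" is used as a stand-in for the static key and for the SD_ASYM_CPUCAPACITY flag.

```python
# Hypothetical capacities for the 8-CPU example: CPUs 0-3 at C/2, CPUs 4-7 at C.
CAPACITY = {0: 512, 1: 512, 2: 512, 3: 512, 4: 1024, 5: 1024, 6: 1024, 7: 1024}
CPUSETS = {"cs0": [0, 1], "cs1": [2, 3, 4, 5, 6, 7]}


def has_asymmetry(capacities):
    """True when more than one distinct capacity value is present."""
    return len(set(capacities)) > 1


def use_asym_codepath(capacity, cpus):
    """Check the system-wide condition first, then the per-hierarchy one."""
    # Stand-in for the system-wide sched_asym_cpucapacity static key.
    static_key_enabled = has_asymmetry(capacity.values())
    if not static_key_enabled:
        return False
    # Stand-in for SD_ASYM_CPUCAPACITY being set in this hierarchy.
    return has_asymmetry(capacity[cpu] for cpu in cpus)


print(use_asym_codepath(CAPACITY, CPUSETS["cs0"]))  # False
print(use_asym_codepath(CAPACITY, CPUSETS["cs1"]))  # True
```

Checking only the system-wide condition would treat cs0 as asymmetric even though its CPUs all share one capacity value.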
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) 5. Capacity aware scheduling implementation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) ===========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) 5.1 CFS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) 5.1.1 Capacity fitness
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) ~~~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) The main capacity scheduling criterion of CFS is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339)   task_util(p) < capacity(task_cpu(p))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) This is commonly called the capacity fitness criterion, i.e. CFS must ensure a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) task "fits" on its CPU. If it is violated, the task will need to achieve more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) work than what its CPU can provide: it will be CPU-bound.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) Furthermore, uclamp lets userspace specify a minimum and a maximum utilization
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) value for a task, either via sched_setattr() or via the cgroup interface (see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) Documentation/admin-guide/cgroup-v2.rst). As its name implies, this can be used to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) clamp task_util() in the previous criterion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) 5.1.2 Wakeup CPU selection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) ~~~~~~~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) CFS task wakeup CPU selection follows the capacity fitness criterion described
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) above. On top of that, uclamp is used to clamp the task utilization values,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) which lets userspace have more leverage over the CPU selection of CFS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) tasks. IOW, CFS wakeup CPU selection searches for a CPU that satisfies::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358)   clamp(task_util(p), task_uclamp_min(p), task_uclamp_max(p)) < capacity(cpu)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) By using uclamp, userspace can e.g. allow a busy loop (100% utilization) to run
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) on any CPU by giving it a low uclamp.max value. Conversely, it can force a small
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) periodic task (e.g. 10% utilization) to run on the highest-performance CPUs by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) giving it a high uclamp.min value.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) 
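The effect of uclamp on the fitness criterion can be sketched as follows. This is an illustration only: it assumes utilization and capacity share a 0..1024 scale, uses 341 as an approximation of C/3, and ignores EAS and any headroom margin the kernel may apply. The helper names are hypothetical.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


def fitting_cpus(task_util, uclamp_min, uclamp_max, capacities):
    """Return the CPUs whose capacity exceeds the clamped task utilization."""
    effective_util = clamp(task_util, uclamp_min, uclamp_max)
    return [cpu for cpu, capacity in capacities.items() if effective_util < capacity]


# Hypothetical two-CPU system on an assumed 0..1024 scale.
CAPACITIES = {"CPU0": 1024, "CPU1": 341}

# A busy loop with no clamping does not fit anywhere.
unclamped_busy_loop = fitting_cpus(1024, 0, 1024, CAPACITIES)

# The same busy loop with a low uclamp.max is allowed on any CPU.
capped_busy_loop = fitting_cpus(1024, 0, 200, CAPACITIES)

# A ~10% task with a high uclamp.min only fits the big CPU.
boosted_small_task = fitting_cpus(102, 512, 1024, CAPACITIES)

print(unclamped_busy_loop, capped_busy_loop, boosted_small_task)
# [] ['CPU0', 'CPU1'] ['CPU0']
```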
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) .. note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367)   Wakeup CPU selection in CFS can be eclipsed by Energy Aware Scheduling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368)   (EAS), which is described in Documentation/scheduler/sched-energy.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) 5.1.3 Load balancing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) ~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) A pathological case in the wakeup CPU selection occurs when a task rarely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) sleeps, if at all - it thus rarely wakes up, if at all. Consider::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)   w == wakeup event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)   capacity(CPU0) = C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379)   capacity(CPU1) = C / 3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)                            workload on CPU0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382)   CPU work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)            |     _________           _________           ____
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384)            |    |         |         |         |         |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)                 w                   w                   w
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388)                            workload on CPU1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389)   CPU work ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)            |     ____________________________________________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391)            |    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392)            +----+----+----+----+----+----+----+----+----+----+-> time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393)                 w
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) This workload should run on CPU0, but if the task either:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) - was improperly scheduled from the start (inaccurate initial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)   utilization estimation)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) - was properly scheduled from the start, but suddenly needs more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400)   processing power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) then it might become CPU-bound, IOW ``task_util(p) > capacity(task_cpu(p))``;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) the CPU capacity scheduling criterion is violated, and there may not be any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) further wakeup events to fix this up via wakeup CPU selection.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) Tasks that are in this situation are dubbed "misfit" tasks, and the mechanism
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) put in place to handle this shares the same name. Misfit task migration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) leverages the CFS load balancer, more specifically the active load balance part
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) (which caters to migrating currently running tasks). When load balance happens,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) a misfit active load balance will be triggered if a misfit task can be migrated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) to a CPU with more capacity than its current one.
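
The misfit condition and the search for a bigger CPU can be sketched as
follows. This is only an illustration of the criterion stated above: the
function names are hypothetical, the capacity array is an assumed input, and
picking the first bigger CPU is an arbitrary choice that does not reflect how
the load balancer actually ranks destinations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* task_util(p) > capacity(task_cpu(p)): the task no longer fits its CPU. */
static bool is_misfit(unsigned int task_util, unsigned int cur_capacity)
{
	return task_util > cur_capacity;
}

/*
 * Return the index of a CPU with strictly more capacity than the task's
 * current CPU, or -1 if the task is not a misfit or no such CPU exists.
 * Picking the first bigger CPU is an arbitrary choice made for this sketch.
 */
static int misfit_migration_target(const unsigned int *capacity, size_t nr_cpus,
				   size_t cur_cpu, unsigned int task_util)
{
	assert(cur_cpu < nr_cpus);

	if (!is_misfit(task_util, capacity[cur_cpu]))
		return -1;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (capacity[cpu] > capacity[cur_cpu])
			return (int)cpu;
	}
	return -1;
}
```

Taking C = 1024 for CPU0 and C / 3 = 341 for CPU1, a task of utilization 600
running on CPU1 is a misfit and CPU0 is a valid destination, while a task of
utilization 200 on CPU1 is left alone.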
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) 5.2 RT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) ------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) 5.2.1 Wakeup CPU selection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) ~~~~~~~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) RT task wakeup CPU selection searches for a CPU that satisfies::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421)   task_uclamp_min(p) <= capacity(cpu)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) while still following the usual priority constraints. If none of the candidate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) CPUs can satisfy this capacity criterion, then strict priority based scheduling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) is followed and CPU capacities are ignored.
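
A minimal sketch of this selection with its fallback is given below. It
assumes the caller supplies the CPUs that already satisfy the priority
constraints, ordered as plain priority-based selection would prefer them.
``rt_select_cpu()`` and its parameters are hypothetical names introduced for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/*
 * candidates[] lists CPUs that already satisfy the usual priority
 * constraints, in the order plain priority-based selection would prefer
 * them (an assumption of this sketch).
 */
static int rt_select_cpu(const int *candidates, size_t nr_candidates,
			 const unsigned int *capacity, unsigned int uclamp_min)
{
	assert(nr_candidates > 0);

	for (size_t i = 0; i < nr_candidates; i++) {
		int cpu = candidates[i];

		/* The candidate's capacity must cover task_uclamp_min(p). */
		if (uclamp_min <= capacity[cpu])
			return cpu;
	}

	/* No candidate fits: ignore capacities, keep the priority-based pick. */
	return candidates[0];
}
```

For example, with capacities {341, 341, 1024} and a uclamp.min of 512, the
sketch returns CPU2 when all three CPUs are candidates, and falls back to the
first candidate when only the two small CPUs are eligible.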
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) 5.3 DL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) ------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) 5.3.1 Wakeup CPU selection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) ~~~~~~~~~~~~~~~~~~~~~~~~~~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433) DL task wakeup CPU selection searches for a CPU that satisfies::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)   task_bandwidth(p) < capacity(cpu)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) while still respecting the usual bandwidth and deadline constraints. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) none of the candidate CPUs can satisfy this capacity criterion, then the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) task will remain on its current CPU.
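
The same idea can be sketched for DL tasks. The sketch assumes that the task
bandwidth is its runtime divided by its period, scaled to the 0..1024 capacity
range; the kernel's exact computation may differ. ``dl_bandwidth()`` and
``dl_select_cpu()`` are hypothetical names, and the candidate list is assumed
to contain only CPUs that already meet the bandwidth and deadline constraints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAPACITY_SCALE 1024u

/* Assumed definition: runtime / period on the 0..1024 capacity scale. */
static unsigned int dl_bandwidth(uint64_t runtime_ns, uint64_t period_ns)
{
	assert(period_ns > 0 && runtime_ns <= period_ns);
	return (unsigned int)(runtime_ns * CAPACITY_SCALE / period_ns);
}

/*
 * Return the first candidate CPU whose capacity exceeds the task bandwidth,
 * or cur_cpu if no candidate satisfies the capacity criterion.
 */
static int dl_select_cpu(const int *candidates, size_t nr_candidates,
			 const unsigned int *capacity, int cur_cpu,
			 unsigned int bandwidth)
{
	for (size_t i = 0; i < nr_candidates; i++) {
		int cpu = candidates[i];

		/* The task bandwidth must be below the candidate's capacity. */
		if (bandwidth < capacity[cpu])
			return cpu;
	}

	/* No candidate fits: the task remains on its current CPU. */
	return cur_cpu;
}
```

A task with a 5 ms runtime every 10 ms has a bandwidth of 512 on this scale. It
is therefore placed on a CPU of capacity 1024 rather than one of capacity 341,
and stays where it is if only the smaller CPU is a candidate.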