^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) CPU Idle Cooling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Situation:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) Under certain circumstances an SoC can reach a critical temperature
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) limit and be unable to stabilize around its control temperature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) When the SoC has to stabilize the temperature, the kernel can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) act on a cooling device to reduce the dissipated power. When the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) critical temperature is reached, a decision must be taken to reduce
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) the temperature, which in turn impacts performance.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) Another situation is when the silicon temperature continues to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) increase even after the dynamic leakage is reduced to its minimum by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) clock gating the component. This runaway phenomenon can continue due
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) to the static leakage. The only solution is to power down the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) component, thus dropping both the dynamic and the static leakage,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) which allows the component to cool down.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) Last but not least, the system can ask for a specific power budget but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) because of the OPP density, we can only choose an OPP with a power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) budget lower than the requested one and under-utilize the CPU, thus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) losing performance. In other words, one OPP under-utilizes the CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) with a power less than the requested power budget and the next OPP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) exceeds the power budget. An intermediate OPP could have been used if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) it were present.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) Solutions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) If we can remove the static and the dynamic leakage for a specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) duration in a controlled period, the SoC temperature will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) decrease. Acting on the idle state duration or the idle cycle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) injection period, we can mitigate the temperature by modulating the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) power budget.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) The Operating Performance Point (OPP) density has a great influence on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) the control precision of cpufreq. However, OPP density varies widely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) between vendors, and some have large power gaps between OPPs,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) which results in a loss of performance during thermal control and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) a loss of power in other scenarios.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) At a specific OPP, we can assume that injecting idle cycles on all CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) belonging to the same cluster, with a duration greater than the cluster
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) idle state target residency, drops the static and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) dynamic leakage for this period (modulo the energy needed to enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) this state). So the sustainable power with idle cycles has a linear
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) relation with the OPP’s sustainable power and can be computed with a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) coefficient similar to::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)     Power(IdleCycle) = Coef x Power(OPP)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) Idle Injection:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) The base concept of the idle injection is to force the CPU to go to an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) idle state for a specified time each control cycle. It provides
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) another way to control CPU power and heat in addition to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) cpufreq. Ideally, if all CPUs belonging to the same cluster inject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) their idle cycles synchronously, the cluster can reach its power down
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) state with a minimum power consumption and reduce the static leakage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) to almost zero. However, this idle cycle injection will add extra
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) latencies as the CPUs will have to wake up from a deep sleep state.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) We use a fixed duration of idle injection that gives an acceptable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) performance penalty and a fixed latency. Mitigation can be increased
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) or decreased by modulating the duty cycle of the idle injection.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)     ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)     |-------                         -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)     |_______|_______________________|_______|___________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)     <------>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)       idle  <---------------------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)                     running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)      <----------------------------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)              duty cycle 25%
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) The implementation of the cooling device bases the number of states on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) the duty cycle percentage. When no mitigation is happening the cooling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) device state is zero, meaning the duty cycle is 0%.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) When the mitigation begins, depending on the governor's policy, a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) starting state is selected. With a fixed idle duration and the duty
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) cycle (aka the cooling device state), the running duration can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) computed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) The governor will change the cooling device state, and thus the duty
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) cycle, and this variation will modulate the cooling effect.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)     ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)     |-------                 -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)     |_______|_______________|_______|___________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)     <------>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)       idle  <-------------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)                 running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)      <--------------------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)          duty cycle 33%
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)     ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)     |-------         -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)     |_______|_______|_______|___________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123)     <------>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)       idle  <------>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)             running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)      <------------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)      duty cycle 50%
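
As a rough illustration of how the running duration follows from a fixed
idle duration and the duty cycle, the sketch below assumes the duty cycle
is the idle share of one period, idle / (idle + running), expressed in
percent. The function name and the microsecond unit are hypothetical
choices made for this example and are not taken from the kernel sources.

```python
def running_duration_us(idle_duration_us: int, duty_cycle_pct: int) -> int:
    """Return the running time that yields the requested idle duty cycle.

    Assumes duty_cycle = idle / (idle + running), which gives
    running = idle * (100 - duty_cycle) / duty_cycle.
    """
    if not 0 <= duty_cycle_pct <= 100:
        raise ValueError("duty cycle must be within 0..100 percent")
    if duty_cycle_pct == 0:
        # State zero means no mitigation: no idle cycle is injected,
        # so there is no finite running duration to compute.
        raise ValueError("no idle injection at a 0% duty cycle")
    # Integer arithmetic, rounding down to whole microseconds.
    return idle_duration_us * (100 - duty_cycle_pct) // duty_cycle_pct


if __name__ == "__main__":
    for duty in (25, 50):
        print(duty, running_duration_us(10_000, duty))
```

With a 10 ms idle duration, the 25% and 50% duty cycles drawn above
correspond to 30 ms and 10 ms of running time respectively.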
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) The idle injection duration value must comply with the constraints:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) - It is less than or equal to the latency we tolerate when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)   mitigation begins. It is platform dependent and will depend on the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)   user experience, that is, the reactivity vs performance trade-off we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)   want. This value should be specified.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) - It is greater than the target residency of the idle state we want to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)   enter for thermal mitigation, otherwise we end up consuming more energy.
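
The two constraints above can be expressed as a simple range check. This
is only a sketch: the function and parameter names are hypothetical, and
the tolerated latency is assumed to be a value supplied by the platform
integrator.

```python
def idle_duration_is_valid(idle_duration_us: int,
                           target_residency_us: int,
                           tolerated_latency_us: int) -> bool:
    """Check the two constraints on the idle injection duration."""
    # Long enough to exceed the target residency of the idle state,
    # otherwise entering the state costs more energy than it saves.
    long_enough = idle_duration_us > target_residency_us
    # Short enough to stay within the latency we are willing to tolerate.
    short_enough = idle_duration_us <= tolerated_latency_us
    return long_enough and short_enough
```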
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) Power considerations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) When we reach the thermal trip point, we have to sustain a specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) power for a specific temperature but at this time we consume::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)     Power = Capacitance x Voltage^2 x Frequency x Utilisation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) ... which is more than the sustainable power (or there is something
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) fixed values, and ‘Voltage’ and ‘Frequency’ are fixed artificially
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) because we don’t want to change the OPP. We can group the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) ‘Capacitance’ and the ‘Utilisation’ into a single term which is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) ‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)     Pdyn = Cdyn x Voltage^2 x Frequency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) The power allocator governor will ask us to reduce our power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) in order to target the sustainable power defined in the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) tree. So with the idle injection mechanism, we want an average power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) (Ptarget) that results from running at full power on a specific OPP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) for some amount of time and idling for another. That can be put in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) an equation::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)     P(opp)target = ((Trunning x P(opp)running) + (Tidle x P(opp)idle)) /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)                    (Trunning + Tidle)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)     ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)     Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) At this point if we know the running period for the CPU, that gives us
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) the idle injection we need. Alternatively if we have the idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) injection duration, we can compute the running duration with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)     Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)
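
The Tidle and Trunning equations above can be checked numerically. The
sketch below assumes, as the simplified equations do, that the power
drawn while idle is negligible (P(opp)idle taken as zero). The function
names are hypothetical, and any consistent units for time and power can
be used.

```python
def _check_powers(p_running: float, p_target: float) -> None:
    # The equations only make sense for a power reduction.
    if p_target <= 0 or p_running <= p_target:
        raise ValueError("need 0 < p_target < p_running")


def idle_duration(t_running: float, p_running: float, p_target: float) -> float:
    """Tidle = Trunning x ((P(opp)running / P(opp)target) - 1)."""
    _check_powers(p_running, p_target)
    return t_running * (p_running / p_target - 1)


def running_duration(t_idle: float, p_running: float, p_target: float) -> float:
    """Trunning = Tidle / ((P(opp)running / P(opp)target) - 1)."""
    _check_powers(p_running, p_target)
    return t_idle / (p_running / p_target - 1)


def average_power(t_running: float, t_idle: float, p_running: float) -> float:
    """Average power over one period, with the idle power taken as zero."""
    return t_running * p_running / (t_running + t_idle)
```

For example, with a running power of 2000 mW, a target of 1500 mW and a
10 ms idle duration, ``running_duration(10, 2000, 1500)`` gives about
30 ms, which is a 25% duty cycle, and ``average_power(30, 10, 2000)``
gives back the 1500 mW target.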
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) In practice, if the running power is less than the targeted power, we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) end up with a negative time value, so the equation only applies to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) a power reduction: an OPP whose running power is greater than the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) targeted power must be selected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) However, in this demonstration we ignore three aspects:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) * The static leakage is not defined here. We can introduce it in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)   equation, but we assume it will be zero most of the time as it is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186)   difficult to get the values from the SoC vendors.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) * The idle state wake up latency (or entry + exit latency) is not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189)   taken into account. It must be added in the equation in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190)   rigorously compute the idle injection.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) * The injected idle duration must be greater than the idle state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)   target residency, otherwise we end up consuming more energy and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)   potentially invert the mitigation effect.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) So the final equation is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198)     Trunning = (Tidle - Twakeup) /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)                (((P(opp)dyn + P(opp)static) - P(opp)target) / P(opp)target)