^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) Energy Model of devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) 1. Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) The Energy Model (EM) framework serves as an interface between drivers knowing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) the power consumed by devices at various performance levels, and the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) subsystems willing to use that information to make energy-aware decisions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) The source of the information about the power consumed by devices can vary greatly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) from one platform to another. These power costs can be estimated using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) devicetree data in some cases. In others, the firmware will know better.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) Alternatively, userspace might be best positioned. To spare each client
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) subsystem from re-implementing support for every possible source of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) information on its own, the EM framework acts as an abstraction layer that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) standardizes the format of power cost tables in the kernel, thereby
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) avoiding redundant work.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) The figure below depicts an example of drivers (Arm-specific here, but the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) approach is applicable to any architecture) providing power costs to the EM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) framework, and interested clients reading the data from it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)        +---------------+  +-----------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)        | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)        +---------------+  +-----------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)                |                   | em_cpu_energy()   |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)                |                   | em_cpu_get()      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)                +---------+         |         +---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)                          |         |         |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)                          v         v         v
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)                         +---------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)                         |    Energy Model     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)                         |     Framework       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)                         +---------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)                            ^       ^       ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)                            |       |       | em_dev_register_perf_domain()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41)                 +----------+       |       +---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)                 |                  |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)         +---------------+  +---------------+  +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)         |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)         +---------------+  +---------------+  +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)                 ^                  ^                 ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)                 |                  |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)         +--------------+   +---------------+  +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)         | Device Tree  |   |   Firmware    |  |      ?       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)         +--------------+   +---------------+  +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) In the case of CPU devices, the EM framework manages power cost tables per
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) 'performance domain' in the system. A performance domain is a group of CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) whose performance is scaled together. Performance domains generally have a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) 1-to-1 mapping with CPUFreq policies. All CPUs in a performance domain are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) required to have the same micro-architecture. CPUs in different performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) domains can have different micro-architectures.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) 2. Core APIs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) 2.1 Config options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) ^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) CONFIG_ENERGY_MODEL must be enabled to use the EM framework.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) 2.2 Registration of performance domains
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) Drivers are expected to register performance domains into the EM framework by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) calling the following API::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)   int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)                 struct em_data_callback *cb, cpumask_t *cpus);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Drivers must provide a callback function returning <frequency, power> tuples
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) for each performance state. The callback function provided by the driver is free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) to fetch data from any relevant location (DT, firmware, ...), and by any means
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) deemed necessary. For CPU devices only, drivers must specify the CPUs of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) performance domain using a cpumask. For devices other than CPUs, the last
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) argument must be set to NULL.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) See Section 3 for an example of a driver implementing this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) callback, and kernel/power/energy_model.c for further documentation on this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) API.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
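As a rough illustration of the callback contract described above, the
following userspace sketch models how a framework could build a table of
<frequency, power> tuples by calling a driver callback once per performance
state. Everything in it is hypothetical and simplified: struct perf_state,
build_table(), fake_active_power() and the OPP values are made up for
illustration, the struct device argument of the real callback is omitted, and
it should not be read as the in-kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one row of a power cost table. */
struct perf_state {
	unsigned long khz;
	unsigned long mw;
};

/* Callback type modelled on the driver callback described above. */
typedef int (*active_power_cb)(unsigned long *mw, unsigned long *khz);

/* Made-up OPPs of an imaginary device, sorted by ascending frequency. */
static const struct perf_state fake_opps[] = {
	{  500000, 100 },
	{ 1000000, 250 },
	{ 1500000, 600 },
};

#define NR_FAKE_OPPS (sizeof(fake_opps) / sizeof(fake_opps[0]))

/*
 * Example callback: ceil *khz to the next supported frequency and report
 * the power at that frequency. Returns -1 if no OPP is high enough.
 */
int fake_active_power(unsigned long *mw, unsigned long *khz)
{
	size_t i;

	for (i = 0; i < NR_FAKE_OPPS; i++) {
		if (fake_opps[i].khz >= *khz) {
			*khz = fake_opps[i].khz;
			*mw = fake_opps[i].mw;
			return 0;
		}
	}
	return -1;
}

/*
 * Fill 'table' with nr_states rows. Asking for "previous frequency + 1"
 * makes a ceiling callback return the next higher state each time.
 */
int build_table(struct perf_state *table, size_t nr_states,
		active_power_cb cb)
{
	unsigned long khz = 0, mw = 0;
	size_t i;

	assert(table && cb);

	for (i = 0; i < nr_states; i++, khz++) {
		int ret = cb(&mw, &khz);

		if (ret)
			return ret;
		table[i].khz = khz;
		table[i].mw = mw;
	}
	return 0;
}
```

In this model, calling build_table() with nr_states set to 3 fills the table
with the three fake OPPs in ascending order, while asking for a fourth state
fails because the callback cannot ceil the frequency any further. This is why
the number of states passed at registration time has to match what the
callback can actually report.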
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) 2.3 Accessing performance domains
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) Two API functions provide access to the energy model: em_cpu_get(), which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) takes a CPU id as an argument, and em_pd_get(), which takes a device pointer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) as an argument. Each subsystem chooses whichever interface suits it best,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) but in the case of CPU devices both functions return the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) performance domain.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) Subsystems interested in the energy model of a CPU can retrieve it using the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) em_cpu_get() API. The energy model tables are allocated once upon creation of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) the performance domains, and kept in memory untouched.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) The energy consumed by a performance domain can be estimated using the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) em_cpu_energy() API. For CPU devices, the estimation assumes that the schedutil
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) CPUfreq governor is in use. Currently this calculation is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) not provided for other types of devices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) More details about the above APIs can be found in include/linux/energy_model.h.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)
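To give an idea of what a schedutil-based energy estimate such as the one
behind em_cpu_energy() can look like, here is a simplified userspace model. It
is only a sketch under stated assumptions: the governor is assumed to request
the maximum frequency scaled by utilization plus 25% headroom, the lowest
performance state satisfying that request is selected, and its power is scaled
by the time the CPUs are expected to stay busy. The names struct perf_state
and estimate_energy() are hypothetical, and the exact in-kernel computation
may differ between kernel versions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one row of a power cost table. */
struct perf_state {
	unsigned long khz;
	unsigned long mw;
};

/*
 * Simplified model of the energy estimate for one performance domain.
 *   max_util: highest utilization of any CPU in the domain
 *   sum_util: sum of the utilization of all CPUs in the domain
 *   cpu_cap:  capacity of a CPU at its highest frequency (e.g. 1024)
 * The table must be sorted by ascending frequency. The result is in
 * abstract units that are only meaningful for comparisons.
 */
unsigned long estimate_energy(const struct perf_state *table, size_t nr_states,
			      unsigned long max_util, unsigned long sum_util,
			      unsigned long cpu_cap)
{
	unsigned long max_khz, target_khz, cost;
	size_t i;

	assert(table && nr_states > 0 && cpu_cap > 0);

	max_khz = table[nr_states - 1].khz;

	/* Assumed schedutil-like request: 1.25 * max_khz * max_util / cap. */
	target_khz = (max_khz + (max_khz >> 2)) * max_util / cpu_cap;

	/* Pick the lowest state satisfying the request, else the highest. */
	for (i = 0; i < nr_states - 1; i++) {
		if (table[i].khz >= target_khz)
			break;
	}

	/* Power scaled by how much longer work takes below max frequency. */
	cost = table[i].mw * max_khz / table[i].khz;

	return cost * sum_util / cpu_cap;
}
```

With a made-up table of 500000 KHz at 100 mW, 1000000 KHz at 250 mW and
1500000 KHz at 600 mW, a domain whose busiest CPU has a utilization of 256 out
of 1024 is mapped to the lowest state in this model, whereas a utilization of
1024 selects the highest one. A client can compare such estimates for
different task placements, which is the kind of decision an energy-aware
subsystem makes.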
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 3. Example driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) This section provides a simple example of a CPUFreq driver registering a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) performance domain in the Energy Model framework using the (fake) 'foo'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) protocol. The driver implements an est_power() function to be provided to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) EM framework::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)   -> drivers/cpufreq/foo_cpufreq.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 	static int est_power(unsigned long *mW, unsigned long *KHz,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 			struct device *dev)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 		long freq, power;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) 		/* Use the 'foo' protocol to ceil the frequency */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 		freq = foo_get_freq_ceil(dev, *KHz);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) 		if (freq < 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 			return freq;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 		/* Estimate the power cost for the dev at the relevant freq. */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 		power = foo_estimate_power(dev, freq);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 		if (power < 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) 			return power;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 		/* Return the values to the EM framework */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 		*mW = power;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) 		*KHz = freq;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) 		return 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) 	static int foo_cpufreq_init(struct cpufreq_policy *policy)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) 		struct em_data_callback em_cb = EM_DATA_CB(est_power);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) 		struct device *cpu_dev;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) 		int nr_opp, ret;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 		cpu_dev = get_cpu_device(cpumask_first(policy->cpus));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 		/* Do the actual CPUFreq init work ... */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) 		ret = do_foo_cpufreq_init(policy);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 		if (ret)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 			return ret;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) 		/* Find the number of OPPs for this policy */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) 		nr_opp = foo_get_nr_opp(policy);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) 		/* And register the new performance domain */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) 		em_dev_register_perf_domain(cpu_dev, nr_opp, &em_cb, policy->cpus);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 		return 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) 	}