Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) The KVM halt polling system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) The KVM halt polling system provides a feature within KVM whereby the latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) of a guest can, under some circumstances, be reduced by polling in the host
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) for some time period after the guest has elected to no longer run by ceding.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) That is, when a guest vcpu has ceded, or in the case of powerpc when all of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) vcpus of a single vcore have ceded, the host kernel polls for wakeup conditions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) before giving up the cpu to the scheduler in order to let something else run.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) Polling provides a latency advantage in cases where the guest can be run again
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) very quickly by at least saving us a trip through the scheduler, normally on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) the order of a few microseconds, although performance benefits are workload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) dependent. In the event that no wakeup source arrives during the polling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) interval or some other task on the runqueue is runnable, the scheduler is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) invoked. Thus halt polling is especially useful on workloads with very short
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) wakeup periods where the time spent halt polling is minimised and the time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) savings of not invoking the scheduler are distinguishable.
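The behaviour described above can be illustrated with a small user-space
sketch. It is not the kernel implementation: struct sim_host, its fields and
halt_poll() are hypothetical names, and the clock is simulated so the loop
can be exercised without a hypervisor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated host state; every field is hypothetical, for illustration only. */
struct sim_host {
	uint64_t now_ns;        /* simulated clock */
	uint64_t tick_ns;       /* time consumed by one poll iteration, must be > 0 */
	uint64_t wakeup_at_ns;  /* time at which a wakeup source arrives */
	uint64_t contend_at_ns; /* time at which another task becomes runnable */
};

enum poll_result { POLL_WOKEN, POLL_SCHEDULE };

/*
 * Poll for at most poll_ns. Return POLL_WOKEN if a wakeup arrives while
 * polling, POLL_SCHEDULE if the interval expires or another task needs
 * the cpu.
 */
static enum poll_result halt_poll(struct sim_host *h, uint64_t poll_ns)
{
	uint64_t stop = h->now_ns + poll_ns;

	while (h->now_ns < stop) {
		if (h->now_ns >= h->wakeup_at_ns)
			return POLL_WOKEN;   /* guest can run again at once */
		if (h->now_ns >= h->contend_at_ns)
			break;               /* let the other task run */
		h->now_ns += h->tick_ns;     /* busy-wait one iteration */
	}
	return POLL_SCHEDULE;                /* give the cpu to the scheduler */
}
```

With a polling interval of 0 the loop body never runs and the scheduler path
is taken straight away, which corresponds to halt polling being disabled.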
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) The generic halt polling code is implemented in:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 	virt/kvm/kvm_main.c: kvm_vcpu_block()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) The powerpc kvm-hv specific case is implemented in:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 	arch/powerpc/kvm/book3s_hv.c: kvmppc_vcore_blocked()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) Halt Polling Interval
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) The maximum time for which to poll before invoking the scheduler, referred to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) as the halt polling interval, is increased and decreased based on the perceived
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) effectiveness of the polling in an attempt to limit pointless polling.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) This value is stored in either the vcpu struct:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 	kvm_vcpu->halt_poll_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) or in the case of powerpc kvm-hv, in the vcore struct:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 	kvmppc_vcore->halt_poll_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) Thus this is a per vcpu (or vcore) value.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) During polling, if a wakeup source is received within the halt polling interval,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) the interval is left unchanged. If a wakeup source isn't received during the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) polling interval (and thus schedule is invoked), there are two possibilities:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) either the polling interval and total block time[0] were both less than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) the global max polling interval (see module params below), or the total block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) time was greater than the global max polling interval.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) In the event that both the polling interval and total block time were less than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) the global max polling interval then the polling interval can be increased in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) the hope that next time during the longer polling interval the wake up source
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) will be received while the host is polling and the latency benefits will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) received. The polling interval is grown in the function grow_halt_poll_ns(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) which multiplies it by the module parameter halt_poll_ns_grow, or sets it to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) halt_poll_ns_grow_start when growing from zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) In the event that the total block time was greater than the global max polling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) interval, the host will never poll for long enough (limited by the global
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) max) to receive the wakeup during the polling interval, so the interval may as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) well be shrunk in order to avoid pointless polling. The polling interval is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) shrunk in the function shrink_halt_poll_ns() and is divided by the module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) parameter halt_poll_ns_shrink, or set to 0 iff halt_poll_ns_shrink == 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) It is worth noting that this adjustment process attempts to home in on some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) steady state polling interval but will only really do a good job for wakeups
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) which come at an approximately constant rate, otherwise there will be constant
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) adjustment of the polling interval.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) [0] total block time:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) 		      the time between when the halt polling function is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 		      invoked and a wakeup source received (irrespective of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) 		      whether the scheduler is invoked within that function).
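The adjustment rules in this section can be sketched in user-space C as
follows. This is a model of the description above rather than a copy of the
kernel code: struct halt_poll_params and the three function names are
hypothetical, and treating a zero halt_poll_ns_grow as "never grow" is an
assumption of the sketch.

```c
#include <stdint.h>

/* Tunables named after the module parameters; the struct is hypothetical. */
struct halt_poll_params {
	uint32_t halt_poll_ns;            /* global max polling interval */
	uint32_t halt_poll_ns_grow;       /* growth multiplier */
	uint32_t halt_poll_ns_grow_start; /* first value when growing from zero */
	uint32_t halt_poll_ns_shrink;     /* shrink divisor, 0 resets to 0 */
};

static uint32_t grow_interval(uint32_t cur, const struct halt_poll_params *p)
{
	uint64_t val;

	if (p->halt_poll_ns_grow == 0)
		return cur;                        /* assumption: growth disabled */
	val = (uint64_t)cur * p->halt_poll_ns_grow;
	if (val < p->halt_poll_ns_grow_start)
		val = p->halt_poll_ns_grow_start;  /* growing from zero */
	if (val > p->halt_poll_ns)
		val = p->halt_poll_ns;             /* never exceed the global max */
	return (uint32_t)val;
}

static uint32_t shrink_interval(uint32_t cur, const struct halt_poll_params *p)
{
	if (p->halt_poll_ns_shrink == 0)
		return 0;
	return cur / p->halt_poll_ns_shrink;
}

/* Return the interval to use next time, given the total block time. */
static uint32_t adjust_interval(uint32_t cur, uint64_t block_ns,
				const struct halt_poll_params *p)
{
	if (block_ns <= cur)
		return cur;                     /* woken while polling: unchanged */
	if (block_ns > p->halt_poll_ns)
		return shrink_interval(cur, p); /* polling could never have helped */
	if (cur < p->halt_poll_ns && block_ns < p->halt_poll_ns)
		return grow_interval(cur, p);   /* a longer poll might have helped */
	return cur;
}
```

For example, with halt_poll_ns_grow = 2, halt_poll_ns_grow_start = 10000 and
an illustrative global max of 200000, a vcpu starting at 0 that keeps
blocking for 150000 ns moves through 10000, 20000, 40000, 80000 and 160000,
after which the wakeup arrives during the poll and the interval stops
changing.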
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) Module Parameters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) The kvm module has 4 tuneable module parameters to adjust the global max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) polling interval as well as the rate at which the polling interval is grown and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) shrunk. These variables are defined in include/linux/kvm_host.h and as module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) powerpc kvm-hv case.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) |Module Parameter	|   Description		    |	     Default Value    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) |halt_poll_ns		| The global max polling    | KVM_HALT_POLL_NS_DEFAULT|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) |			| interval which defines    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) |			| the ceiling value of the  |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) |			| polling interval for      | (per arch value)	      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) |			| each vcpu.		    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) |halt_poll_ns_grow	| The value by which the    | 2			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) |			| halt polling interval is  |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) |			| multiplied in the	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) |			| grow_halt_poll_ns()	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) |			| function.		    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) |halt_poll_ns_grow_start| The initial value to grow | 10000		      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) |			| to from zero in the	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) |			| grow_halt_poll_ns()	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) |			| function.		    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) |halt_poll_ns_shrink	| The value by which the    | 0			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) |			| halt polling interval is  |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) |			| divided in the	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) |			| shrink_halt_poll_ns()	    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) |			| function.		    |			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) +-----------------------+---------------------------+-------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) These module parameters can be set from the sysfs files in:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) 	/sys/module/kvm/parameters/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) Note: these module parameters are system-wide values and cannot be tuned on a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)       per-vm basis.
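As an illustration of accessing these files from a program, the following
sketch reads and writes a single unsigned integer parameter. The helper names
read_uint_param() and write_uint_param() are hypothetical; the sketch assumes
each parameter file holds one decimal value, and writing normally requires
root privileges.

```c
#include <stdio.h>

/* Read one unsigned integer from a parameter file. Returns 0 on success. */
static int read_uint_param(const char *path, unsigned int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%u", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write one unsigned integer to a parameter file. Returns 0 on success. */
static int write_uint_param(const char *path, unsigned int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%u\n", value) > 0);
	/* fclose() flushes the buffer, so a rejected write is reported here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller could, for instance, pass "/sys/module/kvm/parameters/halt_poll_ns"
as the path to inspect the current global max polling interval before
changing it.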
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) Further Notes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) - Care should be taken when setting the halt_poll_ns module parameter as a large value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)   has the potential to drive the cpu usage to 100% on a machine which would be almost
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)   entirely idle otherwise. This is because even if a guest has wakeups during which very
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)   little work is done and which are quite far apart, if the period is shorter than the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)   global max polling interval (halt_poll_ns) then the host will always poll for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130)   entire block time and thus cpu utilisation will go to 100%.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) - Halt polling essentially presents a trade-off between power usage and latency, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)   the module parameters should be used to tune that balance. Idle cpu time is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)   essentially converted to host kernel time with the aim of decreasing latency when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)   entering the guest.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) - Halt polling will only be conducted by the host when no other tasks are runnable on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)   that cpu; otherwise the polling will cease immediately and schedule will be invoked to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)   allow that other task to run. Thus a guest cannot use halt polling to mount a denial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)   of service attack on the cpu.
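The cpu usage effect described in the first note can be estimated with a
back-of-the-envelope model. The function below is a hypothetical helper, not
part of KVM. It assumes a guest that alternates between work_ns of real work
and idle_ns of blocking, a polling interval that has already settled at
poll_ns, no other runnable task on the cpu, and it ignores scheduler
overhead.

```c
#include <stdint.h>

/*
 * Fraction of host cpu time that is busy (guest work plus polling) for a
 * guest repeating work_ns of work followed by idle_ns of blocking.
 */
static double host_busy_fraction(uint64_t work_ns, uint64_t idle_ns,
				 uint64_t poll_ns)
{
	/* The host polls until the wakeup arrives or the interval expires. */
	uint64_t polled = idle_ns < poll_ns ? idle_ns : poll_ns;
	uint64_t total = work_ns + idle_ns;

	if (total == 0)
		return 0.0;
	return (double)(work_ns + polled) / (double)total;
}
```

Under these assumptions, a guest doing 1000 ns of work every 100000 ns keeps
the cpu about 1% busy with polling disabled (poll_ns = 0), but fully busy once
the polling interval exceeds the 99000 ns idle gap.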