Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) KVM-specific MSRs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) :Author: Glauber Costa <glommer@redhat.com>, Red Hat Inc, 2010
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) KVM makes use of some custom MSRs to service some requests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) Custom MSRs have a range reserved for them, that goes from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 0x4b564d00 to 0x4b564dff. There are MSRs outside this area,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) but they are deprecated and their use is discouraged.
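
As a quick illustration of the reserved range, the sketch below checks whether
an MSR index falls inside it. The macro and function names are hypothetical
and are not taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds of the MSR index range reserved for KVM, as stated in the text. */
#define KVM_MSR_RANGE_FIRST 0x4b564d00u
#define KVM_MSR_RANGE_LAST  0x4b564dffu

/* Hypothetical helper: true if msr lies inside the reserved KVM range. */
static bool kvm_msr_in_reserved_range(uint32_t msr)
{
	return msr >= KVM_MSR_RANGE_FIRST && msr <= KVM_MSR_RANGE_LAST;
}
```

The deprecated MSRs 0x11 and 0x12 fail this check, which is why their use is
discouraged.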
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) Custom MSR list
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) 
The currently supported custom MSRs are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) MSR_KVM_WALL_CLOCK_NEW:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 	0x4b564d00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) data:
	4-byte aligned physical address of a memory area which must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 	in guest RAM. This memory is expected to hold a copy of the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 	structure::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 	 struct pvclock_wall_clock {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 		u32   version;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 		u32   sec;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 		u32   nsec;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 	  } __attribute__((__packed__));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) 	whose data will be filled in by the hypervisor. The hypervisor is only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 	guaranteed to update this data at the moment of MSR write.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) 	Users that want to reliably query this information more than once have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) 	to write more than once to this MSR. Fields have the following meanings:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 	version:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 		guest has to check version before and after grabbing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) 		time information and check that they are both equal and even.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 		An odd version indicates an in-progress update.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 	sec:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) 		 number of seconds for wallclock at time of boot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) 	nsec:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) 		 number of nanoseconds for wallclock at time of boot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) 	In order to get the current wallclock time, the system_time from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 	MSR_KVM_SYSTEM_TIME_NEW needs to be added.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) 	Note that although MSRs are per-CPU entities, the effect of this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 	particular MSR is global.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 
	Availability of this MSR must be checked via bit 3 in the 0x40000001
	cpuid leaf prior to usage.
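
The version check and the addition of system_time described for
MSR_KVM_WALL_CLOCK_NEW could be sketched as follows. This is a minimal
user-space illustration, not the kernel implementation: ``wall_clock_snapshot``,
``wall_clock_now`` and ``struct wall_time`` are hypothetical names, and the
C11 fences stand in for whatever barriers a real guest would use.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

struct pvclock_wall_clock {
	uint32_t version;
	uint32_t sec;
	uint32_t nsec;
} __attribute__((__packed__));

/* Hypothetical result type: wallclock time split into seconds and ns. */
struct wall_time {
	uint64_t sec;
	uint32_t nsec;
};

/*
 * Take one snapshot of the structure. Returns 0 if the version was equal
 * and even before and after the reads, -1 otherwise (the caller retries).
 */
static int wall_clock_snapshot(const volatile struct pvclock_wall_clock *wc,
			       uint32_t *sec, uint32_t *nsec)
{
	uint32_t before = wc->version;

	atomic_thread_fence(memory_order_acquire);
	*sec = wc->sec;
	*nsec = wc->nsec;
	atomic_thread_fence(memory_order_acquire);

	return (before == wc->version && (before & 1) == 0) ? 0 : -1;
}

/* Current wallclock = wallclock at boot + system_time (in nanoseconds). */
static struct wall_time wall_clock_now(uint32_t boot_sec, uint32_t boot_nsec,
				       uint64_t system_time_ns)
{
	uint64_t nsec = boot_nsec + system_time_ns % NSEC_PER_SEC;
	struct wall_time now = {
		.sec = boot_sec + system_time_ns / NSEC_PER_SEC +
		       nsec / NSEC_PER_SEC,
		.nsec = (uint32_t)(nsec % NSEC_PER_SEC),
	};

	return now;
}
```

The snapshot function makes a single attempt on purpose, so the retry policy
(including a new MSR write, if fresh data is wanted) stays with the caller.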
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) MSR_KVM_SYSTEM_TIME_NEW:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) 	0x4b564d01
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 	4-byte aligned physical address of a memory area which must be in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) 	guest RAM, plus an enable bit in bit 0. This memory is expected to hold
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 	a copy of the following structure::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) 	  struct pvclock_vcpu_time_info {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) 		u32   version;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) 		u32   pad0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) 		u64   tsc_timestamp;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) 		u64   system_time;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) 		u32   tsc_to_system_mul;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) 		s8    tsc_shift;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 		u8    flags;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) 		u8    pad[2];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 	  } __attribute__((__packed__)); /* 32 bytes */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 	whose data will be filled in by the hypervisor periodically. Only one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) 	write, or registration, is needed for each VCPU. The interval between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) 	updates of this structure is arbitrary and implementation-dependent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 	The hypervisor may update this structure at any time it sees fit until
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 	anything with bit0 == 0 is written to it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) 	Fields have the following meanings:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) 	version:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 		guest has to check version before and after grabbing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) 		time information and check that they are both equal and even.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) 		An odd version indicates an in-progress update.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 	tsc_timestamp:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) 		the tsc value at the current VCPU at the time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) 		of the update of this structure. Guests can subtract this value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) 		from current tsc to derive a notion of elapsed time since the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 		structure update.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) 	system_time:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) 		a host notion of monotonic time, including sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 		time at the time this structure was last updated. Unit is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 		nanoseconds.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 	tsc_to_system_mul:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 		multiplier to be used when converting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) 		tsc-related quantity to nanoseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) 	tsc_shift:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 		shift to be used when converting tsc-related
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) 		quantity to nanoseconds. This shift will ensure that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) 		multiplication with tsc_to_system_mul does not overflow.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 		A positive value denotes a left shift, a negative value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) 		a right shift.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) 		The conversion from tsc to nanoseconds involves an additional
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) 		right shift by 32 bits. With this information, guests can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) 		derive per-CPU time by doing::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 
			time = (current_tsc - tsc_timestamp);
			if (tsc_shift >= 0)
				time <<= tsc_shift;
			else
				time >>= -tsc_shift;
			time = (time * tsc_to_system_mul) >> 32;
			time = time + system_time;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) 	flags:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 		bits in this field indicate extended capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) 		coordinated between the guest and the hypervisor. Availability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 		of specific flags has to be checked in 0x40000001 cpuid leaf.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 		Current flags are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 		+-----------+--------------+----------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) 		| flag bit  | cpuid bit    | meaning			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) 		+-----------+--------------+----------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 		|	    |		   | time measures taken across       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 		|    0      |	   24      | multiple cpus are guaranteed to  |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) 		|	    |		   | be monotonic		      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) 		+-----------+--------------+----------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) 		|	    |		   | guest vcpu has been paused by    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 		|    1	    |	  N/A	   | the host			      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) 		|	    |		   | See 4.70 in api.txt	      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) 		+-----------+--------------+----------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) 
	Availability of this MSR must be checked via bit 3 in the 0x40000001
	cpuid leaf prior to usage.
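
The tsc to nanoseconds conversion described for MSR_KVM_SYSTEM_TIME_NEW can be
written as compilable C along these lines. This is a sketch under stated
assumptions: the function names are hypothetical, ``unsigned __int128`` is a
GCC/Clang extension available on 64-bit targets, and the caller is assumed to
have already validated ``version`` as described above.

```c
#include <stdint.h>

struct pvclock_vcpu_time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;
	uint64_t system_time;
	uint32_t tsc_to_system_mul;
	int8_t   tsc_shift;
	uint8_t  flags;
	uint8_t  pad[2];
} __attribute__((__packed__)); /* 32 bytes */

/* Scale a tsc delta to nanoseconds: shift, multiply, then drop 32 bits. */
static uint64_t pvclock_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	if (shift >= 0)
		delta <<= shift;
	else
		delta >>= -shift;

	/* The 64x32 bit product needs up to 96 bits, hence the wide type. */
	return (uint64_t)(((unsigned __int128)delta * mul) >> 32);
}

/* Per-CPU time in nanoseconds for a given tsc reading. */
static uint64_t pvclock_ns(const struct pvclock_vcpu_time_info *ti,
			   uint64_t current_tsc)
{
	return ti->system_time +
	       pvclock_scale_delta(current_tsc - ti->tsc_timestamp,
				   ti->tsc_to_system_mul, ti->tsc_shift);
}
```

With ``tsc_to_system_mul = 0x80000000`` (a factor of 0.5 after the 32-bit
right shift) and ``tsc_shift = 0``, a delta of 2000 tsc ticks yields 1000
nanoseconds.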
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) MSR_KVM_WALL_CLOCK:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 	0x11
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) data and functioning:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 	same as MSR_KVM_WALL_CLOCK_NEW. Use that instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) 	This MSR falls outside the reserved KVM range and may be removed in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) 	future. Its usage is deprecated.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) 
	Availability of this MSR must be checked via bit 0 in the 0x40000001
	cpuid leaf prior to usage.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) MSR_KVM_SYSTEM_TIME:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 	0x12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) data and functioning:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) 	same as MSR_KVM_SYSTEM_TIME_NEW. Use that instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) 	This MSR falls outside the reserved KVM range and may be removed in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) 	future. Its usage is deprecated.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) 
	Availability of this MSR must be checked via bit 0 in the 0x40000001
	cpuid leaf prior to usage.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) 	The suggested algorithm for detecting kvmclock presence is then::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) 		if (!kvm_para_available())    /* refer to cpuid.txt */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) 			return NON_PRESENT;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) 		flags = cpuid_eax(0x40000001);
		if (flags & (1 << 3)) {		/* bit 3: new MSRs */
			msr_kvm_system_time = MSR_KVM_SYSTEM_TIME_NEW;
			msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK_NEW;
			return PRESENT;
		} else if (flags & (1 << 0)) {	/* bit 0: deprecated MSRs */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) 			msr_kvm_system_time = MSR_KVM_SYSTEM_TIME;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) 			msr_kvm_wall_clock = MSR_KVM_WALL_CLOCK;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) 			return PRESENT;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) 		} else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) 			return NON_PRESENT;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) MSR_KVM_ASYNC_PF_EN:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) 	0x4b564d02
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) 	Asynchronous page fault (APF) control MSR.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) 	Bits 63-6 hold 64-byte aligned physical address of a 64 byte memory area
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) 	which must be in guest RAM and must be zeroed. This memory is expected
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) 	to hold a copy of the following structure::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) 	  struct kvm_vcpu_pv_apf_data {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) 		/* Used for 'page not present' events delivered via #PF */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) 		__u32 flags;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) 		/* Used for 'page ready' events delivered via interrupt notification */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) 		__u32 token;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) 		__u8 pad[56];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) 		__u32 enabled;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) 	  };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) 	Bits 5-4 of the MSR are reserved and should be zero. Bit 0 is set to 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) 	when asynchronous page faults are enabled on the vcpu, 0 when disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) 	Bit 1 is 1 if asynchronous page faults can be injected when vcpu is in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) 	cpl == 0. Bit 2 is 1 if asynchronous page faults are delivered to L1 as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) 	#PF vmexits.  Bit 2 can be set only if KVM_FEATURE_ASYNC_PF_VMEXIT is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) 	present in CPUID. Bit 3 enables interrupt based delivery of 'page ready'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) 	events. Bit 3 can only be set if KVM_FEATURE_ASYNC_PF_INT is present in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) 	CPUID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) 
	'Page not present' events are currently always delivered as a synthetic
	#PF exception. During delivery of these events the CR2 register contains
	a token that will be used to notify the guest when the missing page
	becomes available. Also, to make it possible to distinguish between a
	real #PF and an APF, the first 4 bytes of the 64 byte memory location
	('flags') will be written to by the hypervisor at the time of injection.
	Only the first bit of 'flags' is currently supported; when set, it
	indicates that the guest is dealing with an asynchronous 'page not
	present' event. If 'flags' is '0' during a page fault, this is a regular
	page fault. The guest is supposed to clear 'flags' when it is done
	handling the #PF exception so the next event can be delivered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) 	Note, since APF 'page not present' events use the same exception vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) 	as regular page fault, guest must reset 'flags' to '0' before it does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) 	something that can generate normal page fault.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) 
	Bytes 4-7 of the 64 byte memory location ('token') will be written to by
	the hypervisor at the time of APF 'page ready' event injection. The
	content of these bytes is a token which was previously delivered as a
	'page not present' event. The event indicates that the page is now
	available. The guest is supposed to write '0' to 'token' when it is done
	handling the 'page ready' event and to write '1' to MSR_KVM_ASYNC_PF_ACK
	after clearing the location; writing to the MSR forces KVM to re-scan
	its queue and deliver the next pending notification.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) 	Note, MSR_KVM_ASYNC_PF_INT MSR specifying the interrupt vector for 'page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 	ready' APF delivery needs to be written to before enabling APF mechanism
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 	in MSR_KVM_ASYNC_PF_EN or interrupt #0 can get injected. The MSR is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) 	available if KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) 
	Note, previously, 'page ready' events were delivered via the same #PF
	exception as 'page not present' events, but this is now deprecated. If
	bit 3 (interrupt based delivery) is not set, APF events are not
	delivered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) 	If APF is disabled while there are outstanding APFs, they will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 	not be delivered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) 
	Currently, 'page ready' APF events are always delivered on the same
	vcpu as the corresponding 'page not present' event was, but the guest
	should not rely on that.
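
The bit layout of MSR_KVM_ASYNC_PF_EN described above can be illustrated by a
small helper that composes the value to write. This is only a sketch: the
``APF_EN_*`` macros and ``apf_en_msr_value`` are hypothetical names chosen for
this example, and the CPUID checks required for bits 2 and 3 are left to the
caller.

```c
#include <stdint.h>

/* Control bits of MSR_KVM_ASYNC_PF_EN (illustrative names). */
#define APF_EN_ENABLED        (1ull << 0) /* APF enabled on this vcpu */
#define APF_EN_SEND_ALWAYS    (1ull << 1) /* may inject at cpl == 0 */
#define APF_EN_AS_PF_VMEXIT   (1ull << 2) /* deliver to L1 as #PF vmexit */
#define APF_EN_AS_INT         (1ull << 3) /* 'page ready' via interrupt */
#define APF_EN_FLAG_MASK      0xfull      /* bits 3-0 */
#define APF_EN_ADDR_MASK      (~0x3full)  /* bits 63-6 */

/*
 * Compose the MSR value from the physical address of the zeroed 64 byte
 * area and the control bits. Returns 0 on success, -1 if the address is
 * not 64-byte aligned or if bits outside 3-0 are set in flags (which
 * would touch the reserved bits 5-4 or the address field).
 */
static int apf_en_msr_value(uint64_t area_pa, uint64_t flags, uint64_t *value)
{
	if (area_pa & ~APF_EN_ADDR_MASK)
		return -1;
	if (flags & ~APF_EN_FLAG_MASK)
		return -1;

	*value = area_pa | flags;
	return 0;
}
```

A guest using interrupt based delivery would first write the vector to
MSR_KVM_ASYNC_PF_INT and only then write a value such as
``area_pa | APF_EN_ENABLED | APF_EN_AS_INT`` to MSR_KVM_ASYNC_PF_EN.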
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) MSR_KVM_STEAL_TIME:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) 	0x4b564d03
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) data:
	64-byte aligned physical address of a memory area which must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) 	in guest RAM, plus an enable bit in bit 0. This memory is expected to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) 	hold a copy of the following structure::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) 	  struct kvm_steal_time {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) 		__u64 steal;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) 		__u32 version;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) 		__u32 flags;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) 		__u8  preempted;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) 		__u8  u8_pad[3];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 		__u32 pad[11];
	  };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) 	whose data will be filled in by the hypervisor periodically. Only one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) 	write, or registration, is needed for each VCPU. The interval between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) 	updates of this structure is arbitrary and implementation-dependent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) 	The hypervisor may update this structure at any time it sees fit until
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) 	anything with bit0 == 0 is written to it. Guest is required to make sure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) 	this structure is initialized to zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) 	Fields have the following meanings:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) 	version:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) 		a sequence counter. In other words, guest has to check
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 		this field before and after grabbing time information and make
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) 		sure they are both equal and even. An odd version indicates an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) 		in-progress update.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) 	flags:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) 		At this point, always zero. May be used to indicate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) 		changes in this structure in the future.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 	steal:
		the amount of time in which this vCPU did not run, in
		nanoseconds. Time during which the vcpu is idle will not be
		reported as steal time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) 	preempted:
		indicates whether the vCPU that owns this struct is running or
		not. Non-zero values mean the vCPU has been preempted. Zero
		means the vCPU is not preempted. NOTE, it is always zero if
		the hypervisor doesn't support this field.
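
Reading ``steal`` under the sequence counter could look roughly like the
sketch below. It is a user-space illustration rather than the kernel's code:
``steal_time_read`` and its bounded retry count are assumptions made for this
example, and the C11 fences stand in for the guest's own barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

struct kvm_steal_time {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	uint8_t  preempted;
	uint8_t  u8_pad[3];
	uint32_t pad[11];
};

/*
 * Read 'steal' consistently. A snapshot is accepted only if 'version' is
 * equal and even before and after the read. Gives up after max_tries
 * attempts and returns -1; returns 0 on success.
 */
static int steal_time_read(const volatile struct kvm_steal_time *st,
			   unsigned int max_tries, uint64_t *steal)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		uint32_t before = st->version;

		atomic_thread_fence(memory_order_acquire);
		uint64_t value = st->steal;
		atomic_thread_fence(memory_order_acquire);

		if (before == st->version && (before & 1) == 0) {
			*steal = value;
			return 0;
		}
	}

	return -1;
}
```

The retry bound is a design choice for the sketch; a guest may equally loop
until the version settles, since an odd value only marks an update in progress.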
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) MSR_KVM_EOI_EN:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) 	0x4b564d04
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) 	Bit 0 is 1 when PV end of interrupt is enabled on the vcpu; 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) 	when disabled.  Bit 1 is reserved and must be zero.  When PV end of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) 	interrupt is enabled (bit 0 set), bits 63-2 hold a 4-byte aligned
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) 	physical address of a 4 byte memory area which must be in guest RAM and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) 	must be zeroed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) 	The first, least significant bit of 4 byte memory location will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) 	written to by the hypervisor, typically at the time of interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) 	injection.  Value of 1 means that guest can skip writing EOI to the apic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) 	(using MSR or MMIO write); instead, it is sufficient to signal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) 	EOI by clearing the bit in guest memory - this location will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) 	later be polled by the hypervisor.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) 	Value of 0 means that the EOI write is required.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) 	It is always safe for the guest to ignore the optimization and perform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) 	the APIC EOI write anyway.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) 
	The hypervisor is guaranteed to only modify this least significant bit
	while in the current VCPU context; this means that the guest does not
	need to use either a lock prefix or memory ordering primitives to
	synchronise with the hypervisor.

	However, the hypervisor can set and clear this memory bit at any time.
	The guest must therefore both read the least significant bit in the
	memory area and clear it using a single CPU instruction, such as test
	and clear, or compare and exchange. Otherwise the hypervisor could
	interrupt the guest and clear the bit in the window between the guest
	testing it (to detect whether it can skip the EOI apic write) and the
	guest clearing it (to signal EOI to the hypervisor).
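
In portable C, the read-and-clear step can be approximated with a C11 atomic
exchange, as sketched below. The function name is hypothetical. The sketch
assumes that only the least significant bit of the 4 byte area is ever
non-zero, so swapping in 0 is equivalent to clearing that bit. An atomic
exchange is stronger than what the text requires (no lock prefix is needed),
and the C standard does not promise which instruction the compiler emits.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Read and clear the PV EOI word in one atomic operation.
 * Returns true if the bit was set, i.e. the APIC EOI write can be
 * skipped; false means the guest must perform the EOI write.
 */
static bool pv_eoi_test_and_clear(_Atomic uint32_t *pv_eoi_word)
{
	uint32_t old = atomic_exchange_explicit(pv_eoi_word, 0,
						memory_order_relaxed);

	return (old & 1u) != 0;
}
```

A guest interrupt handler would call this helper at EOI time and fall back to
the regular APIC EOI write whenever it returns false.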
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) MSR_KVM_POLL_CONTROL:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) 	0x4b564d05
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) 	Control host-side polling.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 	Bit 0 enables (1) or disables (0) host-side HLT polling logic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) 	KVM guests can request the host not to poll on HLT, for example if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) 	they are performing polling themselves.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) MSR_KVM_ASYNC_PF_INT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) 	0x4b564d06
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) 	Second asynchronous page fault (APF) control MSR.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 	Bits 0-7: APIC vector for delivery of 'page ready' APF events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) 	Bits 8-63: Reserved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) 
	Interrupt vector for delivery of asynchronous 'page ready' notifications.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) 	The vector has to be set up before asynchronous page fault mechanism
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) 	is enabled in MSR_KVM_ASYNC_PF_EN.  The MSR is only available if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) 	KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) MSR_KVM_ASYNC_PF_ACK:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) 	0x4b564d07
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) data:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) 	Asynchronous page fault (APF) acknowledgment.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) 
	When the guest is done processing a 'page ready' APF event and the
	'token' field in 'struct kvm_vcpu_pv_apf_data' is cleared, it is
	supposed to write '1' to bit 0 of the MSR. This causes the host to
	re-scan its queue and check if there are more notifications pending.
	The MSR is available if KVM_FEATURE_ASYNC_PF_INT is present in CPUID.
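
The 'page ready' handling sequence (read the token, clear it, then
acknowledge) might be sketched as follows. Everything except the structure
layout and the MSR index is an assumption made for illustration: the
``wrmsr_fn`` callback type, ``record_wrmsr`` (a stand-in that only records the
write so the sketch can run in user space) and ``apf_handle_page_ready`` are
hypothetical names.

```c
#include <stdint.h>

#define MSR_KVM_ASYNC_PF_ACK 0x4b564d07u

struct kvm_vcpu_pv_apf_data {
	uint32_t flags;   /* 'page not present' events delivered via #PF */
	uint32_t token;   /* 'page ready' events delivered via interrupt */
	uint8_t  pad[56];
	uint32_t enabled;
};

/* Hypothetical MSR write callback; a real guest would execute WRMSR. */
typedef void (*wrmsr_fn)(uint32_t msr, uint64_t value);

/* Stand-in for WRMSR that records the last write. */
static uint32_t last_msr;
static uint64_t last_value;

static void record_wrmsr(uint32_t msr, uint64_t value)
{
	last_msr = msr;
	last_value = value;
}

/*
 * Handle one 'page ready' notification: fetch the token, clear the
 * 'token' field, then write 1 to MSR_KVM_ASYNC_PF_ACK so that the host
 * re-scans its queue. Returns the token so the caller can wake up
 * whatever was waiting for the page.
 */
static uint32_t apf_handle_page_ready(volatile struct kvm_vcpu_pv_apf_data *apf,
				      wrmsr_fn wrmsr)
{
	uint32_t token = apf->token;

	apf->token = 0;
	wrmsr(MSR_KVM_ASYNC_PF_ACK, 1);

	return token;
}
```

The order matters in this sketch: the acknowledgment is written only after
'token' has been cleared, matching the sequence described above.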