^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) s390 (IBM Z) Ultravisor and Protected VMs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) =========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Summary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) Protected virtual machines (PVMs) are KVM VMs that do not allow KVM to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) access VM state like guest memory or guest registers. Instead, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) PVMs are mostly managed by a new entity called the Ultravisor (UV). The UV
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) provides an API that can be used by PVMs and KVM to request management
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) actions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) Each guest starts in non-protected mode and then may make a request to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) transition into protected mode. On transition, KVM registers the guest
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) and its VCPUs with the Ultravisor and prepares everything for running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) The Ultravisor will secure and decrypt the guest's boot memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) (i.e. kernel/initrd). It will safeguard state changes like VCPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) starts/stops and injected interrupts while the guest is running.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) As access to the guest's state, such as the SIE state description, is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) normally needed to be able to run a VM, some changes have been made in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) the behavior of the SIE instruction. A new format 4 state description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) has been introduced, where some fields have different meanings for a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) PVM. SIE exits are minimized as much as possible to improve speed and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) reduce exposed guest state.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) Interrupt injection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) Interrupt injection is safeguarded by the Ultravisor. As KVM doesn't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) have access to the VCPUs' lowcores, injection is handled via the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) format 4 state description.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) Machine check, external, IO and restart interruptions can each be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) injected on SIE entry via a bit in the interrupt injection control
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) field (offset 0x54). If the guest CPU is not enabled for the interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) at the time of injection, a validity interception is recognized. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) format 4 state description contains fields in the interception data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) block where data associated with the interrupt can be transported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) Program and Service Call exceptions have another layer of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) safeguarding; they can only be injected for instructions that have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) been intercepted into KVM. The exceptions need to be a valid outcome
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) of an instruction emulation by KVM, e.g. we can never inject an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) addressing exception as they are reported by SIE since KVM has no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) access to the guest memory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) Mask notification interceptions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) -------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) KVM cannot intercept lctl(g) and lpsw(e) anymore in order to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) notified when a PVM enables a certain class of interrupt. As a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) replacement, two new interception codes have been introduced: one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) indicating that the contents of CRs 0, 6, or 14 have been changed,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) indicating different interruption subclasses; and one indicating that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) PSW bit 13 has been changed, indicating that a machine check
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) intervention was requested and machine checks are now enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) Instruction emulation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) With the format 4 state description for PVMs, the SIE instruction already
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) interprets more instructions than it does with format 2. It is not able
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) to interpret every instruction, but needs to hand some tasks to KVM;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) therefore, the SIE and the Ultravisor safeguard emulation inputs and outputs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) The control structures associated with SIE provide the Secure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) Instruction Data Area (SIDA), the Interception Parameters (IP) and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) Secure Interception General Register Save Area. Guest GRs and most of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) the instruction data, such as I/O data structures, are filtered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) Instruction data is copied to and from the SIDA when needed. Guest
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) GRs are put into / retrieved from the Secure Interception General
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) Register Save Area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Only GR values needed to emulate an instruction will be copied into this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) save area and the real register numbers will be hidden.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) The Interception Parameters state description field still contains
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) the bytes of the instruction text, but with pre-set register values
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) instead of the actual ones. That is, each instruction always uses the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) instruction text, in order not to leak guest instruction text.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) This also implies that the register content that a guest had in r<n>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) may be in r<m> from the hypervisor's point of view.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) The Secure Instruction Data Area contains instruction storage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) data. Instruction data, i.e. data being referenced by an instruction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) like the SCCB for sclp, is moved via the SIDA. When an instruction is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) intercepted, the SIE will only allow data and program interrupts for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) this instruction to be moved to the guest via the two data areas
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) discussed before. Other data is either ignored or results in validity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) interceptions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) Instruction emulation interceptions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) -----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) There are two types of SIE secure instruction intercepts: the normal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) and the notification type. Normal secure instruction intercepts will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) make the guest pending for instruction completion of the intercepted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) instruction type, i.e. on SIE entry an attempt is made to complete
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) emulation of the instruction with the data provided by KVM. That might
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) be a program exception or instruction completion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) The notification type intercepts inform KVM about guest environment
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) changes due to guest instruction interpretation. Such an interception
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) is recognized, for example, for the set prefix instruction to provide
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) the new lowcore location. On SIE reentry, any KVM data in the data areas
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) is ignored and execution continues as if the guest instruction had
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) completed. For that reason KVM is not allowed to inject a program
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) interrupt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) Links
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) `KVM Forum 2019 presentation <https://static.sched.com/hosted_files/kvmforum2019/3b/ibm_protected_vms_s390x.pdf>`_