^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) Nitro Enclaves
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) Nitro Enclaves (NE) is a new Amazon Elastic Compute Cloud (EC2) capability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) that allows customers to carve out isolated compute environments within EC2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) instances [1].
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) For example, an application that processes sensitive data and runs in a VM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) can be separated from other applications running in the same VM. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) application then runs in a VM separate from the primary VM, namely an enclave.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) An enclave runs alongside the VM that spawned it. This setup suits the needs of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) low-latency applications. The resources that are allocated for the enclave, such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) as memory and CPUs, are carved out of the primary VM. Each enclave is mapped to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) process running in the primary VM that communicates with the NE driver via an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) ioctl interface.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) In this sense, there are two components:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) 1. An enclave abstraction process - a user space process running in the primary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) VM guest that uses the provided ioctl interface of the NE driver to spawn an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) enclave VM (that's 2 below).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) There is an NE emulated PCI device exposed to the primary VM. The driver for this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) new PCI device is included in the NE driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) The ioctl logic is mapped to PCI device commands, e.g. the NE_START_ENCLAVE ioctl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) maps to an enclave start PCI command. The PCI device commands are then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) translated into actions taken on the hypervisor side; that's the Nitro
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) hypervisor running on the host where the primary VM is running. The Nitro
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) hypervisor is based on core KVM technology.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) 2. The enclave itself - a VM running on the same host as the primary VM that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) spawned it. Memory and CPUs are carved out of the primary VM and are dedicated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) to the enclave VM. An enclave does not have persistent storage attached.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) The memory regions carved out of the primary VM and given to an enclave need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) be 2 MiB / 1 GiB aligned, physically contiguous memory regions (or a multiple of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) this size, e.g. 8 MiB). The memory can be allocated e.g. by using hugetlbfs from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) user space [2][3]. The memory size for an enclave needs to be at least 64 MiB.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) The enclave memory and CPUs need to be from the same NUMA node.
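
The size and alignment constraints above can be checked in user space before
any memory is handed to the NE driver. The following is a minimal sketch, not
the driver's own validation: ``struct mem_region`` and both helper functions
are hypothetical names introduced for illustration. Only the arithmetic rules
are covered (2 MiB granularity, which also covers 1 GiB regions, and the 64 MiB
minimum); physical contiguity and NUMA placement cannot be verified this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIB                (1024ULL * 1024ULL)
#define NE_REGION_ALIGN    (2 * MIB)   /* smallest region granularity named above */
#define NE_MIN_ENCLAVE_MEM (64 * MIB)  /* minimum total enclave memory */

/* Hypothetical descriptor for one memory region given to an enclave. */
struct mem_region {
	uint64_t addr; /* start address of the region */
	uint64_t size; /* region size in bytes */
};

/* A region qualifies if it is non-empty and both its start address and
 * its size are multiples of 2 MiB. */
static bool region_is_valid(const struct mem_region *r)
{
	assert(r != NULL);
	return r->size != 0 &&
	       r->addr % NE_REGION_ALIGN == 0 &&
	       r->size % NE_REGION_ALIGN == 0;
}

/* All regions must qualify and together provide at least 64 MiB. */
static bool enclave_memory_is_valid(const struct mem_region *regions,
				    size_t count)
{
	uint64_t total = 0;

	assert(regions != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (!region_is_valid(&regions[i]))
			return false;
		total += regions[i].size;
	}
	return total >= NE_MIN_ENCLAVE_MEM;
}
```

For instance, two aligned 32 MiB regions pass the check, while a single 8 MiB
region is rejected because the total stays below 64 MiB.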
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) An enclave runs on dedicated cores. CPU 0 and its CPU siblings need to remain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) available for the primary VM. A CPU pool has to be set for NE purposes by a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) user with admin capability. See the cpu list section from the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) documentation [4] for the format of a CPU pool.
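
As an illustration of the CPU pool rule, the sketch below parses a basic cpu
list such as ``2-5,8`` into a bitmask and rejects pools that contain CPU 0. It
is a simplified example under stated assumptions: ``parse_cpu_list``,
``pool_is_acceptable`` and the 64-CPU limit are hypothetical, only plain
numbers and ``N-M`` ranges are handled (not the full cpu list syntax from [4]),
and the siblings of CPU 0 are not checked because that would require reading
the CPU topology.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_CPUS 64 /* illustration limit: one bit per CPU in a uint64_t */

/* Parse a basic cpu list ("1,3" or "2-5,8") into a bitmask.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
	assert(s != NULL && mask != NULL);
	*mask = 0;

	while (*s != '\0') {
		char *end;
		unsigned long first, last;

		if (*s < '0' || *s > '9')
			return -1;
		errno = 0;
		first = strtoul(s, &end, 10);
		last = first;
		if (*end == '-') {
			/* Range form N-M: parse the upper bound. */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			last = strtoul(s, &end, 10);
		}
		if (errno != 0 || first > last || last >= MAX_CPUS)
			return -1;
		for (unsigned long cpu = first; cpu <= last; cpu++)
			*mask |= UINT64_C(1) << cpu;

		if (*end == ',') {
			end++;
			if (*end == '\0')
				return -1; /* trailing comma */
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}
	return 0;
}

/* The pool must be non-empty and must leave CPU 0 to the primary VM. */
static bool pool_is_acceptable(uint64_t pool_mask)
{
	return pool_mask != 0 && (pool_mask & UINT64_C(1)) == 0;
}
```

With this sketch, ``2-5,8`` yields a mask with bits 2, 3, 4, 5 and 8 set and is
accepted, while ``0-1`` parses correctly but is rejected as a pool.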
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) An enclave communicates with the primary VM via a local communication channel,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) using virtio-vsock [5]. The primary VM has a virtio-pci vsock emulated device,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) while the enclave VM has a virtio-mmio vsock emulated device. The vsock device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) uses eventfd for signaling. The enclave VM sees the usual interfaces - local
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) APIC and IOAPIC - to get interrupts from the virtio-vsock device. The virtio-mmio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) device is placed in memory below the typical 4 GiB.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) The application that runs in the enclave needs to be packaged in an enclave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) image together with the OS (e.g. kernel, ramdisk, init) that will run in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) enclave VM. The enclave VM has its own kernel and follows the standard Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) boot protocol [6].
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) The kernel bzImage, the kernel command line and the ramdisk(s) are part of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) Enclave Image Format (EIF), together with an EIF header that includes metadata
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) such as the magic number, EIF version, image size and CRC.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) Hash values are computed for the entire enclave image (EIF), the kernel and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) ramdisk(s). These are used, for example, to check that the enclave image that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) loaded in the enclave VM is the one that was intended to be run.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) These crypto measurements are included in a signed attestation document
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) generated by the Nitro Hypervisor and further used to prove the identity of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) enclave; KMS is an example of a service that NE is integrated with and that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) checks the attestation document.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) The enclave image (EIF) is loaded in the enclave memory at offset 8 MiB. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) init process in the enclave connects to the vsock CID of the primary VM and a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) predefined port - 9000 - to send a heartbeat value - 0xb7. This mechanism is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) used to check in the primary VM that the enclave has booted. The CID of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) primary VM is 3.
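
The enclave side of the heartbeat described above could look roughly like the
following sketch. The CID (3), port (9000) and value (0xb7) come from the text;
the function names, the use of a ``SOCK_STREAM`` socket and the single-byte
write are assumptions made for illustration, not a description of the actual
init process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

#define NE_PRIMARY_VM_CID  3      /* CID of the primary VM */
#define NE_HEARTBEAT_PORT  9000   /* predefined heartbeat port */
#define NE_HEARTBEAT_VALUE 0xb7   /* heartbeat value */

/* Build the vsock address of the heartbeat endpoint in the primary VM. */
static struct sockaddr_vm heartbeat_addr(void)
{
	struct sockaddr_vm addr;

	memset(&addr, 0, sizeof(addr));
	addr.svm_family = AF_VSOCK;
	addr.svm_cid = NE_PRIMARY_VM_CID;
	addr.svm_port = NE_HEARTBEAT_PORT;
	return addr;
}

/* Enclave side: connect to the primary VM and send the heartbeat byte.
 * Returns 0 on success, -1 on failure. The stream socket type is an
 * assumption of this sketch. */
static int send_heartbeat(void)
{
	struct sockaddr_vm addr = heartbeat_addr();
	uint8_t value = NE_HEARTBEAT_VALUE;
	int rc = -1;
	int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
	    write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value))
		rc = 0;
	close(fd);
	return rc;
}

/* Primary VM side: check a received buffer against the heartbeat value. */
static int heartbeat_is_valid(const uint8_t *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return len == 1 && buf[0] == NE_HEARTBEAT_VALUE;
}
```

A process in the primary VM would listen on port 9000 and pass the bytes it
reads to a check such as ``heartbeat_is_valid()`` to decide whether the enclave
has booted.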
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) If the enclave VM crashes or gracefully exits, an interrupt event is received by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) the NE driver. This event is sent further to the user space enclave process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) running in the primary VM via a poll notification mechanism. Then the user space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) enclave process can exit.
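
The poll notification described above can be consumed with a plain ``poll()``
call. The sketch below is a minimal example under assumptions:
``wait_for_enclave_exit`` is a hypothetical helper, ``enclave_fd`` stands for
the file descriptor the user space enclave process obtained through the NE
ioctl interface, and reporting the exit event as ``POLLHUP`` is assumed here
rather than taken from the text.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for an exit notification on the enclave fd.
 * Returns 1 if a hang-up event was reported, 0 if the call returned
 * without one (timeout or another event), -1 on poll() failure. */
static int wait_for_enclave_exit(int enclave_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = enclave_fd, .events = POLLIN };
	int rc;

	assert(enclave_fd >= 0);
	rc = poll(&pfd, 1, timeout_ms);
	if (rc < 0)
		return -1;
	if (rc == 0)
		return 0;
	/* POLLHUP as the exit indication is an assumption of this sketch. */
	return (pfd.revents & POLLHUP) ? 1 : 0;
}
```

A user space enclave process could call this helper in a loop and exit once it
returns 1. The same function can be exercised without an enclave by polling the
read end of a pipe whose write end has been closed.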
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) [1] https://aws.amazon.com/ec2/nitro/nitro-enclaves/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) [2] https://www.kernel.org/doc/html/latest/admin-guide/mm/hugetlbpage.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) [3] https://lwn.net/Articles/807108/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) [4] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) [5] https://man7.org/linux/man-pages/man7/vsock.7.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) [6] https://www.kernel.org/doc/html/latest/x86/boot.html