Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. SPDX-License-Identifier: GPL-2.0
.. iommu:

=====================================
IOMMU Userspace API
=====================================

The IOMMU UAPI is used in virtualization cases where communication is
needed between physical and virtual IOMMU drivers. For bare-metal
usage, the IOMMU is a system device which does not need to communicate
with userspace directly.

The primary use cases are guest Shared Virtual Address (SVA) and
guest IO virtual address (IOVA), wherein the vIOMMU implementation
relies on the physical IOMMU and for this reason requires interactions
with the host driver.

.. contents:: :local:

Functionalities
===============
Communication between user and kernel flows in both directions. The
supported user-kernel APIs are as follows:

1. Bind/Unbind guest PASID (e.g. Intel VT-d)
2. Bind/Unbind guest PASID table (e.g. ARM SMMU)
3. Invalidate IOMMU caches upon guest requests
4. Report errors to the guest and serve page requests

Requirements
============
The IOMMU UAPIs are generic and extensible to meet the following
requirements:

1. Emulated and para-virtualised vIOMMUs
2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
3. Extensions to the UAPI shall not break existing userspace

Interfaces
==========
Although the data structures defined in the IOMMU UAPI are self-contained,
no user API functions are introduced. Instead, the IOMMU UAPI is
designed to work with existing user driver frameworks such as VFIO.

Extension Rules & Precautions
-----------------------------
When the IOMMU UAPI is extended, the data structures can *only* be
modified in two ways:

1. Adding new fields by re-purposing the padding[] field. No size change.
2. Adding new union members at the end. May increase the structure sizes.

No new fields can be added *after* the variable-sized union, because
doing so would break backward compatibility when offsets move. A new
flag must be introduced whenever a change affects the structure using
either method. The IOMMU driver processes the data based on flags, which
ensures backward compatibility.
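
As an illustration of the two extension rules, the sketch below extends a
hypothetical structure. The names demo_info_v1, demo_info_v2 and DEMO_FLAG_*
are invented for this example and are not part of the IOMMU UAPI; a C11
compiler is assumed for static_assert. It shows one byte of padding[] being
re-purposed and one union member being appended, each announced by a new flag,
while the offset of the union stays unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical original layout: fixed header, padding, trailing union. */
struct demo_info_v1 {
    uint32_t argsz;
    uint32_t flags;
    uint8_t  kind;
    uint8_t  padding[7];
    union {
        uint64_t a;
    } u;
};

/* Hypothetical flags telling the kernel which extensions are filled in. */
#define DEMO_FLAG_HINT (1u << 0) /* 'hint' was carved out of padding[] */
#define DEMO_FLAG_WIDE (1u << 1) /* union member 'wide' is valid */

/* Extended layout following both rules. */
struct demo_info_v2 {
    uint32_t argsz;
    uint32_t flags;
    uint8_t  kind;
    uint8_t  hint;        /* rule 1: re-purposed padding byte, no size change */
    uint8_t  padding[6];
    union {
        uint64_t a;
        uint64_t wide[2]; /* rule 2: new member at the end of the union */
    } u;
};

/* Existing fields keep their offsets; only the total size may grow. */
static_assert(offsetof(struct demo_info_v1, u) ==
              offsetof(struct demo_info_v2, u), "union offset must not move");
static_assert(sizeof(struct demo_info_v2) >= sizeof(struct demo_info_v1),
              "structure may only grow");

/* The new field is consumed only when its flag is set. */
static unsigned int demo_hint(const struct demo_info_v2 *info)
{
    return (info->flags & DEMO_FLAG_HINT) ? info->hint : 0;
}
```

An old caller that leaves flags at zero is treated exactly as before, because
the byte now called hint was required to be zero padding in the old layout.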

The version field is reserved only for the unlikely event of a UAPI
upgrade in its entirety.

It's *always* the caller's responsibility to indicate the size of the
structure passed by setting argsz appropriately.
At the same time, argsz is user-provided data and is not
trusted. The argsz field allows the user app to indicate how much data
it is providing; it's still the kernel's responsibility to validate
whether it's correct and sufficient for the requested operation.

Compatibility Checking
----------------------
When an IOMMU UAPI extension results in a structure size increase,
IOMMU UAPI code shall handle the following cases:

1. User and kernel have an exact size match
2. An older user with an older kernel header (smaller UAPI size) running
   on a newer kernel (larger UAPI size)
3. A newer user with a newer kernel header (larger UAPI size) running
   on an older kernel
4. A malicious/misbehaving user passing an illegal/invalid size that is
   still within range. The data may contain garbage.
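
The four cases can be sketched in plain C as follows. This is a userspace
model under stated assumptions, not the kernel's implementation: demo_req,
DEMO_MINSZ and demo_fetch() are hypothetical names, memcpy() stands in for
copy_from_user(), and buf_len models how much memory the caller really owns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure as known to "this" kernel. */
struct demo_req {
    uint32_t argsz;
    uint32_t flags;
    uint8_t  padding[8];
    uint64_t payload[2]; /* stands in for the trailing, growable part */
};

/* Smallest size any valid caller can pass: the fixed header. */
#define DEMO_MINSZ offsetof(struct demo_req, payload)

/* Copy a caller-supplied buffer into *dst, trusting argsz only after checks. */
static int demo_fetch(struct demo_req *dst, const void *buf, size_t buf_len)
{
    uint32_t argsz;
    size_t n;

    memset(dst, 0, sizeof(*dst));
    if (buf_len < sizeof(argsz))
        return -EFAULT;
    memcpy(&argsz, buf, sizeof(argsz));

    if (argsz < DEMO_MINSZ) /* case 4: too small to hold the header */
        return -EINVAL;
    if (argsz > buf_len)    /* case 4: claims more than was provided */
        return -EFAULT;

    /*
     * Cases 1-3: copy no more than this kernel understands. An older user
     * leaves the tail zeroed; a newer user's extra bytes are ignored.
     */
    n = argsz < sizeof(*dst) ? argsz : sizeof(*dst);
    memcpy(dst, buf, n);
    return 0;
}
```

Content checks on flags, padding and version would follow this size check; a
newer user that sets a flag unknown to an older kernel is rejected at that
stage rather than here.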

Feature Checking
----------------
When launching a guest with a vIOMMU, it is strongly advised to check
compatibility upfront, because some errors that occur later during
vIOMMU operation, such as cache invalidation failures, cannot be nicely
escalated to the guest due to IOMMU specifications. This can lead to
catastrophic failures for the users.

User applications such as QEMU are expected to import kernel UAPI
headers. Backward compatibility is supported per feature flag.
For example, an older QEMU (with an older kernel header) can run on a
newer kernel. A newer QEMU (with a newer kernel header) may refuse to
initialize on an older kernel if new feature flags are not supported by
the older kernel. Simply recompiling existing code with a newer kernel
header should not be an issue, because only existing flags are used.

The IOMMU vendor driver should report the features below to IOMMU UAPI
consumers (e.g. via VFIO).

1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
2. IOMMU_NESTING_FEAT_BIND_PGTBL
3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE
4. IOMMU_NESTING_FEAT_CACHE_INVLD
5. IOMMU_NESTING_FEAT_PAGE_REQUEST

Taking VFIO as an example, upon request from VFIO userspace (e.g. QEMU),
VFIO kernel code shall query the IOMMU vendor driver for support of
the above features. The query result can then be reported back to the
userspace caller. Details can be found in
Documentation/driver-api/vfio.rst.
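
A userspace consumer could perform the upfront check roughly as sketched
below. The bit positions are assumptions made for this example (they simply
follow the order of the list above), demo_missing_features() is a hypothetical
helper, and how the reported mask is obtained from VFIO is deliberately left
out.

```c
#include <stdint.h>

/*
 * Bit positions are assumed for illustration only; real values must come
 * from the kernel UAPI header that the application is built against.
 */
#define IOMMU_NESTING_FEAT_SYSWIDE_PASID    (1u << 0)
#define IOMMU_NESTING_FEAT_BIND_PGTBL       (1u << 1)
#define IOMMU_NESTING_FEAT_BIND_PASID_TABLE (1u << 2)
#define IOMMU_NESTING_FEAT_CACHE_INVLD      (1u << 3)
#define IOMMU_NESTING_FEAT_PAGE_REQUEST     (1u << 4)

/* Return the required features that the host did not report (0 if none). */
static uint32_t demo_missing_features(uint32_t reported, uint32_t required)
{
    return required & ~reported;
}
```

A non-zero result before guest launch is the point at which the application
can refuse to initialize, instead of failing later during vIOMMU operation.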

Data Passing Example with VFIO
------------------------------
As the ubiquitous userspace driver framework, VFIO is already IOMMU
aware and shares many key concepts such as device model, group, and
protection domain. Other user driver frameworks can also be extended
to support the IOMMU UAPI, but that is outside the scope of this document.

In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the
IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel
transport, capability checking, security, and life cycle management of
process address space ID (PASID).

The VFIO layer conveys the data structures down to the IOMMU driver. It
follows the pattern below::

   struct {
        __u32 argsz;
        __u32 flags;
        __u8  data[];
   };

Here data[] contains the IOMMU UAPI data structures. VFIO has the
freedom to bundle the data as well as parse data size based on its own flags.
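
To make the pattern concrete, the sketch below builds such a wrapper around an
opaque payload in userspace. The structure tag demo_vfio_wrapper and the helper
demo_wrap() are hypothetical, and the payload is treated as plain bytes rather
than a real IOMMU UAPI structure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Named version of the anonymous pattern; the tag is hypothetical. */
struct demo_vfio_wrapper {
    uint32_t argsz;
    uint32_t flags;
    uint8_t  data[]; /* flexible array member holding the inner structure */
};

/*
 * Allocate a wrapper around 'len' payload bytes. Returns NULL on allocation
 * failure; the caller releases the result with free().
 */
static struct demo_vfio_wrapper *demo_wrap(const void *payload, uint32_t len,
                                           uint32_t flags)
{
    struct demo_vfio_wrapper *w;

    assert(payload != NULL);
    w = malloc(sizeof(*w) + len);
    if (!w)
        return NULL;

    /* argsz covers the wrapper header plus the embedded payload. */
    w->argsz = (uint32_t)(sizeof(*w) + len);
    w->flags = flags;
    memcpy(w->data, payload, len);
    return w;
}
```

The outer argsz describes the whole buffer, while the embedded structure
carries its own argsz for the IOMMU layer to check.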

In order to determine the size and feature set of the user data, argsz
and flags (or the equivalent) are also embedded in the IOMMU UAPI data
structures.

A "__u32 argsz" field is *always* at the beginning of each structure.

For example::

   struct iommu_cache_invalidate_info {
        __u32   argsz;
        #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
        __u32   version;
        /* IOMMU paging structure cache */
        #define IOMMU_CACHE_INV_TYPE_IOTLB      (1 << 0) /* IOMMU IOTLB */
        #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB  (1 << 1) /* Device IOTLB */
        #define IOMMU_CACHE_INV_TYPE_PASID      (1 << 2) /* PASID cache */
        #define IOMMU_CACHE_INV_TYPE_NR         (3)
        __u8    cache;
        __u8    granularity;
        __u8    padding[6];
        union {
                struct iommu_inv_pasid_info pasid_info;
                struct iommu_inv_addr_info addr_info;
        } granu;
   };

VFIO is responsible for checking its own argsz and flags. It then
invokes the appropriate IOMMU UAPI functions. The user pointers are passed
to the IOMMU layer for further processing. The responsibilities are
divided as follows:

- Generic IOMMU layer checks argsz range based on UAPI data in the
  current kernel version.

- Generic IOMMU layer checks content of the UAPI data for non-zero
  reserved bits in flags, padding fields, and unsupported version.
  This ensures that userspace is not broken in the future when these
  fields or flags come into use.

- Vendor IOMMU driver checks argsz based on vendor flags. UAPI data
  is consumed based on flags. The vendor driver has access to the
  unadulterated argsz value in case of vendor-specific future
  extensions. Currently, it does not perform the copy_from_user()
  itself. A __user pointer can be provided in some future scenarios
  where there's vendor data outside of the structure definition.
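
The content check done by the generic layer can be sketched as below. This is
an illustration, not the kernel's code: demo_inv_hdr is a hypothetical
stand-in for the fixed header of the invalidation structure, demo_check_hdr()
is an invented helper, and the two macros repeat the values shown in the
structure definition above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1
#define IOMMU_CACHE_INV_TYPE_NR               (3)

/* Hypothetical stand-in for the fixed header of the invalidation data. */
struct demo_inv_hdr {
    uint32_t argsz;
    uint32_t version;
    uint8_t  cache;
    uint8_t  granularity;
    uint8_t  padding[6];
};

/* Reject unsupported versions, unknown cache bits and non-zero padding. */
static int demo_check_hdr(const struct demo_inv_hdr *h)
{
    size_t i;

    if (h->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
        return -EINVAL;

    /* Only the low IOMMU_CACHE_INV_TYPE_NR bits name known cache types. */
    if (h->cache & ~((1u << IOMMU_CACHE_INV_TYPE_NR) - 1))
        return -EINVAL;

    /* Padding must be zero so it can be re-purposed by later extensions. */
    for (i = 0; i < sizeof(h->padding); i++)
        if (h->padding[i])
            return -EINVAL;

    return 0;
}
```

Rejecting non-zero reserved fields today is what allows a later kernel to give
them meaning without misreading data from existing applications.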

IOMMU code treats UAPI data in two categories:

- structure contains vendor data
  (Example: iommu_uapi_cache_invalidate())

- structure contains only generic data
  (Example: iommu_uapi_sva_bind_gpasid())

Sharing UAPI with in-kernel users
---------------------------------
For UAPIs that are shared with in-kernel users, a wrapper function is
provided to distinguish the callers. For example,

Userspace caller ::

  int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
                                   struct device *dev,
                                   void __user *udata);

In-kernel caller ::

  int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
                              struct device *dev, ioasid_t ioasid);
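
The split between the two entry points can be modelled in userspace as in the
sketch below. All names (demo_unbind_data, demo_unbind(), demo_uapi_unbind())
are hypothetical, the request layout is simplified, and memcpy() stands in for
copy_from_user(); the buffer passed in is assumed to be at least as large as
the structure.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request; the real unbind data carries more fields. */
struct demo_unbind_data {
    uint32_t argsz;
    uint32_t flags; /* no flags are defined in this sketch */
    uint64_t pasid;
};

/* Records the last PASID handled, purely so the effect is observable. */
static uint64_t demo_last_unbound;

/* In-kernel entry point: the argument is already a trusted kernel value. */
static int demo_unbind(uint64_t pasid)
{
    demo_last_unbound = pasid;
    return 0;
}

/* Userspace entry point: fetch and validate, then share the same backend. */
static int demo_uapi_unbind(const void *udata)
{
    struct demo_unbind_data data;

    memcpy(&data, udata, sizeof(data)); /* stands in for copy_from_user() */
    if (data.argsz < sizeof(data) || data.flags)
        return -EINVAL;

    return demo_unbind(data.pasid);
}
```

Keeping validation in the wrapper means in-kernel callers skip the user-copy
and argsz checks, while both paths end in the same unbind logic.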