^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) OpenCAPI (Open Coherent Accelerator Processor Interface)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) OpenCAPI is an interface between processors and accelerators. It aims
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) at being low-latency and high-bandwidth. The specification is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) developed by the `OpenCAPI Consortium <http://opencapi.org/>`_.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) It allows an accelerator (which could be an FPGA, an ASIC, ...) to access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) the host memory coherently, using virtual addresses. An OpenCAPI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) device can also host its own memory, which can be accessed from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) host.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) OpenCAPI is known in Linux as 'ocxl', the open, processor-agnostic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) evolution of 'cxl' (the driver for the IBM CAPI interface for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) powerpc). The 'cxl' name was chosen to avoid confusion with the ISDN
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) CAPI subsystem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) High-level view
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) OpenCAPI defines a Data Link Layer (DL) and Transaction Layer (TL), to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) be implemented on top of a physical link. Any processor or device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) implementing the DL and TL can start sharing memory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)   |           |                         |             |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)   |           |                         | Accelerated |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)   | Processor |                         |  Function   |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)   |           |  +--------+             |    Unit     |  +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)   |           |--| Memory |             |    (AFU)    |--| Memory |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)   |           |  +--------+             |             |  +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)        |                                       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)   |    TL     |                         |     TLX     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41)        |                                       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)   |    DL     |                         |     DLX     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)   +-----------+                         +-------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)        |                                       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)        |                   PHY                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)        +---------------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) Device discovery
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) OpenCAPI relies on a PCI-like configuration space, implemented on the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) device, so the host can discover AFUs by querying the config space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) OpenCAPI devices in Linux are treated like PCI devices (with a few
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) caveats). The firmware is expected to abstract the hardware as if it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) were a PCI link. A lot of the existing PCI infrastructure is reused:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) devices are scanned and BARs are assigned during the standard PCI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) enumeration. Commands like 'lspci' can therefore be used to see what
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) devices are available.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) The configuration space describes each AFU found on the physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) adapter: its name, how many memory contexts it can work with, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) size of its MMIO areas, and so on.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) MMIO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) ====
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) OpenCAPI defines two MMIO areas for each AFU:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) * the global MMIO area, with registers pertinent to the whole AFU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) * a per-process MMIO area, which has a fixed size for each context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) AFU interrupts
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) OpenCAPI includes the possibility for an AFU to send an interrupt to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) host process. This is done through an 'intrp_req' defined in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) Transaction Layer, specifying a 64-bit object handle which defines the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) interrupt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) The driver allows a process to allocate an interrupt and obtain its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) 64-bit object handle, which can then be passed to the AFU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) char devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) The driver creates one char device per AFU found on the physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) device. A physical device may have multiple functions and each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) function can have multiple AFUs. At the time of this writing though,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) the driver has only been tested with devices exporting a single AFU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) Char devices can be found in /dev/ocxl/ and are named as:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) /dev/ocxl/<AFU name>.<location>.<index>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) where <AFU name> is a name of at most 20 characters, as found in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) config space of the AFU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) <location> is added by the driver and helps distinguish devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) when a system has more than one instance of the same OpenCAPI device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) <index> helps distinguish AFUs in the unlikely case where a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) device carries multiple copies of the same AFU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)
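The naming scheme above can be illustrated with a small userspace sketch
that splits a device path into its three fields. The type and function
names are hypothetical and not part of any ocxl API. The sketch assumes
that the AFU name contains no '.', while <location> may contain dots
(for instance if it follows a PCI-style address), so the name ends at the
first dot and the index starts after the last one.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type, not part of any ocxl API. */
struct ocxl_dev_name {
    char afu_name[21];   /* up to 20 characters plus the terminating NUL */
    char location[64];   /* assumed large enough for the location string */
    int index;
};

/*
 * Split "/dev/ocxl/<AFU name>.<location>.<index>" into its three fields.
 * Assumption: the AFU name contains no '.', while <location> may.
 * Returns 0 on success, -1 on malformed input.
 */
int ocxl_parse_dev_name(const char *path, struct ocxl_dev_name *out)
{
    static const char prefix[] = "/dev/ocxl/";
    const char *base, *first_dot, *last_dot;
    char *end;
    size_t name_len, loc_len;
    long idx;

    if (strncmp(path, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    base = path + sizeof(prefix) - 1;

    first_dot = strchr(base, '.');    /* terminates <AFU name> */
    last_dot = strrchr(base, '.');    /* precedes <index> */
    if (!first_dot || first_dot == last_dot)
        return -1;

    name_len = (size_t)(first_dot - base);
    loc_len = (size_t)(last_dot - first_dot - 1);
    if (name_len == 0 || name_len >= sizeof(out->afu_name) ||
        loc_len == 0 || loc_len >= sizeof(out->location))
        return -1;

    /* <index> must be a non-negative decimal number with no trailing junk. */
    idx = strtol(last_dot + 1, &end, 10);
    if (end == last_dot + 1 || *end != '\0' || idx < 0 || idx > INT_MAX)
        return -1;

    memcpy(out->afu_name, base, name_len);
    out->afu_name[name_len] = '\0';
    memcpy(out->location, first_dot + 1, loc_len);
    out->location[loc_len] = '\0';
    out->index = (int)idx;
    return 0;
}
```

A caller would typically run this over the entries found in /dev/ocxl/ to
pick the AFU instance it wants before opening the device.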
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) Sysfs class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) An ocxl class is added for the devices representing the AFUs. See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) /sys/class/ocxl. The layout is described in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) Documentation/ABI/testing/sysfs-class-ocxl.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) User API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) open
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) ----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) Based on the AFU definition found in the config space, an AFU may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) support working with more than one memory context, in which case the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) associated char device may be opened multiple times by different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) processes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) ioctl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) OCXL_IOCTL_ATTACH:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) Attach the memory context of the calling process to the AFU so that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) the AFU can access its memory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) OCXL_IOCTL_IRQ_ALLOC:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) Allocate an AFU interrupt and return an identifier.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) OCXL_IOCTL_IRQ_FREE:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) Free a previously allocated AFU interrupt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) OCXL_IOCTL_IRQ_SET_FD:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) Associate an eventfd with an AFU interrupt so that the user process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) can be notified when the AFU sends an interrupt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) OCXL_IOCTL_GET_METADATA:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) Obtain configuration information from the card, such as the size of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) MMIO areas, the AFU version, and the PASID for the current context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) OCXL_IOCTL_ENABLE_P9_WAIT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) Allow the AFU to wake a userspace thread executing 'wait'. Return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) information to userspace to allow it to configure the AFU. Note that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) this is only available on POWER9.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) OCXL_IOCTL_GET_FEATURES:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) Report which CPU features that affect OpenCAPI are usable from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) userspace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
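As an illustration of the notification path set up by
OCXL_IOCTL_IRQ_SET_FD, the sketch below shows only the generic eventfd
side: creating the descriptor and waiting on it. The ioctl call itself is
deliberately left out, and both function names are hypothetical. It
assumes the descriptor returned by make_irq_eventfd() has been associated
with an allocated AFU interrupt beforehand.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the eventfd that would later be handed to OCXL_IOCTL_IRQ_SET_FD. */
int make_irq_eventfd(void)
{
    return eventfd(0, EFD_CLOEXEC);
}

/*
 * Wait until 'efd' is signalled or 'timeout_ms' elapses.
 * Returns 1 and stores the number of pending notifications in *count,
 * 0 on timeout, -1 on error.
 */
int wait_afu_irq(int efd, int timeout_ms, uint64_t *count)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc <= 0)
        return rc;               /* 0: timeout, -1: poll() failed */
    if (!(pfd.revents & POLLIN))
        return -1;

    /* Reading an eventfd returns its 8-byte counter and resets it to zero. */
    if (read(efd, count, sizeof(*count)) != (ssize_t)sizeof(*count))
        return -1;
    return 1;
}
```

Because an eventfd is an ordinary file descriptor, it can also be added to
an existing poll or epoll loop instead of being waited on in isolation.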
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) mmap
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) ----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) A process can mmap the per-process MMIO area for interactions with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) AFU.
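
A minimal sketch of such a mapping is shown below. The helper names are
hypothetical, and two points are assumptions of this sketch rather than
statements from this document: that the per-process MMIO area is mapped
at file offset 0, and that 'size' comes from the MMIO size reported by
OCXL_IOCTL_GET_METADATA. 'fd' stands for the open AFU char device of an
attached context.

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map 'size' bytes of the per-process MMIO area behind 'fd'.
 * Offset 0 is an assumption of this sketch.
 * Returns NULL on failure.
 */
volatile void *map_pp_mmio(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Undo map_pp_mmio(); returns 0 on success, -1 on error. */
int unmap_pp_mmio(volatile void *addr, size_t size)
{
    return munmap((void *)addr, size);
}
```

The pointer is volatile-qualified so that the compiler does not cache or
elide register accesses. Any ordering requirements between accesses depend
on the AFU and are outside the scope of this sketch.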