.. SPDX-License-Identifier: GPL-2.0

Introduction to Uacce
---------------------

Uacce (Unified/User-space-access-intended Accelerator Framework) aims to
provide Shared Virtual Addressing (SVA) between accelerators and processes,
so that an accelerator can access any data structure of the main CPU.
This differs from conventional data sharing between a CPU and an I/O device,
which shares only the data content rather than the address.
Because the address space is unified, the hardware and the user space of a
process can use the same virtual addresses when they communicate.
Uacce treats the hardware accelerator as a heterogeneous processor: the IOMMU
shares the CPU page tables and therefore performs the same translation from
virtual address (va) to physical address (pa).

::

     __________________________       __________________________
    |                          |     |                          |
    |  User application (CPU)  |     |   Hardware Accelerator   |
    |__________________________|     |__________________________|

                 |                                |
                 | va                             | va
                 V                                V
             __________                       __________
            |          |                     |          |
            |   MMU    |                     |  IOMMU   |
            |__________|                     |__________|
                 |                                |
                 |                                |
                 V pa                             V pa
             _______________________________________
            |                                       |
            |                Memory                 |
            |_______________________________________|


Architecture
------------

Uacce is the kernel module that takes charge of the IOMMU and address sharing.
The user drivers and libraries are called WarpDrive.

The uacce device, built around the IOMMU SVA API, can access multiple
address spaces, including the one without PASID.

A virtual concept, the queue, is used for communication. It provides a
FIFO-like interface and maintains a unified address space between the
application and all involved hardware.

::

                            ___________________                    ________________
                           |                   |    user API      |                |
                           | WarpDrive library |  ------------>   |  user driver   |
                           |___________________|                  |________________|
                                     |                                    |
                                     |                                    |
                                     | queue fd                           |
                                     |                                    |
                                     |                                    |
                                     v                                    |
     ___________________         _________                                |
    |                   |       |         |                               | mmap memory
    | Other framework   |       |  uacce  |                               | r/w interface
    | crypto/nic/others |       |_________|                               |
    |___________________|                                                 |
             |                       |                                    |
             | register              | register                           |
             |                       |                                    |
             |                       |                                    |
             |                _________________       __________          |
             |               |                 |     |          |         |
              -------------  |  Device Driver  |     |  IOMMU   |         |
                             |_________________|     |__________|         |
                                     |                                    |
                                     |                                    V
                                     |                            ___________________
                                     |                           |                   |
                                      -------------------------- |  Device(Hardware) |
                                                                 |___________________|


How does it work
----------------

Uacce relies on mmap and the IOMMU to share addresses between the process
and the device.

Uacce creates a chrdev for every device registered to it. A new queue is
created each time a user application opens the chrdev, and the file
descriptor serves as the user handle of the queue.
The accelerator device presents itself as a Uacce object, which is exported
to user space as a chrdev. The user application communicates with the
hardware through ioctl (the control path) or shared memory (the data path).

The control path to the hardware goes through file operations, while the data
path goes through the mmap space of the queue fd.

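As a rough user-space illustration of the flow above, the sketch below opens
a uacce chrdev (which creates a queue) and starts it through the control
path. The ioctl numbers are reproduced from memory of the uapi header
``misc/uacce/uacce.h`` and should be checked against the installed headers;
the helper names are hypothetical and not part of any library.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed to match <misc/uacce/uacce.h>; verify against your kernel headers. */
#define UACCE_CMD_START_Q _IO('W', 0)
#define UACCE_CMD_PUT_Q   _IO('W', 1)

/*
 * Open the uacce chrdev at `path` (this creates a queue) and start the queue.
 * Returns the queue fd, or a negative errno value on failure.
 */
int uacce_queue_open(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -errno;

    if (ioctl(fd, UACCE_CMD_START_Q) < 0) {
        int err = errno;

        close(fd);
        return -err;
    }
    return fd;
}

/* Release the queue and drop the user handle. */
void uacce_queue_close(int fd)
{
    ioctl(fd, UACCE_CMD_PUT_Q);
    close(fd);
}
```

A caller would pass the device node of its own accelerator under ``/dev`` and
then mmap the queue regions on the returned fd for the data path.
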
The queue file address space:

::

  /**
   * enum uacce_qfrt: qfrt type
   * @UACCE_QFRT_MMIO: device mmio region
   * @UACCE_QFRT_DUS: device user share region
   */
  enum uacce_qfrt {
          UACCE_QFRT_MMIO = 0,
          UACCE_QFRT_DUS = 1,
  };

All regions are optional and vary from one device type to another.
Each region can be mmapped only once; a second attempt returns -EEXIST.

The device mmio region is mapped to the hardware mmio space. It is generally
used for doorbells or other notifications to the hardware. It is not fast
enough to serve as a data channel.

The device user share region is used to share data buffers between the user
process and the device.

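A minimal sketch of mapping one region from user space follows. It assumes
that a region is selected by using its ``enum uacce_qfrt`` value as the page
offset of the mmap call, which is how the in-kernel implementation is
understood to behave but is not stated in this document, so treat it as an
assumption. The helper names are hypothetical, and the region length has to
come from the device (for example from its sysfs attributes).

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

enum uacce_qfrt {
    UACCE_QFRT_MMIO = 0,
    UACCE_QFRT_DUS = 1,
};

/* Assumption: region N of the queue fd starts at page offset N. */
off_t qfrt_offset(enum uacce_qfrt type, long page_size)
{
    return (off_t)type * page_size;
}

/* Map one region of the queue; returns MAP_FAILED (errno set) on error. */
void *qfrt_map(int fd, enum uacce_qfrt type, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                qfrt_offset(type, sysconf(_SC_PAGESIZE)));
}
```

Because each region can be mapped only once per queue, a caller would keep
the returned pointer for the lifetime of the queue fd instead of mapping the
region again.
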

The Uacce register API
----------------------

The register API is defined in uacce.h.

::

  struct uacce_interface {
          char name[UACCE_MAX_NAME_SIZE];
          unsigned int flags;
          const struct uacce_ops *ops;
  };

Depending on the IOMMU capability, the uacce_interface flags can be:

::

  /**
   * UACCE Device flags:
   * UACCE_DEV_SVA: Shared Virtual Addresses
   *                Support PASID
   *                Support device page faults (PCI PRI or SMMU Stall)
   */
  #define UACCE_DEV_SVA               BIT(0)

  struct uacce_device *uacce_alloc(struct device *parent,
                                   struct uacce_interface *interface);
  int uacce_register(struct uacce_device *uacce);
  void uacce_remove(struct uacce_device *uacce);

uacce_alloc results can be:

a. If the uacce module is not compiled, ERR_PTR(-ENODEV)

b. Succeed with the desired flags

c. Succeed with the negotiated flags, for example

   uacce_interface.flags = UACCE_DEV_SVA but UACCE_DEV_SVA is cleared in
   uacce->flags

So the driver calling uacce_alloc needs to check the return value as well as
the negotiated uacce->flags.

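The negotiation in case (c) can be modelled in plain C. The sketch below is a
user-space model for illustration only, not kernel code: ``negotiate_flags``
and ``use_sva`` are hypothetical helpers, and ``UACCE_DEV_SVA`` is spelled as
a shift because ``BIT()`` is a kernel macro.

```c
#include <stdbool.h>

#define UACCE_DEV_SVA (1u << 0) /* BIT(0) in the kernel header */

/*
 * Model of the negotiation: the requested SVA flag survives only when the
 * IOMMU can actually provide SVA; otherwise it is cleared.
 */
unsigned int negotiate_flags(unsigned int requested, bool iommu_has_sva)
{
    if ((requested & UACCE_DEV_SVA) && !iommu_has_sva)
        return requested & ~UACCE_DEV_SVA;
    return requested;
}

/* The check a driver would make on the negotiated flags. */
bool use_sva(unsigned int negotiated)
{
    return (negotiated & UACCE_DEV_SVA) != 0;
}
```

A driver that asked for ``UACCE_DEV_SVA`` but finds the bit cleared afterwards
would have to fall back to a non-SVA mode or refuse to expose the device.
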

The user driver
---------------

The queue file mmap space needs a user driver to wrap the communication
protocol. Uacce provides some attributes in sysfs so that the user driver
can match the right accelerator.
More details are in Documentation/ABI/testing/sysfs-driver-uacce.
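
A minimal sketch of how a user driver might read such an attribute is shown
below. The helper name ``read_attr`` is hypothetical, and the directory and
attribute names passed to it (for example an ``api`` attribute under the
device's sysfs directory) are assumptions to be checked against the ABI
document mentioned above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one sysfs attribute into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the attribute cannot be read.
 */
int read_attr(const char *dev_dir, const char *attr, char *buf, size_t len)
{
    char path[512];
    FILE *f;
    int n;

    if (len == 0)
        return -1;
    n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A user driver could compare the returned string with the interface name it
implements and skip devices that do not match.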