^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. include:: <isonum.txt>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) VFIO Mediated devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) :Copyright: |copy| 2016, NVIDIA CORPORATION. All rights reserved.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) :Author: Neo Jia <cjia@nvidia.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) :Author: Kirti Wankhede <kwankhede@nvidia.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) This program is free software; you can redistribute it and/or modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) it under the terms of the GNU General Public License version 2 as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) published by the Free Software Foundation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) Virtual Function I/O (VFIO) Mediated devices[1]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) ===============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) The number of use cases for virtualizing DMA devices that do not have built-in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) SR-IOV capability is increasing. Previously, to virtualize such devices,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) developers had to create their own management interfaces and APIs, and then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) integrate them with user space software. To simplify integration with user space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) software, we have identified common requirements and a unified management
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) interface for such devices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) The VFIO driver framework provides unified APIs for direct device access. It is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) an IOMMU/device-agnostic framework for exposing direct device access to user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) space in a secure, IOMMU-protected environment. This framework is used for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) multiple devices, such as GPUs, network adapters, and compute accelerators. With
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) direct device access, virtual machines or user space applications have direct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) access to the physical device. This framework is reused for mediated devices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) The mediated core driver provides a common interface for mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) management that can be used by drivers of different devices. This module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) provides a generic interface to perform these operations:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) * Create and destroy a mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) * Add a mediated device to and remove it from a mediated bus driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) * Add a mediated device to and remove it from an IOMMU group
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) The mediated core driver also provides an interface to register a bus driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) For example, the mediated VFIO mdev driver is designed for mediated devices and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) supports VFIO APIs. The mediated bus driver adds a mediated device to and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) removes it from a VFIO group.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) The following high-level block diagram shows the main components and interfaces
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and IBM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) devices as examples because they were the first to use this module::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)      +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)      |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)      | +-----------+ |  mdev_register_driver() +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)      | |           | +<------------------------+              |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)      | |  mdev     | |                         |              |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)      | |  bus      | +------------------------>+ vfio_mdev.ko |<-> VFIO user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)      | |  driver   | |     probe()/remove()    |              |    APIs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)      | |           | |                         +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)      | +-----------+ |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)      |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)      |  MDEV CORE    |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)      |   MODULE      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)      |   mdev.ko     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)      | +-----------+ |  mdev_register_device() +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)      | |           | +<------------------------+              |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65)      | |           | |                         |  nvidia.ko   |<-> physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)      | |           | +------------------------>+              |    device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)      | |           | |        callbacks        +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)      | | Physical  | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)      | |  device   | |  mdev_register_device() +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)      | | interface | |<------------------------+              |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)      | |           | |                         |  i915.ko     |<-> physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)      | |           | +------------------------>+              |    device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)      | |           | |        callbacks        +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)      | |           | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)      | |           | |  mdev_register_device() +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)      | |           | +<------------------------+              |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)      | |           | |                         | ccw_device.ko|<-> physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)      | |           | +------------------------>+              |    device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)      | |           | |        callbacks        +--------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)      | +-----------+ |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)      +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) Registration Interfaces
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) The mediated core driver provides the following types of registration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) interfaces:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) * Registration interface for a mediated bus driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) * Physical device driver interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) Registration Interface for a Mediated Bus Driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) ------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) The registration interface for a mediated bus driver provides the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) structure to represent a mediated device's driver::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)      /*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)       * struct mdev_driver [2] - Mediated device's driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)       * @name: driver name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)       * @probe: called when a new device is created
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)       * @remove: called when a device is removed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)       * @driver: device driver structure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)       */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)      struct mdev_driver {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)              const char *name;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)              int  (*probe)  (struct device *dev);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)              void (*remove) (struct device *dev);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)              struct device_driver    driver;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)      };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) A mediated bus driver for mdev should use this structure in the function calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) to register and unregister itself with the core driver:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) * Register::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) extern int mdev_register_driver(struct mdev_driver *drv,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) struct module *owner);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) * Unregister::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) extern void mdev_unregister_driver(struct mdev_driver *drv);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) The mediated bus driver is responsible for adding mediated devices to the VFIO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) group when devices are bound to the driver and removing mediated devices from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) the VFIO group when devices are unbound from the driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) Physical Device Driver Interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) --------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) The physical device driver interface provides the mdev_parent_ops[3] structure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) to define the APIs to manage work in the mediated core driver that is related
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) to the physical device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) The structures in the mdev_parent_ops structure are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) * dev_attr_groups: attributes of the parent device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) * mdev_attr_groups: attributes of the mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) * supported_config: attributes to define supported configurations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) The functions in the mdev_parent_ops structure are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) * create: allocate basic resources in a driver for a mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) * remove: free resources in a driver when a mediated device is destroyed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) (Note that mdev-core provides no implicit serialization of create/remove
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) callbacks per mdev parent device, per mdev type, or any other categorization.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) Vendor drivers are expected to be fully asynchronous in this respect or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) provide their own internal resource protection.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) The callbacks in the mdev_parent_ops structure are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) * open: open callback of mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) * close: close callback of mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) * ioctl: ioctl callback of mediated device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) * read: read emulation callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) * write: write emulation callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) * mmap: mmap emulation callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) A driver should use the mdev_parent_ops structure in the function call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) register itself with the mdev core driver::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) extern int mdev_register_device(struct device *dev,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) const struct mdev_parent_ops *ops);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) However, the mdev_parent_ops structure is not required in the function call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) that a driver should use to unregister itself from the mdev core driver::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) extern void mdev_unregister_device(struct device *dev);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) Mediated Device Management Interface Through sysfs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) ==================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) The management interface through sysfs enables user space software, such as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) libvirt, to query and configure mediated devices in a hardware-agnostic fashion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) This management interface provides flexibility to the underlying physical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) device's driver to support features such as:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) * Mediated device hot plug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) * Multiple mediated devices in a single virtual machine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) * Multiple mediated devices from different physical devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) Links in the mdev_bus Class Directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) -------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) The /sys/class/mdev_bus/ directory contains links to devices that are registered
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) with the mdev core driver.
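
User space can discover registered parent devices by listing that directory.
The sketch below is an illustrative userspace helper, not part of any
existing tool; ``demo_list_mdev_parents`` is a hypothetical name, and the
directory is passed in so that the helper does not hard-code a path.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print every entry in class_dir (for example "/sys/class/mdev_bus") and
 * return the number of entries, or -1 if the directory cannot be opened.
 */
int demo_list_mdev_parents(const char *class_dir)
{
    DIR *dir;
    struct dirent *ent;
    int count = 0;

    assert(class_dir != NULL);
    dir = opendir(class_dir);
    if (!dir)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        /* Skip the self and parent directory entries. */
        if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, ".."))
            continue;
        printf("%s/%s\n", class_dir, ent->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

A return value of -1 most likely means that no driver has registered with the
mdev core yet, so the class directory does not exist.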
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) Directories and Files Under the sysfs for Each Physical Device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) --------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) |- [parent physical device]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) |--- Vendor-specific-attributes [optional]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) |--- [mdev_supported_types]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) | |--- [<type-id>]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) | | |--- create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) | | |--- name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) | | |--- available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) | | |--- device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) | | |--- description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) | | |--- [devices]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) | |--- [<type-id>]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) | | |--- create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) | | |--- name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) | | |--- available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) | | |--- device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) | | |--- description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) | | |--- [devices]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) | |--- [<type-id>]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) | |--- create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) | |--- name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) | |--- available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) | |--- device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) | |--- description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) | |--- [devices]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) * [mdev_supported_types]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) The list of currently supported mediated device types and their details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) [<type-id>], device_api, and available_instances are mandatory attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) that should be provided by the vendor driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) * [<type-id>]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) The [<type-id>] name is created by adding the device driver string as a prefix
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) to the string provided by the vendor driver. The format of this name is as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) follows::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) sprintf(buf, "%s-%s", dev_driver_string(parent->dev), group->name);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) (or using mdev_parent_dev(mdev) to arrive at the parent device outside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) of the core mdev code)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) * device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) This attribute should show which device API is being created, for example,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) "vfio-pci" for a PCI device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) * available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) This attribute should show the number of devices of type <type-id> that can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) * [devices]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) This directory contains links to the devices of type <type-id> that have been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) * name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) This attribute should show a human-readable name. This is an optional attribute.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) * description
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) This attribute should show a brief description of the type and its features.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) This is an optional attribute.
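
The attributes above can be exercised from user space. The following sketch
composes a type-id with the same ``"%s-%s"`` format and reads an
``available_instances`` value. Both function names are hypothetical, the
driver and group strings are supplied by the caller, and the attribute is
assumed to hold a single decimal integer.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<driver>-<group>" into buf; returns 0 on success, -1 if truncated. */
int demo_type_id(char *buf, size_t len, const char *driver, const char *group)
{
    int n;

    assert(buf && driver && group);
    n = snprintf(buf, len, "%s-%s", driver, group);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the decimal count from an available_instances file; -1 on error. */
int demo_read_available(const char *attr_path)
{
    FILE *f;
    int val = -1;

    assert(attr_path != NULL);
    f = fopen(attr_path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &val) != 1)
        val = -1;
    fclose(f);
    return val;
}
```

With a driver string of ``mtty`` and a group name of ``2``,
``demo_type_id()`` yields ``mtty-2``.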
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) Directories and Files Under the sysfs for Each mdev Device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) ----------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) |- [parent physical device]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) |--- [$MDEV_UUID]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) |--- remove
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) |--- mdev_type {link to its type}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) |--- vendor-specific-attributes [optional]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) * remove (write only)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) Writing '1' to the 'remove' file destroys the mdev device. The vendor driver can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) fail the remove() callback if that device is active and the vendor driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) doesn't support hot unplug.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) Example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) # echo 1 > /sys/bus/mdev/devices/$MDEV_UUID/remove
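
A management program can perform the same write without a shell. This is a
minimal sketch under the assumption that the caller passes the devices
directory (for example ``/sys/bus/mdev/devices``) and the UUID;
``demo_mdev_remove`` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "1" to <devices_dir>/<uuid>/remove.
 * Returns 0 on success, -1 on failure.
 */
int demo_mdev_remove(const char *devices_dir, const char *uuid)
{
    char path[512];
    FILE *f;
    int n, ok;

    assert(devices_dir && uuid);
    n = snprintf(path, sizeof(path), "%s/%s/remove", devices_dir, uuid);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs("1", f) >= 0;
    /* stdio buffers the data, so a rejected write surfaces at fclose(). */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the result of ``fclose()`` matters here because the vendor driver may
fail the removal, in which case the error is reported when the buffered data
is flushed.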
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) Mediated Device Hot Plug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) Mediated devices can be created and assigned at runtime. The procedure to hot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) plug a mediated device is the same as the procedure to hot plug a PCI device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) Translation APIs for Mediated Devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) =====================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) The following APIs are provided for translating user pfn to host pfn in a VFIO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) driver::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) extern int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) int npage, int prot, unsigned long *phys_pfn);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) extern int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) int npage);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) These functions call back into the back-end IOMMU module by using the pin_pages
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) and unpin_pages callbacks of the struct vfio_iommu_driver_ops[4]. Currently
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) these callbacks are supported in the TYPE1 IOMMU module. To enable them for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) other IOMMU backend modules, such as the PPC64 sPAPR module, those modules need
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) to provide these two callback functions.
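
Both functions take an array of user pfns rather than a byte range. The
following userspace sketch shows one way to build such an array from an
address and a length. It assumes 4 KiB pages (``DEMO_PAGE_SHIFT``) and does
not guard against ``addr + len`` overflowing; ``demo_range_to_pfns`` is a
hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/*
 * Fill pfns[] with the page frame numbers covering [addr, addr + len).
 * Returns the number of pages, or -1 if len is 0 or max is too small.
 */
int demo_range_to_pfns(uint64_t addr, uint64_t len,
                       unsigned long *pfns, size_t max)
{
    uint64_t first, last, i;

    assert(pfns != NULL);
    if (len == 0)
        return -1;
    first = addr >> DEMO_PAGE_SHIFT;
    last = (addr + len - 1) >> DEMO_PAGE_SHIFT;
    if (last - first + 1 > max)
        return -1;
    for (i = first; i <= last; i++)
        pfns[i - first] = (unsigned long)i;
    return (int)(last - first + 1);
}
```

The returned count corresponds to the ``npage`` argument, and the filled
array to ``user_pfn``.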
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) Using the Sample Code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) mtty.c in the samples/vfio-mdev/ directory is a sample driver program to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) demonstrate how to use the mediated device framework.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) The sample driver creates an mdev device that simulates a serial port over a PCI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) card.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) 1. Build and load the mtty.ko module.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) Files in this device directory in sysfs are similar to the following::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) # tree /sys/devices/virtual/mtty/mtty/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) /sys/devices/virtual/mtty/mtty/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) |-- mdev_supported_types
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) | |-- mtty-1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) | | |-- available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) | | |-- create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) | | |-- device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) | | |-- devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) | | `-- name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) | `-- mtty-2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) | |-- available_instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) | |-- create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) | |-- device_api
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) | |-- devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) | `-- name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) |-- mtty_dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) | `-- sample_mtty_dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) |-- power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) | |-- autosuspend_delay_ms
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) | |-- control
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) | |-- runtime_active_time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) | |-- runtime_status
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) | `-- runtime_suspended_time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) |-- subsystem -> ../../../../class/mtty
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) `-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 2. Create a mediated device by using the dummy device that you created in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) previous step::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) 3. Add parameters to qemu-kvm::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) -device vfio-pci,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 4. Boot the VM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) In the Linux guest VM, with no hardware on the host, the device appears
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) as follows::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) # lspci -s 00:05.0 -xxvv
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) 00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) Subsystem: Device 4348:3253
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) Physical Slot: 5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) Stepping- SERR- FastB2B- DisINTx-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) <TAbort- <MAbort- >SERR- <PERR- INTx-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) Interrupt: pin A routed to IRQ 10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) Region 0: I/O ports at c150 [size=8]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) Region 1: I/O ports at c158 [size=8]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) Kernel driver in use: serial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) 00: 48 43 53 32 01 00 00 02 10 02 00 07 00 00 00 00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) 10: 51 c1 00 00 59 c1 00 00 00 00 00 00 00 00 00 00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) 20: 00 00 00 00 00 00 00 00 00 00 00 00 48 43 53 32
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) 30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) In the Linux guest VM, dmesg output for the device is as follows::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) serial 0000:00:05.0: PCI INT A -> Link[LNKA] -> GSI 10 (level, high) -> IRQ 10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) 0000:00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386) 0000:00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) 5. In the Linux guest VM, check the serial ports::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) # setserial -g /dev/ttyS*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) /dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) /dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) 6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) /dev/ttyS2 with hardware flow control disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) 7. Type data on the minicom terminal or send data to the terminal emulation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) program and read the data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) Data is looped back from the host's mtty driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) 8. Destroy the mediated device that you created::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) # echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
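
The create step in the procedure above can also be driven from a program. The
sketch below checks that the string has the 8-4-4-4-12 hexadecimal layout used
in the example UUID before writing it to a ``create`` file. The ``demo_*``
names are hypothetical, and the path is supplied by the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if s has the 8-4-4-4-12 hexadecimal UUID layout, else 0. */
int demo_uuid_ok(const char *s)
{
    size_t i;

    assert(s != NULL);
    if (strlen(s) != 36)
        return 0;
    for (i = 0; i < 36; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (s[i] != '-')
                return 0;
        } else if (!isxdigit((unsigned char)s[i])) {
            return 0;
        }
    }
    return 1;
}

/* Write uuid to the "create" file at create_path; 0 on success, -1 on failure. */
int demo_mdev_create(const char *create_path, const char *uuid)
{
    FILE *f;
    int ok;

    assert(create_path != NULL);
    if (!demo_uuid_ok(uuid))
        return -1;
    f = fopen(create_path, "w");
    if (!f)
        return -1;
    ok = fputs(uuid, f) >= 0;
    /* A rejected write is reported when the buffer is flushed. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For the sample driver, ``create_path`` would be
``/sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create``.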
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) References
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) 1. See Documentation/driver-api/vfio.rst for more information on VFIO.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) 2. struct mdev_driver in include/linux/mdev.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) 3. struct mdev_parent_ops in include/linux/mdev.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) 4. struct vfio_iommu_driver_ops in include/linux/vfio.h