^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) LIBNVDIMM: Non-Volatile Devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) libnvdimm - kernel / libndctl - userspace helper library
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) linux-nvdimm@lists.01.org
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) Version 13
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) .. contents:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) Glossary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) Supporting Documents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) Git Trees
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) LIBNVDIMM PMEM and BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) Why BLK?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) PMEM vs BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) BLK-REGIONs, PMEM-REGIONs, Atomic Sectors, and DAX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) Example NVDIMM Platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) LIBNVDIMM Kernel Device Model and LIBNDCTL Userspace API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) LIBNDCTL: Context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) libndctl: instantiate a new library context example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) LIBNVDIMM/LIBNDCTL: Bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) libnvdimm: control class device in /sys/class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) libnvdimm: bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) libndctl: bus enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) LIBNVDIMM/LIBNDCTL: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) libnvdimm: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) libndctl: DIMM enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) LIBNVDIMM/LIBNDCTL: Region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) libnvdimm: region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) libndctl: region enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) Why Not Encode the Region Type into the Region Name?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) How Do I Determine the Major Type of a Region?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) LIBNVDIMM/LIBNDCTL: Namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) libnvdimm: namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) libndctl: namespace enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) libndctl: namespace creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) Why the Term "namespace"?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) LIBNVDIMM/LIBNDCTL: Block Translation Table "btt"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) libnvdimm: btt layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) libndctl: btt creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) Summary LIBNDCTL Diagram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) Glossary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) PMEM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) A system-physical-address range where writes are persistent. A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) block device composed of PMEM is capable of DAX. A PMEM address range
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) may span an interleave of several DIMMs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) BLK:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) A set of one or more programmable memory mapped apertures provided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) by a DIMM to access its media. This indirection precludes the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) performance benefit of interleaving, but enables DIMM-bounded failure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) modes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) DPA:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) DIMM Physical Address: a DIMM-relative offset. With one DIMM in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) the system there would be a 1:1 system-physical-address:DPA association.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) Once more DIMMs are added a memory controller interleave must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) decoded to determine the DPA associated with a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) system-physical-address. BLK capacity always has a 1:1 relationship
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) with a single-DIMM's DPA range.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) DAX:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) File system extensions to bypass the page cache and block layer to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) mmap persistent memory, from a PMEM block device, directly into a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) process address space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) DSM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) Device Specific Method: an ACPI method used to control a specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) device, in this case the firmware.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) DCR:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) NVDIMM Control Region Structure defined in ACPI 6 Section 5.2.25.5.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) It defines a vendor-id, device-id, and interface format for a given DIMM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) BTT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) Block Translation Table: Persistent memory is byte addressable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) Existing software may have an expectation that the power-fail-atomicity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) of writes is at least one sector, 512 bytes. The BTT is an indirection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) table with atomic update semantics to front a PMEM/BLK block device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) driver and present arbitrary atomic sector sizes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) LABEL:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) Metadata stored on a DIMM device that partitions and identifies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) (persistently names) storage between PMEM and BLK. It also partitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) BLK storage to host BTTs with different parameters per BLK-partition.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) Note that traditional partition tables, GPT/MBR, are layered on top of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) BLK or PMEM device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)
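As a rough illustration of the DPA entry above, the following sketch decodes
an offset into an interleaved PMEM range to a DIMM index and a DPA. It
assumes a simple round-robin interleave. The granule size, the helper name
``spa_to_dpa`` and ``struct dpa_result`` are hypothetical and are not part of
LIBNVDIMM; real memory-controller interleave decode is platform specific.

```c
#include <assert.h>
#include <stdint.h>

/* Result of decoding an offset into a system-physical-address (SPA) range. */
struct dpa_result {
	unsigned int dimm;	/* index of the DIMM within the interleave set */
	uint64_t dpa;		/* DIMM-relative offset (DPA) */
};

/*
 * Hypothetical round-robin interleave: consecutive granule-sized lines of
 * the range are striped across `ways` DIMMs. `spa_offset` is relative to the
 * start of the range and `dpa_base` is where the range begins on each DIMM.
 */
static struct dpa_result spa_to_dpa(uint64_t spa_offset, unsigned int ways,
				    uint64_t granule, uint64_t dpa_base)
{
	struct dpa_result res;
	uint64_t line;

	assert(ways > 0 && granule > 0);

	line = spa_offset / granule;		/* stripe line within the range */
	res.dimm = (unsigned int)(line % ways);	/* DIMM that owns this line */
	res.dpa = dpa_base + (line / ways) * granule + spa_offset % granule;
	return res;
}
```

With ``ways`` set to 1 the function degenerates to the 1:1
system-physical-address:DPA association described for a single-DIMM system.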
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) The LIBNVDIMM subsystem provides support for three types of NVDIMMs, namely,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) PMEM, BLK, and NVDIMM devices that can simultaneously support both PMEM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) and BLK mode access. These three modes of operation are described by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) the "NVDIMM Firmware Interface Table" (NFIT) in ACPI 6. While the LIBNVDIMM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) implementation is generic and supports pre-NFIT platforms, it was guided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) by the superset of capabilities needed to support this ACPI 6 definition
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) for NVDIMM resources. The bulk of the kernel implementation is in place
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) to handle the case where DPA accessible via PMEM is aliased with DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) accessible via BLK. When that occurs a LABEL is needed to reserve DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) for exclusive access via one mode at a time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) Supporting Documents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) ACPI 6:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) https://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) NVDIMM Namespace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) https://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) DSM Interface Example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) https://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) Driver Writer's Guide:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) https://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) Git Trees
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) LIBNVDIMM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) LIBNDCTL:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) https://github.com/pmem/ndctl.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) PMEM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) https://github.com/01org/prd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) LIBNVDIMM PMEM and BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) ======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) Prior to the arrival of the NFIT, non-volatile memory was described to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) system in various ad-hoc ways. Usually only the bare minimum was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) provided, namely, a single system-physical-address range where writes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) are expected to be durable after a system power loss. Now, the NFIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) specification standardizes not only the description of PMEM, but also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) BLK and platform message-passing entry points for control and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) configuration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) For each NVDIMM access method (PMEM, BLK), LIBNVDIMM provides a block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) device driver:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 1. PMEM (nd_pmem.ko): Drives a system-physical-address range. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) range is contiguous in system memory and may be interleaved (hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) memory controller striped) across multiple DIMMs. When interleaved the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) platform may optionally provide details of which DIMMs are participating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) in the interleave.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) Note that while LIBNVDIMM describes system-physical-address ranges that may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) alias with BLK access as ND_NAMESPACE_PMEM ranges and those without
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) alias as ND_NAMESPACE_IO ranges, to the nd_pmem driver there is no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) distinction. The different device-types are an implementation detail
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) that userspace can exploit to implement policies like "only interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) with address ranges from certain DIMMs". It is worth noting that when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) aliasing is present and a DIMM lacks a label, then no block device can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) be created by default as userspace needs to do at least one allocation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) of DPA to the PMEM range. In contrast ND_NAMESPACE_IO ranges, once
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) registered, can be immediately attached to nd_pmem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) 2. BLK (nd_blk.ko): This driver performs I/O using a set of platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) defined apertures. A set of apertures will access just one DIMM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) Multiple windows (apertures) allow multiple concurrent accesses, much like
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) tagged-command-queuing, and would likely be used by different threads or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) different CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) The NFIT specification defines a standard format for a BLK-aperture, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) the spec also allows for vendor specific layouts, and non-NFIT BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) implementations may have other designs for BLK I/O. For this reason
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) "nd_blk" calls back into platform-specific code to perform the I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) One such implementation is defined in the "Driver Writer's Guide" and "DSM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) Interface Example".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) Why BLK?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) While PMEM provides direct byte-addressable CPU-load/store access to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) NVDIMM storage, it does not provide the best system RAS (reliability,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) availability, and serviceability) model. An access to a corrupted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) system-physical-address causes a CPU exception while an access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) to a corrupted address through a BLK-aperture causes that block window
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) to raise an error status in a register. The latter is more aligned with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) the standard error model that host-bus-adapter attached disks present.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) Also, if an administrator ever wants to replace memory, it is easier to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) service a system at DIMM module boundaries. Compare this to PMEM where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) data could be interleaved in an opaque hardware specific manner across
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) several DIMMs.
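
The difference in error reporting can be pictured with a toy model of a
single BLK window. This is only a sketch under stated assumptions:
``struct toy_dimm``, its "registers" and ``toy_blk_read`` are hypothetical
names, the media is a plain array, and nothing here reflects the NFIT
register layout or the nd_blk driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEDIA_SIZE 4096u
#define APERTURE_SIZE 256u

/* Toy DIMM whose media is reachable only through one sliding window. */
struct toy_dimm {
	uint8_t media[MEDIA_SIZE];
	uint64_t bad_dpa;	/* a DPA marked corrupted; MEDIA_SIZE means none */
	uint64_t ctl_dpa;	/* "control register": DPA the window points at */
	uint32_t status;	/* "status register": nonzero after an error */
};

/* Read `len` bytes at `dpa` by programming the window, then copying out. */
static int toy_blk_read(struct toy_dimm *d, uint64_t dpa, void *buf, size_t len)
{
	assert(len <= APERTURE_SIZE);
	assert(dpa <= MEDIA_SIZE - len);

	d->ctl_dpa = dpa;	/* step 1: point the aperture at the DPA */
	d->status = 0;
	if (d->bad_dpa >= dpa && d->bad_dpa < dpa + len) {
		d->status = 1;	/* the error is latched in a register ... */
		return -1;	/* ... and returned like any other I/O error */
	}
	memcpy(buf, &d->media[d->ctl_dpa], len);	/* step 2: copy via window */
	return 0;
}
```

The caller sees a failed read and a status value, much as it would with a
disk behind a host-bus-adapter, instead of a CPU exception raised by a direct
load from the corrupted address.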
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) PMEM vs BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) BLK-apertures solve these RAS problems, but their presence is also the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) major contributing factor to the complexity of the ND subsystem. They
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) complicate the implementation because PMEM and BLK alias in DPA space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) Any given DIMM's DPA-range may contribute to one or more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) system-physical-address sets of interleaved DIMMs, *and* may also be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) accessed in its entirety through its BLK-aperture. Accessing a DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) through a system-physical-address while simultaneously accessing the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) same DPA through a BLK-aperture has undefined results. For this reason,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) DIMMs with this dual interface configuration include a DSM function to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) store/retrieve a LABEL. The LABEL effectively partitions the DPA-space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) into exclusive system-physical-address and BLK-aperture accessible
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) regions. For simplicity a DIMM is allowed a PMEM "region" per each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) interleave set in which it is a member. The remaining DPA space can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) carved into an arbitrary number of BLK devices with discontiguous
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) extents.
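
The exclusivity rule that a LABEL enforces can be sketched as a simple
overlap check over the DPA extents of one DIMM. The types and function names
below (``struct dpa_extent``, ``dpa_layout_is_exclusive``) are hypothetical
and chosen for illustration; they do not correspond to the on-media label
format or to kernel data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum access_mode { MODE_PMEM, MODE_BLK };	/* illustrative, not kernel enums */

/* One contiguous DPA extent reserved for a single access mode. */
struct dpa_extent {
	uint64_t start;
	uint64_t len;
	enum access_mode mode;
};

/* True when the half-open ranges [start, start + len) intersect. */
static bool extents_overlap(const struct dpa_extent *a,
			    const struct dpa_extent *b)
{
	return a->start < b->start + b->len && b->start < a->start + a->len;
}

/* Check that every byte of DPA on a DIMM belongs to at most one extent. */
static bool dpa_layout_is_exclusive(const struct dpa_extent *ext, size_t n)
{
	size_t i, j;

	assert(ext != NULL || n == 0);

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (extents_overlap(&ext[i], &ext[j]))
				return false;
	return true;
}
```

A layout with one PMEM extent followed by several discontiguous BLK extents
passes the check, while a BLK extent that reaches into a PMEM extent is
rejected because the same DPA would be reachable through both interfaces.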
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) BLK-REGIONs, PMEM-REGIONs, Atomic Sectors, and DAX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) One of the few
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) reasons to allow multiple BLK namespaces per REGION is so that each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) BLK-namespace can be configured with a BTT with unique atomic sector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) sizes. While a PMEM device can host a BTT, the LABEL specification does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) not provide for a sector size to be specified for a PMEM namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) This is due to the expectation that the primary usage model for PMEM is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) via DAX, and the BTT is incompatible with DAX. However, for the cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) where an application or filesystem still needs atomic sector update
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) guarantees it can register a BTT on a PMEM device or partition. See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) LIBNVDIMM/LIBNDCTL: Block Translation Table "btt".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)
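The atomic sector update guarantee mentioned above can be pictured with a
heavily simplified, in-memory indirection table. This is a sketch of the
general idea only (write out of place, then publish with one map update); the
names ``struct toy_btt``, ``toy_btt_write`` and ``toy_btt_read`` are
hypothetical, and the real BTT on-media layout is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define NR_LBAS 4u	/* logical sectors exposed to the user */

/* Toy model: one spare physical block beyond the logical ones. */
struct toy_btt {
	uint8_t media[NR_LBAS + 1][SECTOR_SIZE];
	uint32_t map[NR_LBAS];	/* logical sector -> physical block */
	uint32_t free_block;	/* physical block not referenced by map[] */
};

static void toy_btt_init(struct toy_btt *btt)
{
	uint32_t i;

	memset(btt->media, 0, sizeof(btt->media));
	for (i = 0; i < NR_LBAS; i++)
		btt->map[i] = i;
	btt->free_block = NR_LBAS;
}

/*
 * Write a whole sector into the spare block, then publish it with a single
 * map update. Until map[lba] changes, readers still see the old sector.
 */
static void toy_btt_write(struct toy_btt *btt, uint32_t lba, const uint8_t *buf)
{
	uint32_t old;

	assert(lba < NR_LBAS);
	old = btt->map[lba];
	memcpy(btt->media[btt->free_block], buf, SECTOR_SIZE);
	btt->map[lba] = btt->free_block;	/* the single "commit" step */
	btt->free_block = old;			/* old block becomes the spare */
}

static const uint8_t *toy_btt_read(const struct toy_btt *btt, uint32_t lba)
{
	assert(lba < NR_LBAS);
	return btt->media[btt->map[lba]];
}
```

Because the data is never overwritten in place, an interruption before the
map update leaves the old sector intact, and an interruption after it leaves
the new sector intact; a torn mix of the two is never visible through the map.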
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) Example NVDIMM Platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) For the remainder of this document the following diagram will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) referenced for any example sysfs layouts::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239)                                  (a)               (b)          DIMM   BLK-REGION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)               +-------------------+--------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)     +------+  |       pm0.0       | blk2.0 | pm1.0  | blk2.1 |    0      region2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)     | imc0 +--+- - - region0- - - +--------+        +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)     +--+---+  |       pm0.0       | blk3.0 | pm1.0  | blk3.1 |    1      region3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)        |      +-------------------+--------v        v--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)     +--+---+                               |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246)     | cpu0 |                                     region1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247)     +--+---+                               |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248)        |      +----------------------------^        ^--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)     +--+---+  |           blk4.0           | pm1.0  | blk4.0 |    2      region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)     | imc1 +--+----------------------------|        +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251)     +------+  |           blk5.0           | pm1.0  | blk5.0 |    3      region5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252)               +----------------------------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) In this platform we have four DIMMs and two memory controllers in one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) socket. Each unique interface (BLK or PMEM) to DPA space is identified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) by a region device with a dynamically assigned id (REGION0 - REGION5).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) 1. The first portion of DIMM0 and DIMM1 is interleaved as REGION0. A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) single PMEM namespace is created in the REGION0-SPA-range that spans most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) of DIMM0 and DIMM1 with a user-specified name of "pm0.0". Some of that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) interleaved system-physical-address range is reclaimed as BLK-aperture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) accessed space starting at DPA-offset (a) into each DIMM. In that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) reclaimed space we create two BLK-aperture "namespaces" from REGION2 and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) REGION3 where "blk2.0" and "blk3.0" are just human readable names that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) could be set to any user-desired name in the LABEL.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) 2. In the last portion of DIMM0 and DIMM1 we have an interleaved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) system-physical-address range, REGION1, that spans those two DIMMs as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) well as DIMM2 and DIMM3. Some of REGION1 is allocated to a PMEM namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) named "pm1.0", the rest is reclaimed in 4 BLK-aperture namespaces (for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) each DIMM in the interleave set), "blk2.1", "blk3.1", "blk4.0", and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) "blk5.0".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 3. The portions of DIMM2 and DIMM3 that do not participate in the REGION1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) interleaved system-physical-address range (i.e. the DPA addresses past
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) offset (b)) are also included in the "blk4.0" and "blk5.0" namespaces.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) Note that this example shows that BLK-aperture namespaces don't need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) be contiguous in DPA-space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) This bus is provided by the kernel under the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) /sys/devices/platform/nfit_test.0 when the nfit_test.ko module from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) tools/testing/nvdimm is loaded. This tests not only LIBNVDIMM but the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) acpi_nfit.ko driver as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) LIBNVDIMM Kernel Device Model and LIBNDCTL Userspace API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) What follows is a description of the LIBNVDIMM sysfs layout and a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) corresponding object hierarchy diagram as viewed through the LIBNDCTL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) API. The example sysfs paths and diagrams are relative to the Example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) NVDIMM Platform which is also the LIBNVDIMM bus used in the LIBNDCTL unit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) test.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) LIBNDCTL: Context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) Every API call in the LIBNDCTL library requires a context that holds the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) logging parameters and other library instance state. The library is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) based on the libabc template:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) https://git.kernel.org/cgit/linux/kernel/git/kay/libabc.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) LIBNDCTL: instantiate a new library context example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309)     struct ndctl_ctx *ctx;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311)     if (ndctl_new(&ctx) == 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312)             return ctx;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)     else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314)             return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) LIBNVDIMM/LIBNDCTL: Bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) A bus has a 1:1 relationship with an NFIT. The current expectation for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) ACPI based systems is that there is only ever one platform-global NFIT.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) That said, it is trivial to register multiple NFITs; the specification
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) does not preclude it. The infrastructure supports multiple busses and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) we use this capability to test multiple NFIT configurations in the unit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) test.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) LIBNVDIMM: control class device in /sys/class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) ---------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) This character device accepts DSM messages to be passed to a DIMM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) identified by its NFIT handle::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) /sys/class/nd/ndctl0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) |-- dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) |-- device -> ../../../ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) |-- subsystem -> ../../../../../../../class/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) LIBNVDIMM: bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) struct nvdimm_bus_descriptor *nfit_desc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) /sys/devices/platform/nfit_test.0/ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) |-- commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) |-- nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) |-- nfit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) |-- nmem0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) |-- nmem1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) |-- nmem2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) |-- nmem3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) |-- power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) |-- provider
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) |-- region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) |-- region1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) |-- region2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) |-- region3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) |-- region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) |-- region5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) |-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) `-- wait_probe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) LIBNDCTL: bus enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) Find the bus handle that describes the bus from Example NVDIMM Platform::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373)     static struct ndctl_bus *get_bus_by_provider(struct ndctl_ctx *ctx,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)                     const char *provider)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375)     {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)             struct ndctl_bus *bus;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)             ndctl_bus_foreach(ctx, bus)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379)                     if (strcmp(provider, ndctl_bus_get_provider(bus)) == 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380)                             return bus;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382)             return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)     }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385)     bus = get_bus_by_provider(ctx, "nfit_test.0");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) LIBNVDIMM/LIBNDCTL: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) -------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) The DIMM device provides a character device for sending commands to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) hardware, and it is a container for LABELs. If the DIMM is defined by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) NFIT then an optional 'nfit' attribute sub-directory is available to add
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) NFIT-specifics.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) Note that the kernel device name for "DIMMs" is "nmemX". The NFIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) describes these devices via "Memory Device to System Physical Address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) Range Mapping Structure", and there is no requirement that they actually
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) be physical DIMMs, so we use a more generic name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) LIBNVDIMM: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) ^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) struct nvdimm *nvdimm_create(struct nvdimm_bus *nvdimm_bus, void *provider_data,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) const struct attribute_group **groups, unsigned long flags,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) unsigned long *dsm_mask);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) /sys/devices/platform/nfit_test.0/ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) |-- nmem0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) | |-- available_slots
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) | |-- commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) | |-- dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) | |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) | |-- driver -> ../../../../../bus/nd/drivers/nvdimm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) | |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) | |-- nfit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) | | |-- device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) | | |-- format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) | | |-- handle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) | | |-- phys_id
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) | | |-- rev_id
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) | | |-- serial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) | | `-- vendor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) | |-- state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) | |-- subsystem -> ../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) | `-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) |-- nmem1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) [..]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435) LIBNDCTL: DIMM enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) Note, in this example we are assuming NFIT-defined DIMMs, which are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) identified by an "nfit_handle", a 32-bit value where:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) - Bit 3:0 DIMM number within the memory channel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) - Bit 7:4 memory channel number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) - Bit 11:8 memory controller ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) - Bit 15:12 socket ID (within scope of a Node controller if node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) controller is present)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) - Bit 27:16 Node Controller ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) - Bit 31:28 Reserved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) static struct ndctl_dimm *get_dimm_by_handle(struct ndctl_bus *bus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452) unsigned int handle)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454) struct ndctl_dimm *dimm;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) ndctl_dimm_foreach(bus, dimm)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) if (ndctl_dimm_get_handle(dimm) == handle)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) return dimm;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463) #define DIMM_HANDLE(n, s, i, c, d) \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) ((((n) & 0xfff) << 16) | (((s) & 0xf) << 12) | (((i) & 0xf) << 8) \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465) | (((c) & 0xf) << 4) | ((d) & 0xf))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467) dimm = get_dimm_by_handle(bus, DIMM_HANDLE(0, 0, 0, 0, 0));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468)
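The bit layout listed above can also be read in the other direction. The
following is a minimal sketch, not part of libndctl, that unpacks each field
from a handle. The nfit_handle_*() helper names are hypothetical and exist
only for this illustration:

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not a libndctl API): extract the individual
 * fields of a 32-bit nfit_handle according to the bit layout above.
 */
static inline unsigned int nfit_handle_dimm(uint32_t handle)
{
    return handle & 0xf;            /* bits 3:0 */
}

static inline unsigned int nfit_handle_channel(uint32_t handle)
{
    return (handle >> 4) & 0xf;     /* bits 7:4 */
}

static inline unsigned int nfit_handle_controller(uint32_t handle)
{
    return (handle >> 8) & 0xf;     /* bits 11:8 */
}

static inline unsigned int nfit_handle_socket(uint32_t handle)
{
    return (handle >> 12) & 0xf;    /* bits 15:12 */
}

static inline unsigned int nfit_handle_node(uint32_t handle)
{
    return (handle >> 16) & 0xfff;  /* bits 27:16 */
}
```

For instance, a handle of 0x01234567 would decode to node controller 0x123,
socket 4, memory controller 5, memory channel 6, and DIMM 7.
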
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) LIBNVDIMM/LIBNDCTL: Region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) A generic REGION device is registered for each PMEM range or BLK-aperture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) set. Per the example there are 6 regions: 2 PMEM and 4 BLK-aperture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474) sets on the "nfit_test.0" bus. The primary role of a region is to be a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475) container of "mappings". A mapping is a tuple of <DIMM,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) DPA-start-offset, length>.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) LIBNVDIMM provides a built-in driver for these REGION devices. This driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) is responsible for reconciling the aliased DPA mappings across all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) regions, parsing the LABEL, if present, and then emitting NAMESPACE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481) devices with the resolved/exclusive DPA-boundaries for the nd_pmem or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) nd_blk device driver to consume.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) In addition to the generic attributes of "mapping"s, "interleave_ways"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485) and "size", the REGION device also exports some convenience attributes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) "nstype" indicates the integer type of namespace-device this region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) emits, "devtype" duplicates the DEVTYPE variable stored by udev at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) 'add' event, "modalias" duplicates the MODALIAS variable stored by udev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) at the 'add' event, and finally, the optional "spa_index" is provided in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490) the case where the region is defined by a SPA.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) LIBNVDIMM: region::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494) struct nd_region *nvdimm_pmem_region_create(struct nvdimm_bus *nvdimm_bus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) struct nd_region_desc *ndr_desc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) struct nd_region_desc *ndr_desc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) /sys/devices/platform/nfit_test.0/ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502) |-- region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) | |-- available_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504) | |-- btt0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) | |-- btt_seed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) | |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507) | |-- driver -> ../../../../../bus/nd/drivers/nd_region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) | |-- init_namespaces
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509) | |-- mapping0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) | |-- mapping1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) | |-- mappings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512) | |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) | |-- namespace0.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) | |-- namespace_seed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) | |-- numa_node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516) | |-- nfit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) | | `-- spa_index
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518) | |-- nstype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) | |-- set_cookie
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520) | |-- size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) | |-- subsystem -> ../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522) | `-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523) |-- region1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524) [..]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526) LIBNDCTL: region enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529) Sample region retrieval routines based on NFIT-unique data like
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) "spa_index" (interleave set id) for PMEM and "nfit_handle" (dimm id) for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531) BLK::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533) static struct ndctl_region *get_pmem_region_by_spa_index(struct ndctl_bus *bus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) unsigned int spa_index)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) struct ndctl_region *region;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538) ndctl_region_foreach(bus, region) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539) if (ndctl_region_get_type(region) != ND_DEVICE_REGION_PMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) continue;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541) if (ndctl_region_get_spa_index(region) == spa_index)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) return region;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547) static struct ndctl_region *get_blk_region_by_dimm_handle(struct ndctl_bus *bus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548) unsigned int handle)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550) struct ndctl_region *region;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552) ndctl_region_foreach(bus, region) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553) struct ndctl_mapping *map;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555) if (ndctl_region_get_type(region) != ND_DEVICE_REGION_BLOCK)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556) continue;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557) ndctl_mapping_foreach(region, map) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) struct ndctl_dimm *dimm = ndctl_mapping_get_dimm(map);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) if (ndctl_dimm_get_handle(dimm) == handle)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561) return region;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 565) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 566)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 567)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 568) Why Not Encode the Region Type into the Region Name?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 569) ----------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 570)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 571) At first glance it seems that, since NFIT defines just PMEM and BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 572) interface types, we should simply name REGION devices with something derived
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 573) from those type names. However, the ND subsystem explicitly keeps the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 574) REGION name generic and expects userspace to always consider the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 575) region-attributes for four reasons:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 576)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 577) 1. There are already more than two REGION and "namespace" types. For
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 578) PMEM there are two subtypes. As mentioned previously we have PMEM where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 579) the constituent DIMM devices are known and anonymous PMEM. For BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 580) regions the NFIT specification already anticipates vendor specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 581) implementations. The exact distinction of what a region contains is in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 582) the region-attributes, not the region-name or the region-devtype.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 583)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 584) 2. A region with zero child-namespaces is a possible configuration. For
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 585) example, the NFIT allows for a DCR to be published without a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 586) corresponding BLK-aperture. This equates to a DIMM that can only accept
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 587) control/configuration messages, but no i/o through a descendant block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 588) device. Again, this "type" is advertised in the attributes ('mappings'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 589) == 0) and the name does not tell you much.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 590)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 591) 3. What if a third major interface type arises in the future? Outside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 592) of vendor specific implementations, it's not difficult to envision a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 593) third class of interface type beyond BLK and PMEM. With a generic name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 594) for the REGION level of the device-hierarchy old userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 595) implementations can still make sense of new kernel advertised
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 596) region-types. Userspace can always rely on the generic region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 597) attributes like "mappings", "size", etc. and the expected child devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 598) named "namespace". This generic format of the device-model hierarchy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 599) allows the LIBNVDIMM and LIBNDCTL implementations to be more uniform and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 600) future-proof.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 601)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 602) 4. There are more robust mechanisms for determining the major type of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 603) region than a device name. See the next section, How Do I Determine the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 604) Major Type of a Region?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 605)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 606) How Do I Determine the Major Type of a Region?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 607) ----------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 608)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 609) Outside of the blanket recommendation of "use libndctl", or simply
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 610) looking at the kernel header (/usr/include/linux/ndctl.h) to decode the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 611) "nstype" integer attribute, here are some other options.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 612)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 613) 1. module alias lookup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 614) ^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 615)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 616) The whole point of region/namespace device type differentiation is to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 617) decide which block-device driver will attach to a given LIBNVDIMM namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 618) One can simply use the modalias to look up the resulting module. It's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 619) important to note that this method is robust in the presence of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 620) vendor-specific driver down the road. If a vendor-specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 621) implementation wants to supplant the standard nd_blk driver it can with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 622) minimal impact to the rest of LIBNVDIMM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 623)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 624) In fact, a vendor may also want to have a vendor-specific region-driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 625) (outside of nd_region). For example, if a vendor defined its own LABEL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 626) format it would need its own region driver to parse that LABEL and emit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 627) the resulting namespaces. The output from module resolution is more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 628) accurate than a region-name or region-devtype.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 629)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 630) 2. udev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 631) ^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 632)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 633) The kernel "devtype" is registered in the udev database::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 634)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 635) # udevadm info --path=/devices/platform/nfit_test.0/ndbus0/region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 636) P: /devices/platform/nfit_test.0/ndbus0/region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 637) E: DEVPATH=/devices/platform/nfit_test.0/ndbus0/region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 638) E: DEVTYPE=nd_pmem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 639) E: MODALIAS=nd:t2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 640) E: SUBSYSTEM=nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 641)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 642) # udevadm info --path=/devices/platform/nfit_test.0/ndbus0/region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 643) P: /devices/platform/nfit_test.0/ndbus0/region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 644) E: DEVPATH=/devices/platform/nfit_test.0/ndbus0/region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 645) E: DEVTYPE=nd_blk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 646) E: MODALIAS=nd:t3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 647) E: SUBSYSTEM=nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 648)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 649) ...and is available as a region attribute, but keep in mind that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 650) "devtype" does not indicate sub-type variations, so scripts should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 651) examine the other attributes as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 652)
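The MODALIAS values in the udev output above have the form "nd:tN" ("nd:t2"
for the nd_pmem region, "nd:t3" for the nd_blk region). As a rough sketch,
assuming only that form, the integer can be pulled out of the string as shown
here. The helper name nd_modalias_to_type() is hypothetical and is not
provided by libndctl:

```c
#include <stdio.h>

/*
 * Hypothetical helper: parse a modalias of the form "nd:tN" and return N,
 * or -1 if the string does not match. A single trailing newline is
 * tolerated because sysfs attribute reads typically end with one.
 */
static int nd_modalias_to_type(const char *modalias)
{
    int type;
    char tail;
    int n = sscanf(modalias, "nd:t%d%c", &type, &tail);

    if (n == 1 || (n == 2 && tail == '\n'))
        return type < 0 ? -1 : type;
    return -1;
}
```

With the sample output above, "nd:t2" yields 2 and "nd:t3" yields 3; anything
that is not an "nd:t" alias yields -1.
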
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 653) 3. type specific attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 654) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 655)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 656) As it currently stands a BLK-aperture region will never have a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 657) "nfit/spa_index" attribute, but neither will a non-NFIT PMEM region. A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 658) BLK region with a "mappings" value of 0 is, as mentioned above, a DIMM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 659) that does not allow I/O. A PMEM region with a "mappings" value of zero
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 660) is a simple system-physical-address range.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 661)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 662)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 663) LIBNVDIMM/LIBNDCTL: Namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 664) -----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 665)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 666) A REGION, after resolving DPA aliasing and LABEL specified boundaries,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 667) surfaces one or more "namespace" devices. The arrival of a "namespace"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 668) device currently triggers either the nd_blk or nd_pmem driver to load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 669) and register a disk/block device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 670)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 671) LIBNVDIMM: namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 672) ^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 673)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 674) Here is a sample layout from the three major types of NAMESPACE where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 675) namespace0.0 represents DIMM-info-backed PMEM (note that it has a 'uuid'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 676) attribute), namespace2.0 represents a BLK namespace (note that it has a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 677) 'sector_size' attribute), and namespace6.0 represents an anonymous
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 678) PMEM namespace (note that it has no 'uuid' attribute because it does not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 679) support a LABEL)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 680)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 681) /sys/devices/platform/nfit_test.0/ndbus0/region0/namespace0.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 682) |-- alt_name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 683) |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 684) |-- dpa_extents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 685) |-- force_raw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 686) |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 687) |-- numa_node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 688) |-- resource
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 689) |-- size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 690) |-- subsystem -> ../../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 691) |-- type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 692) |-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 693) `-- uuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 694) /sys/devices/platform/nfit_test.0/ndbus0/region2/namespace2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 695) |-- alt_name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 696) |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 697) |-- dpa_extents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 698) |-- force_raw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 699) |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 700) |-- numa_node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 701) |-- sector_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 702) |-- size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 703) |-- subsystem -> ../../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 704) |-- type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 705) |-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 706) `-- uuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 707) /sys/devices/platform/nfit_test.1/ndbus1/region6/namespace6.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 708) |-- block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 709) | `-- pmem0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 710) |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 711) |-- driver -> ../../../../../../bus/nd/drivers/pmem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 712) |-- force_raw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 713) |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 714) |-- numa_node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 715) |-- resource
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 716) |-- size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 717) |-- subsystem -> ../../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 718) |-- type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 719) `-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 720)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 721) LIBNDCTL: namespace enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 722) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 723) Namespaces are indexed relative to their parent region, as in the example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 724) below. These indexes are mostly static from boot to boot, but the subsystem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 725) makes no guarantees in this regard. For a static namespace identifier use its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 726) 'uuid' attribute.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 727)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 728) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 729)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 730) static struct ndctl_namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 731) *get_namespace_by_id(struct ndctl_region *region, unsigned int id)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 732) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 733) struct ndctl_namespace *ndns;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 734)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 735) ndctl_namespace_foreach(region, ndns)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 736) if (ndctl_namespace_get_id(ndns) == id)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 737) return ndns;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 738)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 739) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 740) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 741)
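The sysfs listings above show namespace devices named "namespaceX.Y" beneath
"regionX". As a small sketch, assuming that naming pattern holds, a device
name can be split into its region id and region-relative namespace id. The
helper name parse_namespace_name() is hypothetical and not part of libndctl:

```c
#include <stdio.h>

/*
 * Hypothetical helper: split "namespace<region>.<id>" into its two
 * indexes. Returns 0 on success, -1 if the name does not match.
 * The output parameters are only written on success.
 */
static int parse_namespace_name(const char *name, unsigned int *region_id,
                                unsigned int *ns_id)
{
    unsigned int region, id;
    char tail;

    /* exactly two conversions means no trailing characters followed */
    if (sscanf(name, "namespace%u.%u%c", &region, &id, &tail) != 2)
        return -1;

    *region_id = region;
    *ns_id = id;
    return 0;
}
```

Because these indexes are not guaranteed to be stable across boots, a result
such as (6, 0) for "namespace6.0" is best treated as a transient lookup key,
with 'uuid' remaining the persistent identifier.
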
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 742) LIBNDCTL: namespace creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 743) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 744)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 745) Idle namespaces are automatically created by the kernel if a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 746) region has enough available capacity to create a new namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 747) Namespace instantiation involves finding an idle namespace and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 748) configuring it. For the most part the setting of namespace attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 749) can occur in any order; the only constraint is that 'uuid' must be set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 750) before 'size'. This enables the kernel to track DPA allocations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 751) internally with a static identifier::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 752)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 753) static int configure_namespace(struct ndctl_region *region,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 754) struct ndctl_namespace *ndns,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 755) struct namespace_parameters *parameters)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 756) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 757) char devname[50];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 758)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 759) snprintf(devname, sizeof(devname), "namespace%d.%d",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 760) ndctl_region_get_id(region), parameters->id);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 761)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 762) ndctl_namespace_set_alt_name(ndns, devname);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 763) /* 'uuid' must be set prior to setting size! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 764) ndctl_namespace_set_uuid(ndns, parameters->uuid);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 765) ndctl_namespace_set_size(ndns, parameters->size);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 766) /* unlike pmem namespaces, blk namespaces have a sector size */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 767) if (parameters->lbasize)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 768) ndctl_namespace_set_sector_size(ndns, parameters->lbasize);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 769) return ndctl_namespace_enable(ndns);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 770) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 771)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 772)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 773) Why the Term "namespace"?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 774) ^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 775)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 776) 1. Why not "volume" for instance? "volume" ran the risk of confusing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 777) ND (libnvdimm subsystem) with a volume manager like device-mapper.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 778)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 779) 2. The term originated to describe the sub-devices that can be created
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 780) within an NVME controller (see the nvme specification:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 781) https://www.nvmexpress.org/specifications/), and NFIT namespaces are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 782) meant to parallel the capabilities and configurability of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 783) NVME-namespaces.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 784)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 785)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 786) LIBNVDIMM/LIBNDCTL: Block Translation Table "btt"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 787) -------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 788)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 789) A BTT (design document: https://pmem.io/2014/09/23/btt.html) is a stacked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 790) block device driver that fronts either the whole block device or a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 791) partition of a block device emitted by either a PMEM or BLK NAMESPACE.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 792)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 793) LIBNVDIMM: btt layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 794) ^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 795)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 796) Every region will start out with at least one BTT device which is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 797) seed device. To activate it set the "namespace", "uuid", and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 798) "sector_size" attributes and then bind the device to the nd_pmem or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 799) nd_blk driver depending on the region type::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 800)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 801)     /sys/devices/platform/nfit_test.1/ndbus0/region0/btt0/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 802)     |-- namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 803)     |-- delete
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 804)     |-- devtype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 805)     |-- modalias
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 806)     |-- numa_node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 807)     |-- sector_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 808)     |-- subsystem -> ../../../../../bus/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 809)     |-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 810)     `-- uuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 811)
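The attribute writes described above can be sketched without libndctl as
plain file writes. This is a minimal illustration, not the kernel's or
ndctl's own code: write_attr() and setup_btt_seed() are hypothetical
helper names, the value formats (for example a namespace device name such
as "namespace0.0") are assumptions, and the write order simply follows the
libndctl example in this document. Binding the device to its driver is
not shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one value to the attribute file "<dir>/<attr>".
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_attr(const char *dir, const char *attr, const char *value)
{
	char path[512];
	FILE *f;
	int n, rc = 0;

	assert(dir && attr && value);	/* caller bugs, not runtime errors */

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	f = fopen(path, "w");
	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		rc = -EIO;
	/* buffered data is flushed here, so a rejected write can show up late */
	if (fclose(f) == EOF && rc == 0)
		rc = -errno;
	return rc;
}

/* Set the three attributes named in the text on a seed btt directory. */
static int setup_btt_seed(const char *btt_dir, const char *ndns_name,
		const char *uuid, const char *sector_size)
{
	int rc;

	rc = write_attr(btt_dir, "uuid", uuid);
	if (rc)
		return rc;
	rc = write_attr(btt_dir, "sector_size", sector_size);
	if (rc)
		return rc;
	return write_attr(btt_dir, "namespace", ndns_name);
}
```

A caller would pass the seed directory shown in the listing above as
btt_dir; stopping at the first failed write leaves the seed only
partially configured, so it should not be bound to a driver in that case.
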
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 812) LIBNDCTL: btt creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 813) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 814)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 815) Similar to namespaces, an idle BTT device is automatically created per
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 816) region. Each time this "seed" btt device is configured and enabled, a new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 817) seed is created. Creating a BTT configuration involves two steps:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 818) finding an idle BTT and assigning it to consume a PMEM or BLK namespace::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 819)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 820)     static struct ndctl_btt *get_idle_btt(struct ndctl_region *region)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 821)     {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 822)             struct ndctl_btt *btt;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 823)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 824)             ndctl_btt_foreach(region, btt)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 825)                     if (!ndctl_btt_is_enabled(btt)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 826)                                     && !ndctl_btt_is_configured(btt))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 827)                             return btt;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 828)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 829)             return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 830)     }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 831)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 832)     static int configure_btt(struct ndctl_region *region,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 833)                     struct btt_parameters *parameters)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 834)     {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 835)             struct ndctl_btt *btt = get_idle_btt(region);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 836)             /* error handling elided: get_idle_btt() may return NULL */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 837)             ndctl_btt_set_uuid(btt, parameters->uuid);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 838)             ndctl_btt_set_sector_size(btt, parameters->sector_size);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 839)             ndctl_btt_set_namespace(btt, parameters->ndns);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 840)             /* turn off raw mode device */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 841)             ndctl_namespace_disable(parameters->ndns);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 842)             /* turn on btt access */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 843)             return ndctl_btt_enable(btt);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 844)     }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 845)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 846) Once instantiated, a new inactive btt seed device will appear underneath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 847) the region.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 848)
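The creation example passes the requested sector size through unchecked.
A caller may want to confirm that the size is offered before setting it.
The sketch below assumes that the "sector_size" attribute reads back as a
space-separated list with the active entry in square brackets (for
example "512 520 528 [4096] 4104 4160 4224"); that format and the helper
name sector_size_listed() are assumptions made for illustration, not
something this document specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Return true if @want appears in a space-separated size list such as
 * "512 520 528 [4096] 4104". Brackets around an entry are ignored, so
 * the currently selected size counts as listed too.
 */
static bool sector_size_listed(const char *list, unsigned long want)
{
	const char *p = list;

	assert(list);
	while (*p) {
		char *end;
		unsigned long v;

		/* skip separators and the brackets marking the active entry */
		while (*p == ' ' || *p == '\n' || *p == '[' || *p == ']')
			p++;
		if (!*p)
			break;
		v = strtoul(p, &end, 10);
		if (end == p)
			return false;	/* unexpected token: give up */
		if (v == want)
			return true;
		p = end;
	}
	return false;
}
```

The string passed as list would be the contents read from the seed
device's "sector_size" attribute; a false result suggests rejecting the
request before any attribute is written.
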
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 849) Once a "namespace" is removed from a BTT, that instance of the BTT device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 850) will be deleted or otherwise reset to default values. This deletion is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 851) only at the device model level. In order to destroy a BTT, the "info
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 852) block" needs to be destroyed. Note that to destroy a BTT, the media
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 853) needs to be written in raw mode. By default, the kernel will autodetect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 854) the presence of a BTT and disable raw mode. This autodetect behavior
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 855) can be suppressed by enabling raw mode for the namespace via the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 856) ndctl_namespace_set_raw_mode() API.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 857)
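Once raw mode is enabled and the namespace's block device is available,
destroying the info block comes down to overwriting the start of that
device. The following is a hedged sketch of that last step only: the
helper name wipe_info_block() is hypothetical, and the 8 KiB wipe length
is an assumption chosen to cover an info block at the start of the media;
a BTT may keep further copies elsewhere that this does not touch. The
path argument is expected to name the raw namespace block device.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BTT_INFO_WIPE_SIZE 8192	/* assumed length, see the text above */

/*
 * Overwrite the first BTT_INFO_WIPE_SIZE bytes of @path with zeroes.
 * Returns 0 on success or a negative errno value on failure.
 */
static int wipe_info_block(const char *path)
{
	static const char zeroes[BTT_INFO_WIPE_SIZE];
	ssize_t n;
	int fd, rc = 0;

	assert(path);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = pwrite(fd, zeroes, sizeof(zeroes), 0);
	if (n < 0)
		rc = -errno;
	else if ((size_t)n != sizeof(zeroes))
		rc = -EIO;	/* short write: treat as failure */
	else if (fsync(fd) < 0)
		rc = -errno;

	close(fd);
	return rc;
}
```

This is destructive: any data reachable through the BTT is lost once the
info block is gone, so the helper should only be pointed at a namespace
that is meant to be reclaimed.
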
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 858)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 859) Summary LIBNDCTL Diagram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 860) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 861)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 862) For the given example above, here is the view of the objects as seen by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 863) LIBNDCTL API::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 864)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 865)               +---+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 866)               |CTX|    +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 867)               +-+-+  +-> REGION0 +---> NAMESPACE0.0 +--> PMEM8 "pm0.0" |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 868)                 |    | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 869)   +-------+     |    | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 870)   | DIMM0 <-+   |    +-> REGION1 +---> NAMESPACE1.0 +--> PMEM6 "pm1.0" |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 871)   +-------+ |   |    | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 872)   | DIMM1 <-+ +-v--+ | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 873)   +-------+ +-+BUS0+---> REGION2 +-+-> NAMESPACE2.0 +--> ND6  "blk2.0" |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 874)   | DIMM2 <-+ +----+ | +---------+ | +--------------+  +----------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 875)   +-------+ |        |             +-> NAMESPACE2.1 +--> ND5  "blk2.1" | BTT2 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 876)   | DIMM3 <-+        |               +--------------+  +----------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 877)   +-------+          | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 878)                      +-> REGION3 +-+-> NAMESPACE3.0 +--> ND4  "blk3.0" |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 879)                      | +---------+ | +--------------+  +----------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 880)                      |             +-> NAMESPACE3.1 +--> ND3  "blk3.1" | BTT1 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 881)                      |               +--------------+  +----------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 882)                      | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 883)                      +-> REGION4 +---> NAMESPACE4.0 +--> ND2  "blk4.0" |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 884)                      | +---------+   +--------------+  +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 885)                      | +---------+   +--------------+  +----------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 886)                      +-> REGION5 +---> NAMESPACE5.0 +--> ND1  "blk5.0" | BTT0 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 887)                        +---------+   +--------------+  +---------------+------+