Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) LIBNVDIMM: Non-Volatile Devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) libnvdimm - kernel / libndctl - userspace helper library
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) linux-nvdimm@lists.01.org
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) Version 13
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) .. contents:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 	Glossary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 	Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) 	    Supporting Documents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) 	    Git Trees
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) 	LIBNVDIMM PMEM and BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) 	Why BLK?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 	    PMEM vs BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) 	        BLK-REGIONs, PMEM-REGIONs, Atomic Sectors, and DAX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 	Example NVDIMM Platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 	LIBNVDIMM Kernel Device Model and LIBNDCTL Userspace API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) 	    LIBNDCTL: Context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 	        libndctl: instantiate a new library context example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 	    LIBNVDIMM/LIBNDCTL: Bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 	        libnvdimm: control class device in /sys/class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) 	        libnvdimm: bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 	        libndctl: bus enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 	    LIBNVDIMM/LIBNDCTL: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 	        libnvdimm: DIMM (NMEM)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 	        libndctl: DIMM enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 	    LIBNVDIMM/LIBNDCTL: Region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 	        libnvdimm: region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) 	        libndctl: region enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 	        Why Not Encode the Region Type into the Region Name?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) 	        How Do I Determine the Major Type of a Region?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) 	    LIBNVDIMM/LIBNDCTL: Namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 	        libnvdimm: namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 	        libndctl: namespace enumeration example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 	        libndctl: namespace creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) 	        Why the Term "namespace"?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 	    LIBNVDIMM/LIBNDCTL: Block Translation Table "btt"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 	        libnvdimm: btt layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 	        libndctl: btt creation example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) 	Summary LIBNDCTL Diagram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) Glossary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) PMEM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52)   A system-physical-address range where writes are persistent.  A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53)   block device composed of PMEM is capable of DAX.  A PMEM address range
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54)   may span an interleave of several DIMMs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) BLK:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57)   A set of one or more programmable memory mapped apertures provided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58)   by a DIMM to access its media.  This indirection precludes the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59)   performance benefit of interleaving, but enables DIMM-bounded failure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60)   modes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) DPA:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63)   DIMM Physical Address: a DIMM-relative offset.  With one DIMM in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64)   the system there would be a 1:1 system-physical-address:DPA association.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65)   Once more DIMMs are added a memory controller interleave must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66)   decoded to determine the DPA associated with a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67)   system-physical-address.  BLK capacity always has a 1:1 relationship
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68)   with a single-DIMM's DPA range.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) DAX:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71)   File system extensions to bypass the page cache and block layer to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72)   mmap persistent memory, from a PMEM block device, directly into a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73)   process address space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) DSM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76)   Device Specific Method: an ACPI method to control a specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77)   device, in this case the firmware.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) DCR:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80)   NVDIMM Control Region Structure defined in ACPI 6 Section 5.2.25.5.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81)   It defines a vendor-id, device-id, and interface format for a given DIMM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) BTT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84)   Block Translation Table: Persistent memory is byte addressable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85)   Existing software may have an expectation that the power-fail-atomicity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86)   of writes is at least one sector, 512 bytes.  The BTT is an indirection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87)   table with atomic update semantics to front a PMEM/BLK block device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88)   driver and present arbitrary atomic sector sizes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) LABEL:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91)   Metadata stored on a DIMM device that partitions and identifies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92)   (persistently names) storage between PMEM and BLK.  It also partitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93)   BLK storage to host BTTs with different parameters per BLK-partition.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94)   Note that traditional partition tables, GPT/MBR, are layered on top of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95)   BLK or PMEM device.
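
The DPA entry above says that a memory controller interleave must be
decoded to find the DPA behind a system-physical-address.  The sketch
below shows one way such a decode could look, assuming a plain
round-robin stripe of fixed size across the DIMMs.  Real interleaves are
platform specific, and the names ``dpa_ref`` and ``spa_offset_to_dpa``
are hypothetical, not part of LIBNVDIMM or LIBNDCTL.

```c
#include <stdint.h>

/* Hypothetical result type: which DIMM backs an address, and where. */
struct dpa_ref {
	unsigned int dimm;	/* index of the DIMM within the interleave set */
	uint64_t dpa;		/* DIMM-relative offset */
};

/*
 * Decode an offset into an interleaved system-physical-address range,
 * assuming a round-robin stripe of 'line' bytes across 'ways' DIMMs.
 * Both 'ways' and 'line' must be non-zero.
 */
static struct dpa_ref spa_offset_to_dpa(uint64_t spa_off, unsigned int ways,
					uint64_t line)
{
	uint64_t stripe = spa_off / line;	/* stripe unit index overall */
	struct dpa_ref ref;

	ref.dimm = (unsigned int)(stripe % ways);
	ref.dpa = (stripe / ways) * line + spa_off % line;
	return ref;
}
```

With ``ways`` set to 1 the function returns the offset unchanged, which
matches the 1:1 association described for a single-DIMM system.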
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) The LIBNVDIMM subsystem provides support for three types of NVDIMMs, namely,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) PMEM, BLK, and NVDIMM devices that can simultaneously support both PMEM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) and BLK mode access.  These three modes of operation are described by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) the "NVDIMM Firmware Interface Table" (NFIT) in ACPI 6.  While the LIBNVDIMM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) implementation is generic and supports pre-NFIT platforms, it was guided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) by the superset of capabilities needed to support this ACPI 6 definition
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) for NVDIMM resources.  The bulk of the kernel implementation is in place
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) to handle the case where DPA accessible via PMEM is aliased with DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) accessible via BLK.  When that occurs a LABEL is needed to reserve DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) for exclusive access via one mode at a time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) Supporting Documents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) ACPI 6:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 	https://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) NVDIMM Namespace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) 	https://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) DSM Interface Example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 	https://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) Driver Writer's Guide:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) 	https://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) Git Trees
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) LIBNVDIMM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 	https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) LIBNDCTL:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 	https://github.com/pmem/ndctl.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) PMEM:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 	https://github.com/01org/prd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) LIBNVDIMM PMEM and BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) ======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) Prior to the arrival of the NFIT, non-volatile memory was described to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) system in various ad-hoc ways.  Usually only the bare minimum was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) provided, namely, a single system-physical-address range where writes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) are expected to be durable after a system power loss.  Now, the NFIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) specification standardizes not only the description of PMEM, but also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) BLK and platform message-passing entry points for control and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) configuration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) For each NVDIMM access method (PMEM, BLK), LIBNVDIMM provides a block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) device driver:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)     1. PMEM (nd_pmem.ko): Drives a system-physical-address range.  This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)        range is contiguous in system memory and may be interleaved (hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)        memory controller striped) across multiple DIMMs.  When interleaved the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)        platform may optionally provide details of which DIMMs are participating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)        in the interleave.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)        Note that while LIBNVDIMM describes system-physical-address ranges that may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)        alias with BLK access as ND_NAMESPACE_PMEM ranges and those without
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)        alias as ND_NAMESPACE_IO ranges, to the nd_pmem driver there is no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)        distinction.  The different device-types are an implementation detail
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)        that userspace can exploit to implement policies like "only interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)        with address ranges from certain DIMMs".  It is worth noting that when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)        aliasing is present and a DIMM lacks a label, then no block device can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)        be created by default as userspace needs to do at least one allocation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)        of DPA to the PMEM range.  In contrast ND_NAMESPACE_IO ranges, once
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)        registered, can be immediately attached to nd_pmem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)     2. BLK (nd_blk.ko): This driver performs I/O using a set of platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)        defined apertures.  A set of apertures will access just one DIMM.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)        Multiple windows (apertures) allow multiple concurrent accesses, much like
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)        tagged-command-queuing, and would likely be used by different threads or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)        different CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)        The NFIT specification defines a standard format for a BLK-aperture, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173)        the spec also allows for vendor specific layouts, and non-NFIT BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)        implementations may have other designs for BLK I/O.  For this reason
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)        "nd_blk" calls back into platform-specific code to perform the I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177)        One such implementation is defined in the "Driver Writer's Guide" and "DSM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178)        Interface Example".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) Why BLK?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) While PMEM provides direct byte-addressable CPU-load/store access to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) NVDIMM storage, it does not provide the best system RAS (reliability,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) availability, and serviceability) model.  An access to a corrupted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) system-physical-address causes a CPU exception, while an access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) to a corrupted address through a BLK-aperture causes that block window
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) to raise an error status in a register.  The latter is more aligned with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) the standard error model that host-bus-adapter attached disks present.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) Also, if an administrator ever wants to replace memory, it is easier to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) service a system at DIMM module boundaries.  Compare this to PMEM where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) data could be interleaved in an opaque hardware specific manner across
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) several DIMMs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) PMEM vs BLK
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) BLK-apertures solve these RAS problems, but their presence is also the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) major contributing factor to the complexity of the ND subsystem.  They
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) complicate the implementation because PMEM and BLK alias in DPA space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) Any given DIMM's DPA-range may contribute to one or more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) system-physical-address sets of interleaved DIMMs, *and* may also be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) accessed in its entirety through its BLK-aperture.  Accessing a DPA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) through a system-physical-address while simultaneously accessing the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) same DPA through a BLK-aperture has undefined results.  For this reason,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) DIMMs with this dual interface configuration include a DSM function to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) store/retrieve a LABEL.  The LABEL effectively partitions the DPA-space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) into exclusive system-physical-address and BLK-aperture accessible
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) regions.  For simplicity a DIMM is allowed a PMEM "region" per each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) interleave set in which it is a member.  The remaining DPA space can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) carved into an arbitrary number of BLK devices with discontiguous
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) extents.
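
The exclusivity rule above can be stated as a simple check: no DPA on a
DIMM may be claimed for both PMEM and BLK access at once.  The following
is a minimal sketch of that check.  The ``dpa_extent`` record is a
hypothetical stand-in for illustration and does not reflect the actual
LABEL on-media format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum access_mode { MODE_PMEM, MODE_BLK };

/* Hypothetical record: a DPA range on one DIMM claimed for one mode. */
struct dpa_extent {
	uint64_t start;		/* DIMM-relative offset */
	uint64_t len;		/* length in bytes, non-zero */
	enum access_mode mode;
};

static bool extents_overlap(const struct dpa_extent *a,
			    const struct dpa_extent *b)
{
	return a->start < b->start + b->len && b->start < a->start + a->len;
}

/*
 * Return true if no DPA is claimed by both a PMEM and a BLK extent,
 * i.e. the partitioning keeps the two access modes exclusive.
 */
static bool dpa_partition_is_exclusive(const struct dpa_extent *ext, size_t n)
{
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (ext[i].mode != ext[j].mode &&
			    extents_overlap(&ext[i], &ext[j]))
				return false;
	return true;
}
```

The check only compares extents of different modes, since that is the
aliasing case this section is concerned with.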
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) BLK-REGIONs, PMEM-REGIONs, Atomic Sectors, and DAX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) One of the few
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) reasons to allow multiple BLK namespaces per REGION is so that each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) BLK-namespace can be configured with a BTT with unique atomic sector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) sizes.  While a PMEM device can host a BTT the LABEL specification does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) not provide for a sector size to be specified for a PMEM namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) This is due to the expectation that the primary usage model for PMEM is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) via DAX, and the BTT is incompatible with DAX.  However, for the cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) where an application or filesystem still needs atomic sector update
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) guarantees it can register a BTT on a PMEM device or partition.  See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) LIBNVDIMM/LIBNDCTL: Block Translation Table "btt".
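
To make the idea of atomic sector updates through an indirection table
concrete, here is a toy in-memory model.  It writes each sector out of
place and then publishes it with a single map update, so a reader sees
either the old or the new sector in full.  This is only a sketch of the
principle: the type ``toy_btt`` and its helpers are hypothetical, and the
real BTT on-media layout has to do more work to survive power loss.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NR_LBAS     4			/* externally visible sectors */
#define NR_BLOCKS   (NR_LBAS + 1)	/* one spare block for in-flight writes */

/* Hypothetical in-memory stand-in for a BTT-fronted device. */
struct toy_btt {
	uint32_t map[NR_LBAS];		/* LBA -> physical block */
	uint32_t free_block;		/* the block no LBA points at */
	uint8_t data[NR_BLOCKS][SECTOR_SIZE];
};

static void toy_btt_init(struct toy_btt *btt)
{
	memset(btt, 0, sizeof(*btt));
	for (uint32_t i = 0; i < NR_LBAS; i++)
		btt->map[i] = i;
	btt->free_block = NR_LBAS;
}

/* Write a full sector out of place, then publish it with one map update. */
static void toy_btt_write(struct toy_btt *btt, uint32_t lba, const void *buf)
{
	uint32_t old = btt->map[lba];
	uint32_t fresh = btt->free_block;

	memcpy(btt->data[fresh], buf, SECTOR_SIZE);	/* old data still intact */
	btt->map[lba] = fresh;				/* single-word switch */
	btt->free_block = old;				/* recycle the old block */
}

static const uint8_t *toy_btt_read(const struct toy_btt *btt, uint32_t lba)
{
	return btt->data[btt->map[lba]];
}
```

The sector size is a compile-time constant here.  A per-namespace sector
size, as described above for BLK namespaces, would make it a field of the
structure instead.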
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) Example NVDIMM Platform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) For the remainder of this document the following diagram will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) referenced for any example sysfs layouts::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239)                                (a)               (b)           DIMM   BLK-REGION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)             +-------------------+--------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)   +------+  |       pm0.0       | blk2.0 | pm1.0  | blk2.1 |    0      region2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)   | imc0 +--+- - - region0- - - +--------+        +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)   +--+---+  |       pm0.0       | blk3.0 | pm1.0  | blk3.1 |    1      region3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)      |      +-------------------+--------v        v--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)   +--+---+                               |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246)   | cpu0 |                                     region1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247)   +--+---+                               |                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248)      |      +----------------------------^        ^--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)   +--+---+  |           blk4.0           | pm1.0  | blk4.0 |    2      region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)   | imc1 +--+----------------------------|        +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251)   +------+  |           blk5.0           | pm1.0  | blk5.0 |    3      region5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252)             +----------------------------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) In this platform we have four DIMMs and two memory controllers in one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) socket.  Each unique interface (BLK or PMEM) to DPA space is identified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) by a region device with a dynamically assigned id (REGION0 - REGION5).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258)     1. The first portion of DIMM0 and DIMM1 are interleaved as REGION0. A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259)        single PMEM namespace is created in the REGION0-SPA-range that spans most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260)        of DIMM0 and DIMM1 with a user-specified name of "pm0.0". Some of that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261)        interleaved system-physical-address range is reclaimed as BLK-aperture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)        accessed space starting at DPA-offset (a) into each DIMM.  In that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263)        reclaimed space we create two BLK-aperture "namespaces" from REGION2 and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)        REGION3 where "blk2.0" and "blk3.0" are just human readable names that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)        could be set to any user-desired name in the LABEL.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)     2. In the last portion of DIMM0 and DIMM1 we have an interleaved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)        system-physical-address range, REGION1, that spans those two DIMMs as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269)        well as DIMM2 and DIMM3.  Some of REGION1 is allocated to a PMEM namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)        named "pm1.0", the rest is reclaimed in 4 BLK-aperture namespaces (for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271)        each DIMM in the interleave set), "blk2.1", "blk3.1", "blk4.0", and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)        "blk5.0".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274)     3. The portions of DIMM2 and DIMM3 that do not participate in the REGION1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275)        interleaved system-physical-address range (i.e. the DPA addresses past
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276)        offset (b)) are also included in the "blk4.0" and "blk5.0" namespaces.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277)        Note, that this example shows that BLK-aperture namespaces don't need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)        be contiguous in DPA-space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280)     This bus is provided by the kernel under the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)     /sys/devices/platform/nfit_test.0 when the nfit_test.ko module from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282)     tools/testing/nvdimm is loaded.  This tests not only LIBNVDIMM but the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283)     acpi_nfit.ko driver as well.
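
A namespace such as "blk4.0" in the diagram covers DPA on both sides of
"pm1.0", so an offset into the namespace has to be translated through a
list of extents.  The sketch below shows that translation under the
assumption that the extents are kept in namespace order.  The
``blk_extent`` type and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous DPA run of a BLK namespace. */
struct blk_extent {
	uint64_t dpa_start;
	uint64_t len;
};

/*
 * Translate a byte offset within a namespace into a DPA by walking the
 * extents in order.  Returns 0 on success, -1 if the offset is past the
 * end of the namespace.
 */
static int blk_ns_offset_to_dpa(const struct blk_extent *ext, size_t n,
				uint64_t off, uint64_t *dpa)
{
	for (size_t i = 0; i < n; i++) {
		if (off < ext[i].len) {
			*dpa = ext[i].dpa_start + off;
			return 0;
		}
		off -= ext[i].len;	/* skip this extent, try the next */
	}
	return -1;
}
```

Any extent sizes used with this function are up to the caller.  The
diagram itself does not give sizes for the regions it shows.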
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) LIBNVDIMM Kernel Device Model and LIBNDCTL Userspace API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) What follows is a description of the LIBNVDIMM sysfs layout and a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) corresponding object hierarchy diagram as viewed through the LIBNDCTL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) API.  The example sysfs paths and diagrams are relative to the Example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) NVDIMM Platform which is also the LIBNVDIMM bus used in the LIBNDCTL unit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) test.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) LIBNDCTL: Context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) Every API call in the LIBNDCTL library requires a context that holds the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) logging parameters and other library instance state.  The library is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) based on the libabc template:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) 	https://git.kernel.org/cgit/linux/kernel/git/kay/libabc.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) LIBNDCTL: instantiate a new library context example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) 	struct ndctl_ctx *ctx;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) 	if (ndctl_new(&ctx) == 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) 		return ctx;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) 	else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) 		return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) LIBNVDIMM/LIBNDCTL: Bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) A bus has a 1:1 relationship with an NFIT.  The current expectation for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) ACPI based systems is that there is only ever one platform-global NFIT.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) That said, it is trivial to register multiple NFITs; the specification
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) does not preclude it.  The infrastructure supports multiple busses and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) we use this capability to test multiple NFIT configurations in the unit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) test.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) LIBNVDIMM: control class device in /sys/class
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) This character device accepts DSM messages to be passed to a DIMM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) identified by its NFIT handle::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) 	/sys/class/nd/ndctl0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) 	|-- dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) 	|-- device -> ../../../ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) 	|-- subsystem -> ../../../../../../../class/nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) LIBNVDIMM: bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) ^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) 	struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) 	       struct nvdimm_bus_descriptor *nfit_desc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 	/sys/devices/platform/nfit_test.0/ndbus0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) 	|-- commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) 	|-- nd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) 	|-- nfit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) 	|-- nmem0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) 	|-- nmem1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) 	|-- nmem2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) 	|-- nmem3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) 	|-- power
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) 	|-- provider
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) 	|-- region0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 	|-- region1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) 	|-- region2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) 	|-- region3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) 	|-- region4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) 	|-- region5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) 	|-- uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) 	`-- wait_probe
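
The entries in the listing above that are plain attributes can be read
like any other file.  As a minimal sketch that uses only the C standard
library and not LIBNDCTL, the hypothetical helper below reads the first
line of one attribute.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs attribute into buf, without the
 * trailing newline.  Returns 0 on success, -1 on failure.
 */
static int read_sysfs_attr(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* strip the newline, if any */
	return 0;
}
```

On a system with nfit_test.ko loaded, calling it with the path
``/sys/devices/platform/nfit_test.0/ndbus0/provider`` would presumably
return the provider name of this bus.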
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) 
LIBNDCTL: bus enumeration example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Find the bus handle that describes the bus from Example NVDIMM Platform::

	static struct ndctl_bus *get_bus_by_provider(struct ndctl_ctx *ctx,
			const char *provider)
	{
		struct ndctl_bus *bus;

		ndctl_bus_foreach(ctx, bus)
			if (strcmp(provider, ndctl_bus_get_provider(bus)) == 0)
				return bus;

		return NULL;
	}

	bus = get_bus_by_provider(ctx, "nfit_test.0");


LIBNVDIMM/LIBNDCTL: DIMM (NMEM)
-------------------------------

The DIMM device provides a character device for sending commands to
hardware, and it is a container for LABELs.  If the DIMM is defined by
NFIT then an optional 'nfit' attribute sub-directory is available to add
NFIT-specifics.

Note that the kernel device name for "DIMMs" is "nmemX".  The NFIT
describes these devices via "Memory Device to System Physical Address
Range Mapping Structure", and there is no requirement that they actually
be physical DIMMs, so we use a more generic name.

LIBNVDIMM: DIMM (NMEM)
^^^^^^^^^^^^^^^^^^^^^^

::

	struct nvdimm *nvdimm_create(struct nvdimm_bus *nvdimm_bus, void *provider_data,
			const struct attribute_group **groups, unsigned long flags,
			unsigned long *dsm_mask);

::

	/sys/devices/platform/nfit_test.0/ndbus0
	|-- nmem0
	|   |-- available_slots
	|   |-- commands
	|   |-- dev
	|   |-- devtype
	|   |-- driver -> ../../../../../bus/nd/drivers/nvdimm
	|   |-- modalias
	|   |-- nfit
	|   |   |-- device
	|   |   |-- format
	|   |   |-- handle
	|   |   |-- phys_id
	|   |   |-- rev_id
	|   |   |-- serial
	|   |   `-- vendor
	|   |-- state
	|   |-- subsystem -> ../../../../../bus/nd
	|   `-- uevent
	|-- nmem1
	[..]


LIBNDCTL: DIMM enumeration example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Note that this example assumes NFIT-defined DIMMs, which are identified
by an "nfit_handle", a 32-bit value where:

   - Bit 3:0 DIMM number within the memory channel
   - Bit 7:4 memory channel number
   - Bit 11:8 memory controller ID
   - Bit 15:12 socket ID (within scope of a Node controller if node
     controller is present)
   - Bit 27:16 Node Controller ID
   - Bit 31:28 Reserved

::

	static struct ndctl_dimm *get_dimm_by_handle(struct ndctl_bus *bus,
			unsigned int handle)
	{
		struct ndctl_dimm *dimm;

		ndctl_dimm_foreach(bus, dimm)
			if (ndctl_dimm_get_handle(dimm) == handle)
				return dimm;

		return NULL;
	}

	/* arguments are parenthesized so expressions can be passed safely */
	#define DIMM_HANDLE(n, s, i, c, d) \
		((((n) & 0xfff) << 16) | (((s) & 0xf) << 12) | (((i) & 0xf) << 8) \
		 | (((c) & 0xf) << 4) | ((d) & 0xf))

	dimm = get_dimm_by_handle(bus, DIMM_HANDLE(0, 0, 0, 0, 0));

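The bit layout listed above can also be unpacked mechanically.  The
following is a minimal, self-contained sketch that does not use libndctl;
the struct nfit_handle_fields type and both helper names are hypothetical
and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container: one member per documented bit range. */
struct nfit_handle_fields {
	unsigned int dimm;       /* bits  3:0  DIMM number within the channel */
	unsigned int channel;    /* bits  7:4  memory channel number */
	unsigned int controller; /* bits 11:8  memory controller ID */
	unsigned int socket;     /* bits 15:12 socket ID */
	unsigned int node;       /* bits 27:16 node controller ID */
};

/* Split a 32-bit nfit_handle into its fields; bits 31:28 are reserved. */
static struct nfit_handle_fields nfit_handle_decode(uint32_t handle)
{
	struct nfit_handle_fields f = {
		.dimm       = handle & 0xf,
		.channel    = (handle >> 4) & 0xf,
		.controller = (handle >> 8) & 0xf,
		.socket     = (handle >> 12) & 0xf,
		.node       = (handle >> 16) & 0xfff,
	};

	return f;
}

/* Inverse of the decode step, using the same layout as DIMM_HANDLE(). */
static uint32_t nfit_handle_encode(const struct nfit_handle_fields *f)
{
	assert(f != NULL);
	return ((uint32_t)(f->node & 0xfff) << 16) |
	       ((uint32_t)(f->socket & 0xf) << 12) |
	       ((uint32_t)(f->controller & 0xf) << 8) |
	       ((uint32_t)(f->channel & 0xf) << 4) |
	       (uint32_t)(f->dimm & 0xf);
}
```

For instance, decoding 0x01231234 with this sketch yields node controller
0x123, socket 1, memory controller 2, channel 3, and DIMM 4, and encoding
those fields again returns the same handle.
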
LIBNVDIMM/LIBNDCTL: Region
--------------------------

A generic REGION device is registered for each PMEM range or BLK-aperture
set.  Per the example there are 6 regions: 2 PMEM and 4 BLK-aperture
sets on the "nfit_test.0" bus.  The primary role of a region is to be a
container of "mappings".  A mapping is a tuple of <DIMM,
DPA-start-offset, length>.

LIBNVDIMM provides a built-in driver for these REGION devices.  This driver
is responsible for reconciling the aliased DPA mappings across all
regions, parsing the LABEL, if present, and then emitting NAMESPACE
devices with the resolved/exclusive DPA-boundaries for the nd_pmem or
nd_blk device driver to consume.

In addition to the generic attributes of "mapping"s, "interleave_ways",
and "size", the REGION device also exports some convenience attributes.
"nstype" indicates the integer type of namespace-device this region
emits, "devtype" duplicates the DEVTYPE variable stored by udev at the
'add' event, "modalias" duplicates the MODALIAS variable stored by udev
at the 'add' event, and finally, the optional "spa_index" is provided in
the case where the region is defined by a SPA.

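To make the mapping tuple concrete, here is a small sketch of how two
mappings could be tested for DPA aliasing.  It is not the nd_region
driver's implementation; struct mapping_sketch and mappings_alias() are
hypothetical names, and "aliasing" is assumed here to mean two ranges on
the same DIMM that overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the <DIMM, DPA-start-offset, length> tuple. */
struct mapping_sketch {
	const char *dimm;    /* DIMM device name, e.g. "nmem0" */
	uint64_t dpa_start;  /* DPA start offset within that DIMM */
	uint64_t length;     /* length of the mapped range in bytes */
};

/*
 * Two mappings alias when they name the same DIMM and their half-open
 * ranges [dpa_start, dpa_start + length) intersect.  Arithmetic overflow
 * of dpa_start + length is ignored in this sketch.
 */
static bool mappings_alias(const struct mapping_sketch *a,
		const struct mapping_sketch *b)
{
	assert(a != NULL && b != NULL);
	if (strcmp(a->dimm, b->dimm) != 0)
		return false;
	return a->dpa_start < b->dpa_start + b->length &&
	       b->dpa_start < a->dpa_start + a->length;
}
```

Ranges that merely touch, such as [0, 0x1000) and [0x1000, 0x2000) on the
same DIMM, are not treated as aliased by this check.
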
LIBNVDIMM: region::

	struct nd_region *nvdimm_pmem_region_create(struct nvdimm_bus *nvdimm_bus,
			struct nd_region_desc *ndr_desc);
	struct nd_region *nvdimm_blk_region_create(struct nvdimm_bus *nvdimm_bus,
			struct nd_region_desc *ndr_desc);

::

	/sys/devices/platform/nfit_test.0/ndbus0
	|-- region0
	|   |-- available_size
	|   |-- btt0
	|   |-- btt_seed
	|   |-- devtype
	|   |-- driver -> ../../../../../bus/nd/drivers/nd_region
	|   |-- init_namespaces
	|   |-- mapping0
	|   |-- mapping1
	|   |-- mappings
	|   |-- modalias
	|   |-- namespace0.0
	|   |-- namespace_seed
	|   |-- numa_node
	|   |-- nfit
	|   |   `-- spa_index
	|   |-- nstype
	|   |-- set_cookie
	|   |-- size
	|   |-- subsystem -> ../../../../../bus/nd
	|   `-- uevent
	|-- region1
	[..]

LIBNDCTL: region enumeration example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sample region retrieval routines based on NFIT-unique data like
"spa_index" (interleave set id) for PMEM and "nfit_handle" (dimm id) for
BLK::

	static struct ndctl_region *get_pmem_region_by_spa_index(struct ndctl_bus *bus,
			unsigned int spa_index)
	{
		struct ndctl_region *region;

		ndctl_region_foreach(bus, region) {
			if (ndctl_region_get_type(region) != ND_DEVICE_REGION_PMEM)
				continue;
			if (ndctl_region_get_spa_index(region) == spa_index)
				return region;
		}
		return NULL;
	}

	static struct ndctl_region *get_blk_region_by_dimm_handle(struct ndctl_bus *bus,
			unsigned int handle)
	{
		struct ndctl_region *region;

		ndctl_region_foreach(bus, region) {
			struct ndctl_mapping *map;

			/* linux/ndctl.h names the BLK region type ND_DEVICE_REGION_BLK */
			if (ndctl_region_get_type(region) != ND_DEVICE_REGION_BLK)
				continue;
			ndctl_mapping_foreach(region, map) {
				struct ndctl_dimm *dimm = ndctl_mapping_get_dimm(map);

				if (ndctl_dimm_get_handle(dimm) == handle)
					return region;
			}
		}
		return NULL;
	}


Why Not Encode the Region Type into the Region Name?
----------------------------------------------------

At first glance it seems that, since NFIT defines just the PMEM and BLK
interface types, we should simply name REGION devices with something
derived from those type names.  However, the ND subsystem explicitly keeps
the REGION name generic and expects userspace to always consider the
region-attributes for four reasons:

    1. There are already more than two REGION and "namespace" types.  For
       PMEM there are two subtypes.  As mentioned previously, there is PMEM
       where the constituent DIMM devices are known, and anonymous PMEM.  For
       BLK regions the NFIT specification already anticipates vendor specific
       implementations.  The exact distinction of what a region contains is in
       the region-attributes, not the region-name or the region-devtype.

    2. A region with zero child-namespaces is a possible configuration.  For
       example, the NFIT allows for a DCR to be published without a
       corresponding BLK-aperture.  This equates to a DIMM that can only accept
       control/configuration messages, but no i/o through a descendant block
       device.  Again, this "type" is advertised in the attributes ('mappings'
       == 0) and the name does not tell you much.

    3. What if a third major interface type arises in the future?  Outside
       of vendor specific implementations, it's not difficult to envision a
       third class of interface type beyond BLK and PMEM.  With a generic name
       for the REGION level of the device-hierarchy, old userspace
       implementations can still make sense of new kernel advertised
       region-types.  Userspace can always rely on the generic region
       attributes like "mappings", "size", etc., and the expected child devices
       named "namespace".  This generic format of the device-model hierarchy
       allows the LIBNVDIMM and LIBNDCTL implementations to be more uniform and
       future-proof.

    4. There are more robust mechanisms for determining the major type of a
       region than a device name.  See the next section, How Do I Determine the
       Major Type of a Region?

How Do I Determine the Major Type of a Region?
----------------------------------------------

Outside of the blanket recommendation of "use libndctl", or simply
looking at the kernel header (/usr/include/linux/ndctl.h) to decode the
"nstype" integer attribute, here are some other options.

1. module alias lookup
^^^^^^^^^^^^^^^^^^^^^^

    The whole point of region/namespace device type differentiation is to
    decide which block-device driver will attach to a given LIBNVDIMM namespace.
    One can simply use the modalias to look up the resulting module.  It's
    important to note that this method is robust in the presence of a
    vendor-specific driver down the road.  If a vendor-specific
    implementation wants to supplant the standard nd_blk driver, it can do
    so with minimal impact to the rest of LIBNVDIMM.

    In fact, a vendor may also want to have a vendor-specific region-driver
    (outside of nd_region).  For example, if a vendor defined its own LABEL
    format it would need its own region driver to parse that LABEL and emit
    the resulting namespaces.  The output from module resolution is more
    accurate than a region-name or region-devtype.

2. udev
^^^^^^^

    The kernel "devtype" is registered in the udev database::

	# udevadm info --path=/devices/platform/nfit_test.0/ndbus0/region0
	P: /devices/platform/nfit_test.0/ndbus0/region0
	E: DEVPATH=/devices/platform/nfit_test.0/ndbus0/region0
	E: DEVTYPE=nd_pmem
	E: MODALIAS=nd:t2
	E: SUBSYSTEM=nd

	# udevadm info --path=/devices/platform/nfit_test.0/ndbus0/region4
	P: /devices/platform/nfit_test.0/ndbus0/region4
	E: DEVPATH=/devices/platform/nfit_test.0/ndbus0/region4
	E: DEVTYPE=nd_blk
	E: MODALIAS=nd:t3
	E: SUBSYSTEM=nd

    ...and is available as a region attribute, but keep in mind that the
    "devtype" does not indicate sub-type variations, so scripts should
    also inspect the other attributes.

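The MODALIAS values in the udev output above carry the numeric device
type after the "nd:t" prefix.  The sketch below shows one way a tool might
decode such a string without libndctl.  The enum duplicates the device
type values from linux/ndctl.h under local, hypothetical names (prefixed
SK\_) so that the example builds on its own, and both function names are
likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Local copies of the ND_DEVICE_* values from linux/ndctl.h. */
enum nd_type_sketch {
	SK_ND_DEVICE_DIMM = 1,
	SK_ND_DEVICE_REGION_PMEM = 2,
	SK_ND_DEVICE_REGION_BLK = 3,
	SK_ND_DEVICE_NAMESPACE_IO = 4,
	SK_ND_DEVICE_NAMESPACE_PMEM = 5,
	SK_ND_DEVICE_NAMESPACE_BLK = 6,
};

/* Parse "nd:t<N>" and return N, or -1 if the string has another shape. */
static int nd_modalias_to_type(const char *modalias)
{
	int type;
	char trailing;

	assert(modalias != NULL);
	/* exactly one conversion succeeds when nothing follows the number */
	if (sscanf(modalias, "nd:t%d%c", &type, &trailing) != 1 || type < 0)
		return -1;
	return type;
}

/* Map a device type to a short human-readable label. */
static const char *nd_type_name(int type)
{
	switch (type) {
	case SK_ND_DEVICE_DIMM:           return "dimm";
	case SK_ND_DEVICE_REGION_PMEM:    return "pmem region";
	case SK_ND_DEVICE_REGION_BLK:     return "blk region";
	case SK_ND_DEVICE_NAMESPACE_IO:   return "io namespace";
	case SK_ND_DEVICE_NAMESPACE_PMEM: return "pmem namespace";
	case SK_ND_DEVICE_NAMESPACE_BLK:  return "blk namespace";
	default:                          return "unknown";
	}
}
```

With the two regions shown above, "nd:t2" decodes to the PMEM region type
and "nd:t3" to the BLK region type.  Strings that do not follow the
"nd:t<N>" form are rejected with -1.
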
3. type specific attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^

    As it currently stands a BLK-aperture region will never have a
    "nfit/spa_index" attribute, but neither will a non-NFIT PMEM region.  A
    BLK region with a "mappings" value of 0 is, as mentioned above, a DIMM
    that does not allow I/O.  A PMEM region with a "mappings" value of zero
    is a simple system-physical-address range.


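The attribute rules in this subsection can be summarized as a small
decision function.  This is only a sketch of the reasoning: the enum
names and classify_region() are hypothetical, and the caller is assumed
to have already read the region type, the "mappings" count, and whether
"nfit/spa_index" exists.

```c
#include <assert.h>
#include <stdbool.h>

enum region_kind_sketch { SK_REGION_PMEM, SK_REGION_BLK };

enum region_flavor_sketch {
	SK_BLK_NO_IO,        /* BLK, mappings == 0: DIMM that does not allow I/O */
	SK_BLK_APERTURE,     /* BLK with at least one mapping */
	SK_PMEM_SPA_RANGE,   /* PMEM, mappings == 0: plain system-physical range */
	SK_PMEM_NFIT,        /* PMEM with mappings and an nfit/spa_index */
	SK_PMEM_NON_NFIT,    /* PMEM with mappings but no nfit/spa_index */
	SK_INCONSISTENT,     /* combination the text says should not occur */
};

static enum region_flavor_sketch classify_region(enum region_kind_sketch kind,
		unsigned int mappings, bool has_spa_index)
{
	if (kind == SK_REGION_BLK) {
		/* a BLK-aperture region never has nfit/spa_index */
		if (has_spa_index)
			return SK_INCONSISTENT;
		return mappings == 0 ? SK_BLK_NO_IO : SK_BLK_APERTURE;
	}
	if (mappings == 0)
		return SK_PMEM_SPA_RANGE;
	return has_spa_index ? SK_PMEM_NFIT : SK_PMEM_NON_NFIT;
}
```

The point of the sketch is that no single attribute is decisive: the
absence of "nfit/spa_index" fits both a BLK region and a non-NFIT PMEM
region, so the region type and "mappings" have to be read together.
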
LIBNVDIMM/LIBNDCTL: Namespace
-----------------------------

A REGION, after resolving DPA aliasing and LABEL specified boundaries,
surfaces one or more "namespace" devices.  The arrival of a "namespace"
device currently triggers either the nd_blk or nd_pmem driver to load
and register a disk/block device.

LIBNVDIMM: namespace
^^^^^^^^^^^^^^^^^^^^

Here is a sample layout from the three major types of NAMESPACE where
namespace0.0 represents DIMM-info-backed PMEM (note that it has a 'uuid'
attribute), namespace2.0 represents a BLK namespace (note that it has a
'sector_size' attribute), and namespace6.0 represents an anonymous
PMEM namespace (note that it has no 'uuid' attribute because it does not
support a LABEL)::

	/sys/devices/platform/nfit_test.0/ndbus0/region0/namespace0.0
	|-- alt_name
	|-- devtype
	|-- dpa_extents
	|-- force_raw
	|-- modalias
	|-- numa_node
	|-- resource
	|-- size
	|-- subsystem -> ../../../../../../bus/nd
	|-- type
	|-- uevent
	`-- uuid
	/sys/devices/platform/nfit_test.0/ndbus0/region2/namespace2.0
	|-- alt_name
	|-- devtype
	|-- dpa_extents
	|-- force_raw
	|-- modalias
	|-- numa_node
	|-- sector_size
	|-- size
	|-- subsystem -> ../../../../../../bus/nd
	|-- type
	|-- uevent
	`-- uuid
	/sys/devices/platform/nfit_test.1/ndbus1/region6/namespace6.0
	|-- block
	|   `-- pmem0
	|-- devtype
	|-- driver -> ../../../../../../bus/nd/drivers/pmem
	|-- force_raw
	|-- modalias
	|-- numa_node
	|-- resource
	|-- size
	|-- subsystem -> ../../../../../../bus/nd
	|-- type
	`-- uevent

LIBNDCTL: namespace enumeration example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Namespaces are indexed relative to their parent region, as in the example
below.  These indexes are mostly static from boot to boot, but the
subsystem makes no guarantees in this regard.  For a static namespace
identifier use its 'uuid' attribute.

::

  static struct ndctl_namespace
  *get_namespace_by_id(struct ndctl_region *region, unsigned int id)
  {
          struct ndctl_namespace *ndns;

          ndctl_namespace_foreach(region, ndns)
                  if (ndctl_namespace_get_id(ndns) == id)
                          return ndns;

          return NULL;
  }

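The device names seen in sysfs, such as namespace0.0 or namespace6.0,
follow this region-relative indexing: the first number is the parent
region id and the second is the namespace index within that region.  A
minimal sketch of splitting such a name, assuming only that
"namespace<region>.<id>" format; parse_namespace_name() is a hypothetical
helper and not part of libndctl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Split "namespace<region>.<id>" into its two indexes.
 * Returns 0 on success and -1 if the name has any other shape.
 */
static int parse_namespace_name(const char *name, int *region_id, int *ns_id)
{
	int region, id;
	char trailing;

	assert(name != NULL && region_id != NULL && ns_id != NULL);
	/* exactly two conversions succeed when nothing follows the second number */
	if (sscanf(name, "namespace%d.%d%c", &region, &id, &trailing) != 2)
		return -1;
	if (region < 0 || id < 0)
		return -1;
	*region_id = region;
	*ns_id = id;
	return 0;
}
```

Because the indexes are not guaranteed to be stable across boots, a parsed
pair like this is only suitable for walking the current device tree, not
for persistent identification.
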
LIBNDCTL: namespace creation example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Idle namespaces are automatically created by the kernel if a given
region has enough available capacity to create a new namespace.
Namespace instantiation involves finding an idle namespace and
configuring it.  For the most part the setting of namespace attributes
can occur in any order; the only constraint is that 'uuid' must be set
before 'size'.  This enables the kernel to track DPA allocations
internally with a static identifier::

  static int configure_namespace(struct ndctl_region *region,
                  struct ndctl_namespace *ndns,
                  struct namespace_parameters *parameters)
  {
          char devname[50];

          snprintf(devname, sizeof(devname), "namespace%d.%d",
                          ndctl_region_get_id(region), parameters->id);

          ndctl_namespace_set_alt_name(ndns, devname);
          /* 'uuid' must be set prior to setting size! */
          ndctl_namespace_set_uuid(ndns, parameters->uuid);
          ndctl_namespace_set_size(ndns, parameters->size);
          /* unlike pmem namespaces, blk namespaces have a sector size */
          if (parameters->lbasize)
                  ndctl_namespace_set_sector_size(ndns, parameters->lbasize);
          /* report the result of enabling so the int return type is honored */
          return ndctl_namespace_enable(ndns);
  }


Why the Term "namespace"?
^^^^^^^^^^^^^^^^^^^^^^^^^

    1. Why not "volume" for instance?  "volume" ran the risk of confusing
       ND (libnvdimm subsystem) with a volume manager like device-mapper.

    2. The term originated to describe the sub-devices that can be created
       within an NVME controller (see the nvme specification:
       https://www.nvmexpress.org/specifications/), and NFIT namespaces are
       meant to parallel the capabilities and configurability of
       NVME-namespaces.


LIBNVDIMM/LIBNDCTL: Block Translation Table "btt"
-------------------------------------------------

A BTT (design document: https://pmem.io/2014/09/23/btt.html) is a stacked
block device driver that fronts either the whole block device or a
partition of a block device emitted by either a PMEM or BLK NAMESPACE.

LIBNVDIMM: btt layout
^^^^^^^^^^^^^^^^^^^^^

Every region starts out with at least one BTT device, which is the
seed device.  To activate it, set the "namespace", "uuid", and
"sector_size" attributes and then bind the device to the nd_pmem or
nd_blk driver depending on the region type::

	/sys/devices/platform/nfit_test.1/ndbus0/region0/btt0/
	|-- namespace
	|-- delete
	|-- devtype
	|-- modalias
	|-- numa_node
	|-- sector_size
	|-- subsystem -> ../../../../../bus/nd
	|-- uevent
	`-- uuid
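
As a rough illustration of the attribute-setting step at the sysfs level,
the sketch below writes the three attributes with plain POSIX file I/O.
The helper names ``write_attr()`` and ``configure_btt_sysfs()`` are
hypothetical and are not part of libnvdimm or libndctl.  The order of the
writes and the value formats (a namespace device name, a textual UUID, a
decimal sector size) are assumptions, and binding the device to its driver
is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write 'value' to <dir>/<attr>; returns 0 or -errno. */
int write_attr(const char *dir, const char *attr, const char *value)
{
	char path[512];
	size_t len;
	ssize_t written;
	int n, fd, rc = 0;

	assert(dir && attr && value);
	len = strlen(value);

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	/* sysfs attributes already exist, so the file is never created here */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, value, len);
	if (written < 0)
		rc = -errno;
	else if ((size_t)written != len)
		rc = -EIO;

	if (close(fd) < 0 && rc == 0)
		rc = -errno;
	return rc;
}

/*
 * Hypothetical helper: set "uuid", "sector_size" and "namespace" in a BTT
 * seed device directory.  The order used here is an assumption.
 */
int configure_btt_sysfs(const char *btt_dir, const char *ndns_name,
		const char *uuid, unsigned int sector_size)
{
	char buf[32];
	int rc;

	snprintf(buf, sizeof(buf), "%u", sector_size);

	rc = write_attr(btt_dir, "uuid", uuid);
	if (rc)
		return rc;
	rc = write_attr(btt_dir, "sector_size", buf);
	if (rc)
		return rc;
	return write_attr(btt_dir, "namespace", ndns_name);
}
```

For the tree shown above, a call might pass the ``btt0`` directory as
``btt_dir`` together with placeholder values such as ``"namespace0.0"`` and
a sector size of 4096.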

LIBNDCTL: btt creation example
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Similar to namespaces, an idle BTT device is automatically created per
region.  Each time this "seed" btt device is configured and enabled, a new
seed is created.  Creating a BTT configuration involves two steps: finding
an idle BTT and assigning it to consume a PMEM or BLK namespace::

	static struct ndctl_btt *get_idle_btt(struct ndctl_region *region)
	{
		struct ndctl_btt *btt;

		ndctl_btt_foreach(region, btt)
			if (!ndctl_btt_is_enabled(btt)
					&& !ndctl_btt_is_configured(btt))
				return btt;

		return NULL;
	}

	static int configure_btt(struct ndctl_region *region,
			struct btt_parameters *parameters)
	{
		struct ndctl_btt *btt = get_idle_btt(region);

		/* no idle seed device is available in this region */
		if (!btt)
			return -ENXIO;

		ndctl_btt_set_uuid(btt, parameters->uuid);
		ndctl_btt_set_sector_size(btt, parameters->sector_size);
		ndctl_btt_set_namespace(btt, parameters->ndns);
		/* turn off raw mode device */
		ndctl_namespace_disable(parameters->ndns);
		/* turn on btt access */
		return ndctl_btt_enable(btt);
	}

Once instantiated, a new inactive btt seed device will appear underneath
the region.

Once a "namespace" is removed from a BTT, that instance of the BTT device
will be deleted or otherwise reset to default values.  This deletion is
only at the device model level.  In order to destroy a BTT, the "info
block" needs to be destroyed.  Note that to destroy a BTT the media
needs to be written in raw mode.  By default, the kernel will autodetect
the presence of a BTT and disable raw mode.  This autodetect behavior
can be suppressed by enabling raw mode for the namespace via the
ndctl_namespace_set_raw_mode() API.
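
The raw-mode write itself can be sketched with plain POSIX I/O.  This is
only a sketch under stated assumptions: it presumes the namespace has
already been placed in raw mode and enabled, so that its block device
exposes the media directly, and it leaves the location and size of the
info block as parameters because this section does not specify them.  The
function name ``zero_media_range()`` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite 'len' bytes at 'offset' of the device (or
 * file) at 'path' with zeros.  Returns 0 on success or -errno on failure.
 * The caller supplies the offset and length of the info block.
 */
int zero_media_range(const char *path, off_t offset, size_t len)
{
	char *zeros;
	size_t done = 0;
	int fd, rc = 0;

	assert(path != NULL);
	if (len == 0)
		return 0;

	zeros = calloc(1, len);
	if (!zeros)
		return -ENOMEM;

	fd = open(path, O_WRONLY);
	if (fd < 0) {
		rc = -errno;
		goto out;
	}

	/* pwrite() may write less than requested, so loop until done */
	while (done < len) {
		ssize_t n = pwrite(fd, zeros + done, len - done,
				offset + (off_t)done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			rc = -errno;
			break;
		}
		if (n == 0) {
			rc = -EIO;
			break;
		}
		done += (size_t)n;
	}

	/* make sure the zeros reach the media before reporting success */
	if (rc == 0 && fsync(fd) < 0)
		rc = -errno;
	if (close(fd) < 0 && rc == 0)
		rc = -errno;
out:
	free(zeros);
	return rc;
}
```

A caller would pass the path of the raw-mode block device together with
the offset and size of the info block as given by the BTT design document.
The write is destructive, so the sketch is best tried on a scratch file
first.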


Summary LIBNDCTL Diagram
------------------------

For the given example above, here is the view of the objects as seen by the
LIBNDCTL API::

              +---+
              |CTX|    +---------+   +--------------+  +---------------+
              +-+-+  +-> REGION0 +---> NAMESPACE0.0 +--> PMEM8 "pm0.0" |
                |    | +---------+   +--------------+  +---------------+
  +-------+     |    | +---------+   +--------------+  +---------------+
  | DIMM0 <-+   |    +-> REGION1 +---> NAMESPACE1.0 +--> PMEM6 "pm1.0" |
  +-------+ |   |    | +---------+   +--------------+  +---------------+
  | DIMM1 <-+ +-v--+ | +---------+   +--------------+  +---------------+
  +-------+ +-+BUS0+---> REGION2 +-+-> NAMESPACE2.0 +--> ND6  "blk2.0" |
  | DIMM2 <-+ +----+ | +---------+ | +--------------+  +----------------------+
  +-------+ |        |             +-> NAMESPACE2.1 +--> ND5  "blk2.1" | BTT2 |
  | DIMM3 <-+        |               +--------------+  +----------------------+
  +-------+          | +---------+   +--------------+  +---------------+
                     +-> REGION3 +-+-> NAMESPACE3.0 +--> ND4  "blk3.0" |
                     | +---------+ | +--------------+  +----------------------+
                     |             +-> NAMESPACE3.1 +--> ND3  "blk3.1" | BTT1 |
                     |               +--------------+  +----------------------+
                     | +---------+   +--------------+  +---------------+
                     +-> REGION4 +---> NAMESPACE4.0 +--> ND2  "blk4.0" |
                     | +---------+   +--------------+  +---------------+
                     | +---------+   +--------------+  +----------------------+
                     +-> REGION5 +---> NAMESPACE5.0 +--> ND1  "blk5.0" | BTT0 |
                       +---------+   +--------------+  +---------------+------+