^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ==========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Reference counting in pnfs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ==========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) There are several inter-related caches. We have layouts which can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) reference multiple devices, each of which can reference multiple data servers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Each data server can be referenced by multiple devices. Each device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) can be referenced by multiple layouts. To keep all of this straight,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) we need to reference count.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) struct pnfs_layout_hdr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) ======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) The on-the-wire command LAYOUTGET corresponds to struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) pnfs_layout_segment, usually referred to by the variable name lseg.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) Each nfs_inode may hold a pointer to a cache of these layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) segments in nfsi->layout, of type struct pnfs_layout_hdr.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) The header is referenced once by the inode pointing to it, once for each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) LAYOUTCOMMIT), and once for each lseg held within.
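
The counting rule above can be sketched in user-space C. This is an illustrative model only, not the kernel code: ``struct layout_hdr``, ``hdr_alloc()``, ``hdr_get()``, ``hdr_put()`` and ``demo_hdr_refs()`` are hypothetical names, and a plain ``int`` stands in for the atomic counter a kernel structure would need.

```c
#include <stdlib.h>

/* Hypothetical user-space stand-in for struct pnfs_layout_hdr. */
struct layout_hdr {
	int refcount;	/* assumption: a real implementation needs an atomic counter */
};

/* Allocate a header that starts with the reference owned by the inode. */
static struct layout_hdr *hdr_alloc(void)
{
	struct layout_hdr *lo = calloc(1, sizeof(*lo));

	if (lo)
		lo->refcount = 1;
	return lo;
}

/* Take one more reference (an outstanding RPC or a cached lseg). */
static void hdr_get(struct layout_hdr *lo)
{
	lo->refcount++;
}

/* Drop one reference; returns 1 if this call freed the header. */
static int hdr_put(struct layout_hdr *lo)
{
	if (--lo->refcount > 0)
		return 0;
	free(lo);
	return 1;
}

/* Walk through the three kinds of holders; returns the peak count or -1. */
static int demo_hdr_refs(void)
{
	struct layout_hdr *lo = hdr_alloc();
	int peak;

	if (!lo)
		return -1;
	hdr_get(lo);		/* outstanding LAYOUTGET */
	hdr_get(lo);		/* lseg stored in the header */
	peak = lo->refcount;	/* inode + RPC + lseg */
	hdr_put(lo);		/* RPC completes */
	hdr_put(lo);		/* lseg is freed */
	if (!hdr_put(lo))	/* inode lets go: header must be freed now */
		return -1;
	return peak;
}
```

In this model the header is freed only when the inode, every RPC, and every lseg have dropped their references, whichever of them happens to go last.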
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) Each header is also (when non-empty) put on a list associated with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) struct nfs_client (cl_layouts). Being put on this list does not bump
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) the reference count, as the layout is kept around by the lseg that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) keeps it in the list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) deviceid_cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) lsegs reference device ids, which are resolved per nfs_client and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) layout driver type. The device ids are held in an RCU cache (struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) nfs4_deviceid_cache). The cache itself is referenced across each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) mount. The entries (struct nfs4_deviceid) themselves are held across
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) the lifetime of each lseg referencing them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) RCU is used because the deviceid is basically a write-once, read-many
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) data structure. The hlist size of 32 buckets needs better
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) justification, but seems reasonable given that we can have multiple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) deviceids per filesystem, and multiple filesystems per nfs_client.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) The hash code is copied from the nfsd code base. A discussion of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) hashing and variations of this algorithm can be found `here
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) <http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809>`_.
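
To make the bucket selection concrete, here is a minimal user-space sketch of hashing an opaque device id into 32 buckets. It is not a copy of the kernel or nfsd code: ``struct devid``, ``devid_hash()`` and the ``DEVID_*`` macros are hypothetical names, the multiply-by-37 byte hash is an assumed stand-in for the real hash code, and the 16-byte id size follows the deviceid4 definition in RFC 5661.

```c
#include <stddef.h>
#include <stdint.h>

#define DEVID_SIZE	16			/* deviceid4 is 16 opaque bytes */
#define DEVID_HASH_BITS	5
#define DEVID_HASH_SIZE	(1u << DEVID_HASH_BITS)	/* 32 buckets */
#define DEVID_HASH_MASK	(DEVID_HASH_SIZE - 1)

/* Hypothetical stand-in for struct nfs4_deviceid. */
struct devid {
	unsigned char data[DEVID_SIZE];
};

/*
 * Fold every byte of the id into a 32-bit value, then keep the low
 * bits to select a bucket. The multiplier 37 is an assumption made
 * for this sketch.
 */
static uint32_t devid_hash(const struct devid *id)
{
	uint32_t x = 0;
	size_t i;

	for (i = 0; i < DEVID_SIZE; i++)
		x = x * 37 + id->data[i];
	return x & DEVID_HASH_MASK;
}
```

Because the bucket count is a power of two, masking replaces a modulo operation, and the result is always a valid index into the 32-entry hlist array.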
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) data server cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) File layout driver devices refer to data servers, which are kept in a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) module-level cache. A reference to each data server is held over the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) lifetime of the deviceid pointing to it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) lseg
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) ====
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) An lseg maintains an extra reference corresponding to the NFS_LSEG_VALID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) bit, which holds it in the pnfs_layout_hdr's list. When the final lseg
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) bit is set, preventing any new lsegs from being added.
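
The pairing of a flag bit with an extra reference can be modelled as follows. This is a simplified user-space sketch under stated assumptions: ``struct hdr``, ``struct lseg``, ``lseg_insert()`` and ``lseg_invalidate()`` are hypothetical names, the list is reduced to a counter, and the lseg leaves the list at invalidation time without regard to other outstanding references.

```c
#include <stdbool.h>

enum { LSEG_VALID = 1 };	/* models NFS_LSEG_VALID */
enum { LAYOUT_DESTROYED = 1 };	/* models NFS_LAYOUT_DESTROYED */

/* Hypothetical stand-ins; the list is reduced to a count of members. */
struct hdr {
	unsigned int flags;
	int nr_lsegs;
};

struct lseg {
	unsigned int flags;
	int refcount;
	struct hdr *lo;
};

/* Add an lseg to the header; refused once the header is marked destroyed. */
static bool lseg_insert(struct hdr *lo, struct lseg *ls)
{
	if (lo->flags & LAYOUT_DESTROYED)
		return false;
	ls->lo = lo;
	ls->flags = LSEG_VALID;
	ls->refcount = 1;	/* the extra reference paired with LSEG_VALID */
	lo->nr_lsegs++;
	return true;
}

/* Clear the valid bit and drop its paired reference; safe to call twice. */
static void lseg_invalidate(struct lseg *ls)
{
	if (!(ls->flags & LSEG_VALID))
		return;
	ls->flags &= ~(unsigned int)LSEG_VALID;
	ls->refcount--;
	/* Removing the final lseg marks the header as destroyed. */
	if (--ls->lo->nr_lsegs == 0)
		ls->lo->flags |= LAYOUT_DESTROYED;
}
```

Tying the reference to the bit means the test-and-clear of the bit decides who drops that reference, so it cannot be dropped twice.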
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) layout drivers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) pNFS utilizes what are called layout drivers. The standard defines four
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) basic layout types: "files", "objects", "blocks", and "flexfiles". For each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) of these types there is a layout driver with a common table of function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) vectors, which the NFS client's pNFS core calls to implement the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) different layout types.
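
The function-vector idea can be illustrated with a small dispatch table. The sketch below is hypothetical and heavily reduced: ``struct layout_driver``, its single ``do_io`` operation, ``find_driver()`` and ``core_do_io()`` are invented for illustration and do not mirror the kernel's actual driver structure. Only the layout type numbers (1 for files, 3 for blocks) are taken from the NFSv4.1 protocol definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced driver table: one id, one name, one operation. */
struct layout_driver {
	uint32_t id;		/* layout type number from the protocol */
	const char *name;
	int (*do_io)(void);	/* invented operation, for illustration only */
};

/* Each driver supplies its own implementation of the shared operation. */
static int files_do_io(void)  { return 1; }
static int blocks_do_io(void) { return 3; }

static const struct layout_driver drivers[] = {
	{ 1, "files",  files_do_io },
	{ 3, "blocks", blocks_do_io },
};

/* Look up the driver registered for a layout type, or NULL if none. */
static const struct layout_driver *find_driver(uint32_t id)
{
	size_t i;

	for (i = 0; i < sizeof(drivers) / sizeof(drivers[0]); i++)
		if (drivers[i].id == id)
			return &drivers[i];
	return NULL;
}

/* The core calls through the table and never names a specific driver. */
static int core_do_io(uint32_t layout_type)
{
	const struct layout_driver *ld = find_driver(layout_type);

	return ld ? ld->do_io() : -1;
}
```

With this shape, supporting another layout type means adding one table entry; the calling code in the core stays unchanged.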
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) Files-layout-driver code is in: fs/nfs/filelayout/.. directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) Blocks-layout-driver code is in: fs/nfs/blocklayout/.. directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) Flexfiles-layout-driver code is in: fs/nfs/flexfilelayout/.. directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) blocks-layout setup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) TODO: Document the setup needs of the blocks layout driver