Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: GPL-2.0-only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) dm-clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) dm-clone is a device mapper target which produces a one-to-one copy of an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) existing, read-only source device into a writable destination device. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) presents a virtual block device which makes all data appear immediately, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) redirects reads and writes accordingly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) The main use case of dm-clone is to clone a potentially remote, high-latency,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) read-only, archival-type block device into a writable, fast, primary-type device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) for fast, low-latency I/O. The cloned device is visible/mountable immediately
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) and the copy of the source device to the destination device happens in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) background, in parallel with user I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) For example, one could restore an application backup from a read-only copy,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) accessible through a network storage protocol (NBD, Fibre Channel, iSCSI, AoE,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) etc.), into a local SSD or NVMe device, and start using the device immediately,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) without waiting for the restore to complete.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) When the cloning completes, the dm-clone table can be removed altogether and be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) replaced, e.g., by a linear table, mapping directly to the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) The dm-clone target reuses the metadata library used by the thin-provisioning
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) target.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) Glossary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35)    Hydration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36)      The process of filling a region of the destination device with data from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37)      the same region of the source device, i.e., copying the region from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38)      source to the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) Once a region gets hydrated we redirect all I/O regarding it to the destination
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) Design
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) ======
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) Sub-devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) The target is constructed by passing three devices to it (along with other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) parameters detailed later):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) 1. A source device - the read-only device that gets cloned and source of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53)    hydration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 2. A destination device - the destination of the hydration, which will become a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56)    clone of the source device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) 3. A small metadata device - it records which regions are already valid in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59)    destination device, i.e., which regions have already been hydrated, or have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60)    been written to directly, via user I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) The size of the destination device must be at least equal to the size of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) source device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) Regions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) dm-clone divides the source and destination devices into fixed-size regions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) Regions are the unit of hydration, i.e., the minimum amount of data copied from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) the source to the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) The region size is configurable when you first create the dm-clone device. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) recommended region size is the same as the file system block size, which is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) usually 4KB. The region size must be a power of two between 8 sectors (4KB) and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) 2097152 sectors (1GB).
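
The region size constraints above can be checked before the device is created.
The following is a minimal sketch, assuming a POSIX shell and 512-byte sectors;
the function name `is_valid_region_size` is hypothetical and is not part of
dmsetup or the kernel.

```shell
# Hypothetical helper: succeed if $1 is a valid dm-clone region size in
# 512-byte sectors, i.e., a power of two between 8 (4KB) and 2097152 (1GB).
is_valid_region_size() {
    n=$1
    # Reject empty or non-numeric input.
    case $n in
        ''|*[!0-9]*) return 1 ;;
    esac
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    [ "$n" -ge 8 ] && [ "$n" -le 2097152 ] && [ $(( $n & ($n - 1) )) -eq 0 ]
}
```

For example, `is_valid_region_size 8` succeeds, while `is_valid_region_size 12`
fails because 12 is not a power of two.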
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) Reads and writes from/to hydrated regions are serviced from the destination
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) A read to a not yet hydrated region is serviced directly from the source device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) A write to a not yet hydrated region starts the hydration of that region
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) immediately, and the write is delayed until the hydration completes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) Note that a write request with size equal to region size will skip copying of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) the corresponding region from the source device and overwrite the region of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) destination device directly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) Discards
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) dm-clone interprets a discard request to a range that hasn't been hydrated yet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) as a hint to skip hydration of the regions covered by the request, i.e., it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) skips copying the region's data from the source to the destination device, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) only updates its metadata.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) If the destination device supports discards, then by default dm-clone will pass
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) down discard requests to it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) Background Hydration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) dm-clone copies continuously from the source to the destination device, until
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) all of the device has been copied.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) Copying data from the source to the destination device uses bandwidth. The user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) can set a throttle to prevent more than a certain amount of copying occurring at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) any one time. Moreover, dm-clone takes into account user I/O traffic going to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) the devices and pauses the background hydration when there is I/O in-flight.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) A message `hydration_threshold <#regions>` can be used to set the maximum number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) of regions being copied, the default being 1 region.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) dm-clone employs dm-kcopyd for copying portions of the source device to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) destination device. By default, we issue copy requests of size equal to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) region size. A message `hydration_batch_size <#regions>` can be used to tune the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) size of these copy requests. Increasing the hydration batch size results in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) dm-clone trying to batch together contiguous regions, so we copy the data in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) batches of this many regions.
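
Both tunables are expressed in regions, so their effect in bytes depends on the
region size. As a rough sketch, assuming 512-byte sectors and a POSIX shell, the
hypothetical helper `regions_to_bytes` below converts a region count to bytes.
The values 256 and 16 are illustrative only.

```shell
# Hypothetical helper: convert a number of regions to bytes.
# $1 = number of regions, $2 = region size in 512-byte sectors.
regions_to_bytes() {
    echo $(( $1 * $2 * 512 ))
}

# hydration_threshold 256 with 8-sector regions: at most this many bytes
# are being copied by background hydration at any one time.
regions_to_bytes 256 8   # prints 1048576

# hydration_batch_size 16 with 8-sector regions: size of one batched
# copy request.
regions_to_bytes 16 8    # prints 65536
```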
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) When the hydration of the destination device finishes, a dm event will be sent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) to user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) Updating on-disk metadata
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) -------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) On-disk metadata is committed every time a FLUSH or FUA bio is written. If no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) such requests are made then commits will occur every second. This means the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) dm-clone device behaves like a physical disk that has a volatile write cache. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) power is lost you may lose some recent writes. The metadata should always be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) consistent in spite of any crash.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) Target Interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) Constructor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)   ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)    clone <metadata dev> <destination dev> <source dev> <region size>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)          [<#feature args> [<feature arg>]* [<#core args> [<core arg>]*]]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)  ================ ==============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)  metadata dev     Fast device holding the persistent metadata
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)  destination dev  The destination device, where the source will be cloned
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)  source dev       Read-only device containing the data that gets cloned
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)  region size      The size of a region in sectors
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)  #feature args    Number of feature arguments passed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)  feature args     no_hydration or no_discard_passdown
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)  #core args       An even number of arguments corresponding to key/value pairs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)                   passed to dm-clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)  core args        Key/value pairs passed to dm-clone, e.g. `hydration_threshold
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)                   256`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)  ================ ==============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) Optional feature arguments are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)  ==================== =========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)  no_hydration         Create a dm-clone instance with background hydration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)                       disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)  no_discard_passdown  Disable passing down discards to the destination device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)  ==================== =========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) Optional core arguments are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)  ================================ ==============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)  hydration_threshold <#regions>   Maximum number of regions being copied from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)                                   the source to the destination device at any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)                                   one time, during background hydration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173)  hydration_batch_size <#regions>  During background hydration, try to batch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)                                   together contiguous regions, so we copy data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)                                   from the source to the destination device in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)                                   batches of this many regions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177)  ================================ ==============================================
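
To illustrate how the argument counts precede the feature and core arguments,
the sketch below assembles a table line following the constructor syntax above.
It assumes a POSIX shell; the function name `clone_table` and its calling
convention are hypothetical, and the core argument group is emitted only when it
is non-empty.

```shell
# Hypothetical helper: print a dm-clone table line starting at sector 0.
# Usage: clone_table <length> <metadata dev> <destination dev> <source dev> \
#                    <region size> "<feature args>" "<core args>"
# The last two parameters are space-separated lists and may be empty.
clone_table() {
    length=$1 meta=$2 dest=$3 src=$4 region=$5 features=$6 core=$7

    # Count the words in each list; each count precedes its list.
    set -- $features
    nr_features=$#
    set -- $core
    nr_core=$#

    # Core arguments are key/value pairs, so their count must be even.
    if [ $(( nr_core % 2 )) -ne 0 ]; then
        echo "core args must be key/value pairs" >&2
        return 1
    fi

    line="0 $length clone $meta $dest $src $region $nr_features"
    if [ "$nr_features" -gt 0 ]; then
        line="$line $features"
    fi
    if [ "$nr_core" -gt 0 ]; then
        line="$line $nr_core $core"
    fi
    echo "$line"
}
```

The output is intended to be passed to `dmsetup create clone --table`. For
instance, calling the helper with no feature arguments and the core arguments
`hydration_threshold 256` yields a line ending in
`8 0 2 hydration_threshold 256` for an 8-sector region size.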
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) Status
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) ------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182)   ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184)    <metadata block size> <#used metadata blocks>/<#total metadata blocks>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)    <region size> <#hydrated regions>/<#total regions> <#hydrating regions>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186)    <#feature args> <feature args>* <#core args> <core args>*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187)    <clone metadata mode>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189)  ======================= =======================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190)  metadata block size     Fixed block size for each metadata block in sectors
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)  #used metadata blocks   Number of metadata blocks used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192)  #total metadata blocks  Total number of metadata blocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)  region size             Configurable region size for the device in sectors
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)  #hydrated regions       Number of regions that have finished hydrating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)  #total regions          Total number of regions to hydrate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196)  #hydrating regions      Number of regions currently hydrating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197)  #feature args           Number of feature arguments to follow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198)  feature args            Feature arguments, e.g. `no_hydration`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)  #core args              Even number of core arguments to follow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)  core args               Key/value pairs for tuning the core, e.g.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201)                          `hydration_threshold 256`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202)  clone metadata mode     ro if read-only, rw if read-write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)                          In serious cases where even a read-only mode is deemed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205)                          unsafe no further I/O will be permitted and the status
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)                          will just contain the string 'Fail'. If the metadata
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207)                          mode changes, a dm event will be sent to user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208)  ======================= =======================================================
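
The status fields can be picked apart with ordinary shell word splitting. The
sketch below assumes a POSIX shell and that its input is exactly the status
line described above; the function name `clone_status_summary` and its output
format are hypothetical.

```shell
# Hypothetical helper: read one dm-clone status line from stdin and print
# the hydration counters and the metadata mode.
clone_status_summary() {
    read -r line || [ -n "$line" ] || return 1
    set -- $line

    # A failed device reports only the string "Fail".
    if [ "$1" = "Fail" ]; then
        echo "mode=Fail"
        return 0
    fi

    # The shortest valid line has 8 fields: no feature and no core args.
    [ $# -ge 8 ] || return 1

    regions=$4      # <#hydrated regions>/<#total regions>
    hydrating=$5    # <#hydrating regions>

    # The metadata mode is the last field, after the variable-length lists.
    for mode in "$@"; do :; done

    echo "hydrated=${regions%/*} total=${regions#*/} hydrating=$hydrating mode=$mode"
}
```

Note that `dmsetup status` prints the start sector, length and target name
before the target-specific fields, so those three leading fields would need to
be stripped first, e.g. with `cut -d' ' -f4-`.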
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) Messages
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213)   `disable_hydration`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214)       Disable the background hydration of the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216)   `enable_hydration`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217)       Enable the background hydration of the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219)   `hydration_threshold <#regions>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)       Set background hydration threshold.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)   `hydration_batch_size <#regions>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223)       Set background hydration batch size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) Examples
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) Clone a device containing a file system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) ---------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) 1. Create the dm-clone device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)    ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235)     dmsetup create clone --table "0 1048576000 clone $metadata_dev $dest_dev \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)       $source_dev 8 1 no_hydration"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) 2. Mount the device and trim the file system. dm-clone interprets the discards
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239)    sent by the file system and it will not hydrate the unused space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)    ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)     mount /dev/mapper/clone /mnt/cloned-fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)     fstrim /mnt/cloned-fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 3. Enable background hydration of the destination device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248)    ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)     dmsetup message clone 0 enable_hydration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) 4. When the hydration finishes, we can replace the dm-clone table with a linear
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)    table.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255)    ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)     dmsetup suspend clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258)     dmsetup load clone --table "0 1048576000 linear $dest_dev 0"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259)     dmsetup resume clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261)    The metadata device is no longer needed and can be safely discarded or reused
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)    for other purposes.
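
Before performing step 4, it is worth confirming that the hydration has in fact
finished. One possible check, sketched below for a POSIX shell, compares the
hydrated and total region counts from the status line. The function name
`hydration_finished` is hypothetical, and its input is assumed to be the
target-specific status fields only.

```shell
# Hypothetical helper: read one dm-clone status line from stdin and succeed
# only if every region is hydrated and no region is currently hydrating.
hydration_finished() {
    read -r _mbs _meta _region regions hydrating _rest \
        || [ -n "$regions" ] || return 1

    # Expect <#hydrated regions>/<#total regions>; a failed device reports
    # only "Fail", which leaves this field empty.
    case $regions in
        */*) ;;
        *) return 1 ;;
    esac

    [ "${regions%/*}" = "${regions#*/}" ] && [ "$hydrating" = "0" ]
}

# Possible usage (requires an existing dm-clone device named "clone"):
#   dmsetup status clone | cut -d' ' -f4- | hydration_finished \
#       && echo "safe to switch to a linear table"
```

Polling is only one option: as noted earlier, a dm event is sent to user space
when the hydration finishes.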
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) Known issues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) 1. Reads to not-yet-hydrated regions are redirected to the source device. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)    reading the source device has high latency and the user repeatedly reads from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269)    the same regions, this behaviour could degrade performance. We should use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)    these reads as hints to hydrate the relevant regions sooner. Currently, we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271)    rely on the page cache to cache these regions, so we hopefully don't end up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)    reading them multiple times from the source device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 2. In-core resources, i.e., the bitmaps tracking which regions are hydrated,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275)    are not yet released after the hydration has finished.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) 3. During background hydration, if we fail to read the source or write to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)    destination device, we print an error message, but the hydration process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)    continues indefinitely, until it succeeds. We should stop the background
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280)    hydration after a number of failures and emit a dm event for user space to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)    notice.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) Why not...?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) We explored the following alternatives before implementing dm-clone:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 1. Use dm-cache with cache size equal to the size of the source device and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289)    implement a new cloning policy:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291)    * The resulting cache device is not a one-to-one mirror of the source device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292)      and thus we cannot remove the cache device once cloning completes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)    * dm-cache writes to the source device, which violates our requirement that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)      the source device must be treated as read-only.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)    * Caching is semantically different from cloning.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) 2. Use dm-snapshot with a COW device equal in size to the source device:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)    * dm-snapshot stores its metadata in the COW device, so the resulting device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302)      is not a one-to-one mirror of the source device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304)    * No background copying mechanism.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)    * dm-snapshot needs to commit its metadata whenever a pending exception
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)      completes, to ensure snapshot consistency. In the case of cloning, we don't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)      need to be so strict and can rely on committing metadata every time a FLUSH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309)      or FUA bio is written, or periodically, like dm-thin and dm-cache do. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)      improves the performance significantly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) 3. Use dm-mirror: The mirror target has a background copying/mirroring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)    mechanism, but it writes to all mirrors, thus violating our requirement that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314)    the source device must be treated as read-only.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) 4. Use dm-thin's external snapshot functionality. This approach is the most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317)    promising among all alternatives, as the thinly-provisioned volume is a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)    one-to-one mirror of the source device and handles reads and writes to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319)    un-provisioned/not-yet-cloned areas the same way as dm-clone does.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321)    Still:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)    * There is no background copying mechanism, though one could be implemented.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)    * Most importantly, we want to support arbitrary block devices as the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)      destination of the cloning process and not restrict ourselves to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)      thinly-provisioned volumes. Thin-provisioning has an inherent metadata
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328)      overhead, for maintaining the thin volume mappings, which significantly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329)      degrades performance.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)    Moreover, cloning a device shouldn't force the use of thin-provisioning. On
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332)    the other hand, if we wish to use thin provisioning, we can just use a thin
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333)    LV as dm-clone's destination device.