^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) dm_bow (backup on write)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) dm_bow is a device mapper driver that uses the free space on a device to back up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) data that is overwritten. The changes can then be committed by a simple state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) change, or rolled back by removing the dm_bow device and running a command line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) utility over the underlying device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) dm_bow has three states, set by writing ‘1’ or ‘2’ to /sys/block/dm-?/bow/state.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) It is only possible to go from state 0 (initial state) to state 1, and then from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) state 1 to state 2.
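
The transition rule above can be modelled in a few lines. This is only an
illustrative sketch: the class name BowState and its set_state method are
hypothetical, and it is a toy model of the rule as described here, not the
driver's code.

```python
class BowState:
    """Toy model of the dm_bow state value (hypothetical, not kernel code)."""

    def __init__(self):
        self.state = 0  # state 0 is the initial state

    def set_state(self, new_state):
        # Only 0 -> 1 and 1 -> 2 are allowed, as described in the text.
        if new_state not in (1, 2) or new_state != self.state + 1:
            raise ValueError(
                "cannot go from state %d to state %r" % (self.state, new_state)
            )
        self.state = new_state
```

In this model, skipping a state (0 to 2) or going backwards (2 to 1) raises
ValueError.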
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) State 0: dm_bow collects all trims to the device and assumes that these mark
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) free space on the overlying file system that can be safely used. Typically the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) mount code would create the dm_bow device, mount the file system, call the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) FITRIM ioctl on the file system, then switch to state 1. These trims are not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) propagated to the underlying device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) State 1: All writes to the device cause the underlying data to be backed up to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) the free (trimmed) area as needed, in such a way that it can be restored.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) However, the writes, with one exception, then happen exactly as they would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) without dm_bow, so the device is always in a good final state. The exception is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) that sector 0 is used to keep a log of the latest changes, both to indicate that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) we are in this state and to allow rollback. See below for all details. If there
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) isn't enough free space, writes are failed with -ENOSPC.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) State 2: The transition to state 2 triggers replacing the special sector 0 with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) the normal sector 0, and the freeing of all state information. dm_bow then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) becomes a pass-through driver, allowing the device to continue to be used with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) minimal performance impact.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) Usage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) =====
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) dm-bow takes one command line parameter, the name of the underlying device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) dm-bow will typically be used in the following way. dm-bow will be loaded with a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) suitable underlying device and the resultant device will be mounted. A file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) system trim will be issued via the FITRIM ioctl, then the device will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) switched to state 1. The file system will now be used as normal. At some point,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) the changes can either be committed by switching to state 2, or rolled back by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) unmounting the file system, removing the dm-bow device and running the command
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) line utility. Note that rebooting the device will be equivalent to unmounting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) and removing, but the command line utility must still be run.
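
As a rough illustration of the state switch in the sequence above, a small
helper can write the new state to the sysfs file. The function name
set_bow_state is hypothetical, and the caller has to supply the path of the
state file for the actual dm device. Creating the device, mounting it and
issuing FITRIM are not shown.

```python
from pathlib import Path


def set_bow_state(state_file, new_state):
    """Write '1' or '2' to a dm_bow state file (hypothetical helper).

    state_file is the path of the bow/state attribute of the dm device,
    i.e. /sys/block/dm-?/bow/state with the real device name filled in.
    """
    if new_state not in (1, 2):
        raise ValueError("state must be 1 or 2")
    Path(state_file).write_text(str(new_state))
```

A caller would invoke this once with 1 after the trim has completed, and
later once with 2 to commit the changes.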
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) Details of operation in state 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) dm_bow maintains a type for all sectors. A sector can be any of:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) SECTOR0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) SECTOR0_CURRENT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) UNCHANGED
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) FREE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) CHANGED
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) BACKUP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) SECTOR0 is the first sector on the device, and is used to hold the log of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) changes. This is the one exception to normal writes described under state 1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) SECTOR0_CURRENT is a sector picked from the FREE sectors, and is where reads and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) writes for the true sector zero are redirected. Note that, like any backup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) sector, if the sector is written to directly, it must be moved again.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) UNCHANGED means that the sector has not been changed since we entered state 1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) Thus if it is written to or trimmed, the contents must first be backed up.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) FREE means that the sector was trimmed in state 0 and has not yet been written
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) to or used for backup. On being written to, a FREE sector is changed to CHANGED.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) CHANGED means that the sector has been modified, and can be further modified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) without further backup.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) BACKUP means that this is a free sector being used as a backup. On being written
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) to, the contents must first be backed up again.
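
The write rules for the four ordinary sector types can be summarised as a
small lookup. This is a sketch only: SectorType and on_write are hypothetical
names, the two sector 0 types are left out because they are redirected rather
than written in place, and the assumption that a BACKUP sector becomes CHANGED
once its contents have been moved is not stated in the text above.

```python
from enum import Enum, auto


class SectorType(Enum):
    UNCHANGED = auto()
    FREE = auto()
    CHANGED = auto()
    BACKUP = auto()


def on_write(sector_type):
    """Return (needs_backup, type_after_write) for a write to a sector."""
    if sector_type is SectorType.UNCHANGED:
        # Original contents must be preserved before the write.
        return True, SectorType.CHANGED
    if sector_type is SectorType.FREE:
        # Nothing worth preserving; the sector simply becomes CHANGED.
        return False, SectorType.CHANGED
    if sector_type is SectorType.CHANGED:
        # Already modified; further writes need no further backup.
        return False, SectorType.CHANGED
    if sector_type is SectorType.BACKUP:
        # The backup held here must be moved first (resulting type assumed).
        return True, SectorType.CHANGED
    raise ValueError("unknown sector type: %r" % (sector_type,))
```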
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) All backup operations are logged to the first sector. The log sector has the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) format:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) | Magic | Count | Sequence | Log entry | Log entry | …
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) Magic is a magic number. Count is the number of log entries. Sequence is 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) initially. A log entry is:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) -----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) | Source | Dest | Size | Checksum |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) -----------------------------------
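
To make the layout concrete, the sketch below packs and unpacks a log sector
with Python's struct module. The field widths, byte order, sector size, magic
value and the use of CRC32 as the checksum are all assumptions made for
illustration; this document does not specify them, and the real on-disk
format may contain additional fields.

```python
import struct
import zlib

SECTOR_SIZE = 4096                 # assumed size of the log sector in bytes
MAGIC = 0x00574F42                 # placeholder magic value for this sketch
HEADER = struct.Struct("<III")     # magic, count, sequence (assumed widths)
ENTRY = struct.Struct("<QQII")     # source, dest, size, checksum (assumed)
MAX_ENTRIES = (SECTOR_SIZE - HEADER.size) // ENTRY.size


def make_entry(source, dest, data):
    """Build one log entry tuple; CRC32 is an assumed checksum choice."""
    return (source, dest, len(data), zlib.crc32(data) & 0xFFFFFFFF)


def pack_log_sector(sequence, entries):
    """Serialise header and entries, zero-padded to one sector."""
    if len(entries) > MAX_ENTRIES:
        raise ValueError("log sector is full")
    raw = HEADER.pack(MAGIC, len(entries), sequence)
    raw += b"".join(ENTRY.pack(*entry) for entry in entries)
    return raw.ljust(SECTOR_SIZE, b"\0")


def unpack_log_sector(raw):
    """Return (sequence, entries) parsed from a packed log sector."""
    magic, count, sequence = HEADER.unpack_from(raw, 0)
    if magic != MAGIC:
        raise ValueError("not a log sector")
    entries = [
        ENTRY.unpack_from(raw, HEADER.size + i * ENTRY.size)
        for i in range(count)
    ]
    return sequence, entries
```

With these assumed widths the header takes 12 bytes and each entry 24 bytes,
so one 4096-byte sector holds at most 170 entries.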
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) When SECTOR0 is full, the log sector is backed up and another empty log sector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) is created with sequence number one higher. The first entry in any log sector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) with sequence > 0 must therefore be the log of the backup of the previous log
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) sector. Note that sequence is not strictly needed, but is a useful sanity check
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) and potentially limits the time spent trying to restore a corrupted snapshot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) On entering state 1, dm_bow has a list of free sectors. All other sectors are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) unchanged. SECTOR0_CURRENT is selected from the free sectors and the contents of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) sector 0 are copied there. Sector 0 is then backed up, which triggers the first
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) log entry to be written.
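
The backup-then-write idea, and why it allows a rollback, can be shown with a
small in-memory model. This is a deliberately simplified sketch: the device
is a Python list, the log is a list of (source, dest) pairs instead of the
on-disk log sector, the special handling of sector 0 is omitted, and the
function names write and rollback are hypothetical. Replaying the log in
reverse order is an assumption about how a restore could work, not a
description of the command line utility.

```python
import errno


def write(dev, free, changed, log, sector, data):
    """Write data to a sector, backing up old contents first if needed."""
    if sector in free:
        # FREE sector: nothing to preserve, it just becomes CHANGED.
        free.remove(sector)
    elif sector not in changed:
        # UNCHANGED or BACKUP sector: copy its contents to a free sector.
        if not free:
            raise OSError(errno.ENOSPC, "no free sector left for backup")
        dest = free.pop(0)
        dev[dest] = dev[sector]
        log.append((sector, dest))
    changed.add(sector)
    dev[sector] = data


def rollback(dev, log):
    """Undo all logged backups, newest first, restoring original data."""
    for source, dest in reversed(log):
        dev[source] = dev[dest]
```

Reverse order matters when a sector holding a backup is itself overwritten:
its contents are moved to a second free sector, and the later log entry must
be undone first so that the earlier one copies back the right data.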
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)