===============
dm-service-time
===============

dm-service-time is a path selector module for device-mapper targets,
which selects a path with the shortest estimated service time for
the incoming I/O.

The service time for each path is estimated by dividing the total size
of in-flight I/Os on the path by the performance value of the path.
The performance value is a relative throughput value among all paths
in a path-group, and it can be specified as a table argument.

The path selector name is 'service-time'.

Table parameters for each path:

    [<repeat_count> [<relative_throughput>]]
        <repeat_count>:
                The number of I/Os to dispatch using the selected
                path before switching to the next path.
                If not given, an internal default is used. To check
                the default value, see the activated table.
        <relative_throughput>:
                The relative throughput value of the path
                among all paths in the path-group.
                The valid range is 0-100.
                If not given, the value '1' is used.
                If '0' is given, the path is not selected while
                other paths with a positive value are available.
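
The defaulting and range rules above can be sketched in a few lines of
Python. This is an illustration only, not the module's parser: the function
name ``parse_path_args`` is hypothetical, and because this document does not
state the internal default for <repeat_count>, the caller has to supply it.

```python
from typing import List, Tuple

MAX_RELATIVE_THROUGHPUT = 100  # upper bound of the documented 0-100 range


def parse_path_args(args: List[str], default_repeat_count: int) -> Tuple[int, int]:
    """Return (repeat_count, relative_throughput) for one path.

    `args` holds the optional per-path selector arguments as strings.
    `default_repeat_count` is a stand-in for the internal default, whose
    actual value is not given in this document.
    """
    if len(args) > 2:
        raise ValueError("at most two selector arguments per path")
    # <repeat_count> is optional; fall back to the supplied default.
    repeat_count = int(args[0]) if len(args) >= 1 else default_repeat_count
    # <relative_throughput> is optional; '1' is used when it is omitted.
    relative_throughput = int(args[1]) if len(args) == 2 else 1
    if not 0 <= relative_throughput <= MAX_RELATIVE_THROUGHPUT:
        raise ValueError("relative_throughput must be in the range 0-100")
    return repeat_count, relative_throughput
```

For example, ``parse_path_args(["128", "4"], default_repeat_count=1)`` returns
``(128, 4)``, and an empty argument list yields the supplied default together
with a relative throughput of 1.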

Status for each path:

    <status> <fail-count> <in-flight-size> <relative_throughput>
        <status>:
                'A' if the path is active, 'F' if the path is failed.
        <fail-count>:
                The number of path failures.
        <in-flight-size>:
                The size of in-flight I/Os on the path.
        <relative_throughput>:
                The relative throughput value of the path
                among all paths in the path-group.
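
As a reading aid, the four per-path status fields can be unpacked as shown
below. The names ``PathStatus`` and ``parse_path_status`` are hypothetical
helpers for this sketch; they are not part of dmsetup or the kernel.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PathStatus:
    active: bool              # True for 'A', False for 'F'
    fail_count: int           # number of path failures
    in_flight_size: int       # size of in-flight I/Os on the path
    relative_throughput: int  # value given in the table (or its default)


def parse_path_status(fields: Sequence[str]) -> PathStatus:
    """Parse the four status fields that follow a path's device number."""
    status, fail_count, in_flight_size, relative_throughput = fields
    if status not in ("A", "F"):
        raise ValueError(f"unexpected path status {status!r}")
    return PathStatus(
        active=(status == "A"),
        fail_count=int(fail_count),
        in_flight_size=int(in_flight_size),
        relative_throughput=int(relative_throughput),
    )
```

Applied to the fields ``A 0 0 4``, this gives an active path with no failures,
nothing in flight, and a relative throughput of 4.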


Algorithm
=========

dm-service-time adds the I/O size to 'in-flight-size' when the I/O is
dispatched and subtracts it when the I/O completes.
Basically, dm-service-time selects the path with the minimum service time,
which is calculated by::

    ('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput'
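
The accounting and the formula can be illustrated with a small Python model.
It is a sketch under stated assumptions, not the kernel code: the class
``PathState`` and its method names are invented for this example, sizes are
left in abstract units, and ``fractions.Fraction`` is used only to keep the
division exact.

```python
from fractions import Fraction


class PathState:
    """Toy model of the per-path bookkeeping described above."""

    def __init__(self, relative_throughput: int) -> None:
        self.relative_throughput = relative_throughput
        self.in_flight_size = 0

    def start_io(self, size: int) -> None:
        # The I/O size is added when the I/O is dispatched.
        self.in_flight_size += size

    def end_io(self, size: int) -> None:
        # The same size is subtracted when the I/O completes.
        self.in_flight_size -= size

    def service_time(self, incoming_size: int) -> Fraction:
        # Only meaningful for a non-zero relative_throughput; a zero
        # value raises ZeroDivisionError in this sketch.
        return Fraction(self.in_flight_size + incoming_size,
                        self.relative_throughput)
```

With ``relative_throughput`` 4 and one I/O of size 4096 in flight, a second
I/O of size 4096 has an estimated service time of (4096 + 4096) / 4 = 2048.
Once the first I/O completes, the estimate drops to 4096 / 4 = 1024.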

However, the optimizations below are used to avoid the calculation
wherever possible.

    1. If the paths have the same 'relative_throughput', skip
       the division and just compare the 'in-flight-size'.

    2. If the paths have the same 'in-flight-size', skip the division
       and just compare the 'relative_throughput'.

    3. If some paths have non-zero 'relative_throughput' and others
       have zero 'relative_throughput', ignore those paths with zero
       'relative_throughput'.

If none of these optimizations applies, calculate the service times and
compare them.
If the calculated service times are equal, the path with the higher
'relative_throughput' may be the better choice, so compare
'relative_throughput' as the tie-breaker.
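
The comparison order described above can be sketched as a pairwise
comparison folded over the paths of a group. This is a simplified model, not
the module's implementation: the names ``Path``, ``better`` and
``select_path`` are hypothetical, exact fractions stand in for whatever
arithmetic the module uses internally, and letting the earlier path win an
exact tie is an assumption of this sketch.

```python
from fractions import Fraction
from typing import NamedTuple, Optional, Sequence


class Path(NamedTuple):
    name: str
    in_flight_size: int
    relative_throughput: int


def better(a: Path, b: Path, incoming_size: int) -> Path:
    """Return the preferred of two paths; `a` wins exact ties."""
    # 1. Same relative_throughput: compare in-flight sizes only.
    if a.relative_throughput == b.relative_throughput:
        return a if a.in_flight_size <= b.in_flight_size else b
    # 2. Same in-flight size: the higher relative_throughput wins.
    if a.in_flight_size == b.in_flight_size:
        return a if a.relative_throughput > b.relative_throughput else b
    # 3. A zero-throughput path is ignored while the other is positive.
    if a.relative_throughput == 0:
        return b
    if b.relative_throughput == 0:
        return a
    # General case: compare the estimated service times.
    time_a = Fraction(a.in_flight_size + incoming_size, a.relative_throughput)
    time_b = Fraction(b.in_flight_size + incoming_size, b.relative_throughput)
    if time_a != time_b:
        return a if time_a < time_b else b
    # Equal service time: prefer the higher relative_throughput.
    return a if a.relative_throughput > b.relative_throughput else b


def select_path(paths: Sequence[Path], incoming_size: int) -> Optional[Path]:
    """Fold `better` over the paths; returns None for an empty group."""
    best: Optional[Path] = None
    for path in paths:
        best = path if best is None else better(best, path, incoming_size)
    return best
```

For instance, with sda idle at relative throughput 1 and sdb holding 8192 in
flight at relative throughput 4, an incoming I/O of size 4096 is estimated at
4096 on sda and 12288 / 4 = 3072 on sdb, so sdb is chosen.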


Examples
========
Suppose two paths (sda and sdb) are used with repeat_count == 128, where
sda has an average throughput of 1GB/s and sdb has 4GB/s. The
'relative_throughput' value may then be '1' for sda and '4' for sdb::

  # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \
    | dmsetup create test
  #
  # dmsetup table
  test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
  #
  # dmsetup status
  test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4
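
A table line of this shape can also be generated rather than typed by hand.
The helper below is hypothetical and only reproduces the layout used in this
example. The fixed fields are an assumption based on reading the example
line: no features, no hardware handler, a single path group, no selector
arguments, and two arguments per path.

```python
from typing import Sequence, Tuple


def service_time_table(length_sectors: int,
                       paths: Sequence[Tuple[str, int, int]]) -> str:
    """Build a single-group multipath table line using 'service-time'.

    Each entry of `paths` is (device, repeat_count, relative_throughput),
    where `device` is a major:minor string such as "8:0".
    """
    per_path = " ".join(f"{dev} {repeat} {throughput}"
                        for dev, repeat, throughput in paths)
    return (f"0 {length_sectors} multipath 0 0 1 1 "
            f"service-time 0 {len(paths)} 2 {per_path}")
```

Calling ``service_time_table(10, [("8:0", 128, 1), ("8:16", 128, 4)])``
produces the same string that is passed to ``dmsetup create`` above.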

Because the values are relative, '2' for sda and '8' for sdb would work
equally well::

  # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \
    | dmsetup create test
  #
  # dmsetup table
  test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
  #
  # dmsetup status
  test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8
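
A short check makes it plausible that the two settings behave alike: scaling
every 'relative_throughput' by the same factor scales every estimated service
time by the same factor, so the ordering of the paths does not change. The
function ``fastest`` below is a hypothetical helper that applies the formula
from the Algorithm section directly, without the shortcuts, and assumes
non-zero throughput values.

```python
from fractions import Fraction
from typing import Sequence


def fastest(in_flight_sizes: Sequence[int],
            throughputs: Sequence[int],
            incoming_size: int) -> int:
    """Return the index of the path with the smallest estimated service time."""
    times = [Fraction(queued + incoming_size, throughput)
             for queued, throughput in zip(in_flight_sizes, throughputs)]
    return times.index(min(times))


# Compare the 1:4 and 2:8 settings over a few in-flight states (sda, sdb).
for state in [(0, 0), (4096, 0), (0, 8192), (4096, 65536)]:
    assert fastest(state, (1, 4), 4096) == fastest(state, (2, 8), 4096)
```

The loop runs silently when both settings pick the same path for each of the
sampled states.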