Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. git blame: ^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300)

=========
dm-switch
=========

The device-mapper switch target creates a device that supports an
arbitrary mapping of fixed-size regions of I/O across a fixed set of
paths.  The path used for any specific region can be switched
dynamically by sending the target a message.

It maps I/O to underlying block devices efficiently when there is a large
number of fixed-sized address regions but there is no simple pattern
that would allow for a compact representation of the mapping such as
dm-stripe.

Background
----------

Dell EqualLogic and some other iSCSI storage arrays use a distributed
frameless architecture.  In this architecture, the storage group
consists of a number of distinct storage arrays ("members"), each having
independent controllers, disk storage and network adapters.  When a LUN
is created it is spread across multiple members.  The details of the
spreading are hidden from initiators connected to this storage system.
The storage group exposes a single target discovery portal, no matter
how many members are being used.  When iSCSI sessions are created, each
session is connected to an Ethernet port on a single member.  Data to a
LUN can be sent on any iSCSI session, and if the blocks being accessed
are stored on another member, the I/O will be forwarded as required.
This forwarding is invisible to the initiator.  The storage layout is
also dynamic, and the blocks stored on disk may be moved from member to
member as needed to balance the load.

This architecture simplifies the management and configuration of both
the storage group and initiators.  In a multipathing configuration, it
is possible to set up multiple iSCSI sessions to use multiple network
interfaces on both the host and target to take advantage of the
increased network bandwidth.  An initiator could use a simple round
robin algorithm to send I/O across all paths and let the storage array
members forward it as necessary, but there is a performance advantage to
sending data directly to the correct member.

A device-mapper table already lets you map different regions of a
device onto different targets.  However, in this architecture the LUN is
spread with an address region size on the order of tens of MBs, which
means the resulting table could have more than a million entries and
consume far too much memory.
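
To give a feel for the scale, the sketch below counts how many table
lines would be needed if every region had its own entry.  The 16 TiB LUN
and the 15 MiB region size are assumed figures chosen only for
illustration; the text above says no more than "tens of MBs".

```python
# Rough estimate of dm table size if every region needed its own table line.
# Both sizes used below are illustrative assumptions, not array parameters.
MIB = 1024 * 1024
TIB = 1024 * 1024 * MIB


def table_entries(lun_bytes: int, region_bytes: int) -> int:
    """Number of regions (rounded up) needed to cover the whole LUN."""
    return -(-lun_bytes // region_bytes)  # ceiling division


entries = table_entries(16 * TIB, 15 * MIB)
print(entries)
```

With these assumed sizes the count is a little over one million, which
is the order of magnitude the paragraph above refers to.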

Using this device-mapper switch target we can now build a two-layer
device hierarchy:

    Upper Tier - Determine which array member the I/O should be sent to.
    Lower Tier - Load balance amongst paths to a particular member.

The lower tier consists of a single dm multipath device for each member.
Each of these multipath devices contains the set of paths directly to
the array member in one priority group, and leverages existing path
selectors to load balance amongst these paths.  We also build a
non-preferred priority group containing paths to other array members for
failover reasons.

The upper tier consists of a single dm-switch device.  This device uses
a bitmap to look up the location of the I/O and choose the appropriate
lower tier device to route the I/O.  By using a bitmap we are able to
use 4 bits for each address range in a 16 member group (which is very
large for us).  This is a much denser representation than the dm table
b-tree can achieve.
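
The idea of a packed per-region lookup can be sketched as follows.  This
is a minimal illustration of storing one small path number per region,
not the kernel's actual data structure; the class and method names are
hypothetical.

```python
# Sketch of a packed region table: each region stores a path number in
# just enough bits to represent num_paths values (4 bits for 16 paths).
# Illustrative only; this is not how the kernel lays out its table.


class RegionTable:
    def __init__(self, num_regions: int, num_paths: int, region_size: int):
        self.num_regions = num_regions
        self.num_paths = num_paths
        self.region_size = region_size  # in 512-byte sectors
        self.bits = max(1, (num_paths - 1).bit_length())
        self.mask = (1 << self.bits) - 1
        # One Python int used as a bit field; every region starts on path 0.
        self.packed = 0

    def set_path(self, region: int, path_nr: int) -> None:
        if not 0 <= region < self.num_regions:
            raise ValueError("region out of range")
        if not 0 <= path_nr < self.num_paths:
            raise ValueError("path number out of range")
        shift = region * self.bits
        self.packed &= ~(self.mask << shift)  # clear the old entry
        self.packed |= path_nr << shift       # store the new one

    def path_for_sector(self, sector: int) -> int:
        region = sector // self.region_size
        return (self.packed >> (region * self.bits)) & self.mask

    def size_bytes(self) -> int:
        return (self.num_regions * self.bits + 7) // 8


table = RegionTable(num_regions=1_000_000, num_paths=16, region_size=128)
table.set_path(2, 5)
print(table.path_for_sector(2 * 128 + 7), table.size_bytes())
```

At 4 bits per region, one million regions fit in 500,000 bytes in this
sketch, which illustrates why a bitmap is so much denser than one table
line per region.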

Construction Parameters
=======================

    <num_paths> <region_size> <num_optional_args> [<optional_args>...] [<dev_path> <offset>]+
	<num_paths>
	    The number of paths across which to distribute the I/O.

	<region_size>
	    The number of 512-byte sectors in a region. Each region can be redirected
	    to any of the available paths.

	<num_optional_args>
	    The number of optional arguments. Currently, no optional arguments
	    are supported and so this must be zero.

	<dev_path>
	    The block device that represents a specific path to the device.

	<offset>
	    The offset of the start of data on the specific <dev_path> (in units
	    of 512-byte sectors). This number is added to the sector number when
	    forwarding the request to the specific path. Typically it is zero.
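
The parameter order above can be assembled programmatically.  The helper
below is a hypothetical sketch that builds a complete table line,
including the leading ``<start> <length> switch`` fields that dmsetup
expects before the target parameters; its name and argument shapes are
assumptions made for illustration.

```python
# Build a table line for the "switch" target from the parameters above.
# switch_table() is a hypothetical helper, not part of any existing tool.


def switch_table(start: int, length: int, region_size: int,
                 paths: list[tuple[str, int]]) -> str:
    """Return '<start> <length> switch <num_paths> <region_size> 0 ...'."""
    if not paths:
        raise ValueError("at least one <dev_path> <offset> pair is required")
    if region_size <= 0:
        raise ValueError("region_size must be a positive sector count")
    fields = [str(start), str(length), "switch",
              str(len(paths)), str(region_size),
              "0"]  # <num_optional_args>: must currently be zero
    for dev_path, offset in paths:
        fields += [dev_path, str(offset)]
    return " ".join(fields)


line = switch_table(0, 2048, 128,
                    [("/dev/vg1/switch0", 0), ("/dev/vg1/switch1", 0)])
print(line)
```

The device paths and the 2048-sector length are placeholder values; in
practice the length would come from something like
``blockdev --getsz`` on one of the underlying devices.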

Messages
========

set_region_mappings <index>:<path_nr> [<index>]:<path_nr> [<index>]:<path_nr>...

Modify the region table by specifying which regions are redirected to
which paths.

<index>
    The region number (region size was specified in constructor parameters).
    If index is omitted, the next region (previous index + 1) is used.
    Expressed in hexadecimal (WITHOUT any prefix like 0x).

<path_nr>
    The path number in the range 0 ... (<num_paths> - 1).
    Expressed in hexadecimal (WITHOUT any prefix like 0x).

R<n>,<m>
    This parameter allows repetitive patterns to be loaded quickly. <n> and <m>
    are hexadecimal numbers. The last <n> mappings are repeated in the next <m>
    slots.
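
The argument syntax above can be modelled with a small expander.  This
is a sketch of the documented semantics under one stated assumption:
``R<n>,<m>`` is interpreted as copying, for each of the next <m> regions,
the mapping of the region <n> positions earlier.  It is not the kernel's
parser, and the function name is hypothetical.

```python
# Expand set_region_mappings arguments into a {region: path} dictionary.
# A model of the documented syntax only; all numbers are hexadecimal.


def expand_mappings(args: list[str]) -> dict[int, int]:
    table: dict[int, int] = {}
    index = 0  # region that the next argument without an index applies to
    for arg in args:
        if arg.startswith("R"):
            n_hex, m_hex = arg[1:].split(",")
            n, m = int(n_hex, 16), int(m_hex, 16)
            if n < 1:
                raise ValueError(f"pattern length must be positive: {arg!r}")
            for _ in range(m):
                if index - n not in table:
                    raise ValueError(f"no earlier mapping to repeat: {arg!r}")
                table[index] = table[index - n]
                index += 1
        else:
            index_hex, path_hex = arg.split(":")
            if index_hex:  # an omitted index means "previous index + 1"
                index = int(index_hex, 16)
            table[index] = int(path_hex, 16)
            index += 1
    return table


short = expand_mappings(["1000:1", ":2", "R2,10"])
print(len(short), short[0x1000], short[0x1011])
```

Because <m> is hexadecimal, ``R2,10`` fills 16 further regions, so the
short form above covers 18 regions in total, starting at region 0x1000.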

Status
======

No status line is reported.

Example
=======

Assume that you have volumes vg1/switch0, vg1/switch1 and vg1/switch2,
all of the same size.

Create a switch device with a 64 KiB region size (128 sectors of 512
bytes)::

    dmsetup create switch --table "0 `blockdev --getsz /dev/vg1/switch0`
	switch 3 128 0 /dev/vg1/switch0 0 /dev/vg1/switch1 0 /dev/vg1/switch2 0"

Set mappings for the first 7 entries to point to devices switch0, switch1,
switch2, switch0, switch1, switch2, switch1::

    dmsetup message switch 0 set_region_mappings 0:0 :1 :2 :0 :1 :2 :1
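
A command line like the one above can be generated from a list of path
numbers.  The helper below is a hypothetical sketch; it only formats the
arguments (hexadecimal, no 0x prefix) and does not run dmsetup.

```python
# Turn a list of path numbers into set_region_mappings arguments.
# mapping_args() is a hypothetical helper used for illustration only.


def mapping_args(first_region: int, path_numbers: list[int]) -> list[str]:
    args = []
    for i, path_nr in enumerate(path_numbers):
        # Only the first argument carries an explicit index; later ones
        # rely on the "previous index + 1" default.
        index = format(first_region, "x") if i == 0 else ""
        args.append(f"{index}:{path_nr:x}")
    return args


command = ["dmsetup", "message", "switch", "0", "set_region_mappings"]
command += mapping_args(0, [0, 1, 2, 0, 1, 2, 1])
print(" ".join(command))
```

Printing the joined list reproduces the message shown above; passing the
list to a process launcher instead would be a separate design decision
and needs appropriate privileges.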

Set repetitive mapping. This command::

    dmsetup message switch 0 set_region_mappings 1000:1 :2 R2,10

is equivalent to::

    dmsetup message switch 0 set_region_mappings 1000:1 :2 :1 :2 :1 :2 :1 :2 \
	:1 :2 :1 :2 :1 :2 :1 :2 :1 :2