Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) Segmentation Offloads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) This document describes a set of techniques in the Linux networking stack
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) to take advantage of segmentation offload capabilities of various NICs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) The following technologies are described:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15)  * TCP Segmentation Offload - TSO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16)  * UDP Fragmentation Offload - UFO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17)  * IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18)  * Generic Segmentation Offload - GSO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19)  * Generic Receive Offload - GRO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20)  * Partial Generic Segmentation Offload - GSO_PARTIAL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21)  * SCTP acceleration with GSO - GSO_BY_FRAGS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) TCP Segmentation Offload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) TCP segmentation allows a device to segment a single frame into multiple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) frames with a data payload size specified in skb_shinfo()->gso_size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) When TCP segmentation is requested, the bit for either SKB_GSO_TCPV4 or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) SKB_GSO_TCPV6 should be set in skb_shinfo()->gso_type and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) skb_shinfo()->gso_size should be set to a non-zero value.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 
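The effect of gso_size on the resulting frames can be illustrated with a
small userspace model.  This is only a sketch of the arithmetic, not kernel
code: the function name tso_segment_sizes and the example sizes are
assumptions made for illustration.

```python
def tso_segment_sizes(payload_len: int, gso_size: int) -> list[int]:
    """Return the payload size of each frame cut from one large frame."""
    if gso_size <= 0:
        # A segmentation request needs a non-zero gso_size.
        raise ValueError("gso_size must be a positive value")
    full, rest = divmod(payload_len, gso_size)
    # Every frame carries gso_size bytes except possibly the last one.
    return [gso_size] * full + ([rest] if rest else [])


print(tso_segment_sizes(payload_len=4000, gso_size=1448))  # [1448, 1448, 1104]
```

Only the final frame may be shorter than gso_size; the sizes always sum to
the original payload length.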
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) TCP segmentation depends on support for partial checksum offload.  For
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) this reason, TSO is normally disabled if the Tx checksum offload for a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) given device is disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) To support TCP segmentation offload, the network and transport header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) offsets of the skbuff must be populated so that device drivers are able
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) to determine the offsets of the IP or IPv6 header and the TCP header.  In
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) addition, because CHECKSUM_PARTIAL is required, csum_start should also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) point to the TCP header of the packet.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) For IPv4 segmentation we support one of two behaviors for the IP ID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) The default behavior is to increment the IP ID with every segment.  If the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) GSO type SKB_GSO_TCP_FIXEDID is specified, then the IP ID is not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) incremented and all segments use the same IP ID.  If a device has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) NETIF_F_TSO_MANGLEID set, then the IP ID can be ignored when performing TSO,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) and the device will either increment the IP ID for all frames or leave it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) at a static value, based on driver preference.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) 
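The two IP ID behaviors can be modelled in a few lines.  This is a hedged
sketch rather than kernel code: segment_ip_ids and its fixed_id parameter
are hypothetical names, with fixed_id standing in for the
SKB_GSO_TCP_FIXEDID case.  The IPv4 ID is a 16-bit field, so the
incrementing case wraps around at 0xFFFF.

```python
def segment_ip_ids(first_id: int, num_segments: int, fixed_id: bool = False) -> list[int]:
    """Return the IPv4 ID carried by each segment of one large frame."""
    if fixed_id:
        # Models SKB_GSO_TCP_FIXEDID: every segment repeats the same ID.
        return [first_id] * num_segments
    # Default behavior: increment per segment, wrapping at 16 bits.
    return [(first_id + i) & 0xFFFF for i in range(num_segments)]


print(segment_ip_ids(0xFFFE, 3))                # [65534, 65535, 0]
print(segment_ip_ids(0x0007, 3, fixed_id=True))  # [7, 7, 7]
```

A device with NETIF_F_TSO_MANGLEID set may emit either of the two
sequences, since the IP ID can be ignored in that case.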
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) UDP Fragmentation Offload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) UDP fragmentation offload allows a device to fragment an oversized UDP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) datagram into multiple IPv4 fragments.  Many of the requirements for UDP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) fragmentation offload are the same as for TSO.  However, the IPv4 ID must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) not increment across fragments, because a single IPv4 datagram is fragmented.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) 
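The difference from TSO can be seen in a small model of IPv4
fragmentation.  This is an illustrative sketch, not the kernel's
implementation: ipv4_fragments is a hypothetical helper, and it assumes a
20-byte IPv4 header without options.  Fragment offsets are expressed in
8-byte units, as in the IPv4 header.

```python
def ipv4_fragments(l4_len: int, ip_id: int, mtu: int = 1500, ip_header_len: int = 20):
    """Split one UDP datagram (UDP header + payload, l4_len bytes) into fragments.

    Returns a list of (ip_id, offset_in_8_byte_units, length, more_fragments).
    """
    # All fragments except the last must carry a multiple of 8 bytes.
    chunk = (mtu - ip_header_len) // 8 * 8
    frags = []
    offset = 0
    while offset < l4_len:
        length = min(chunk, l4_len - offset)
        more = offset + length < l4_len
        # The same ip_id is reused: all fragments belong to one datagram.
        frags.append((ip_id, offset // 8, length, more))
        offset += length
    return frags


for frag in ipv4_fragments(l4_len=3008, ip_id=0x1234):
    print(frag)
# (4660, 0, 1480, True)
# (4660, 185, 1480, True)
# (4660, 370, 48, False)
```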
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) UFO is deprecated: modern kernels will no longer generate UFO skbs, but can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) still receive them from tuntap and similar devices. Offload of UDP-based
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) tunnel protocols is still supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) IPIP, SIT, GRE, UDP Tunnel, and Remote Checksum Offloads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) In addition to the offloads described above, a frame may contain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) additional headers, such as those of an outer tunnel.  To account for such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) cases, an additional set of segmentation offload types was introduced,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) including SKB_GSO_IPXIP4, SKB_GSO_IPXIP6, SKB_GSO_GRE, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) SKB_GSO_UDP_TUNNEL.  These extra segmentation types identify cases where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) there is more than one set of headers.  For example, in the case of IPIP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) and SIT, the network and transport headers are moved from the standard
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) list of headers to the "inner" header offsets.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) Currently only two levels of headers are supported.  The convention is to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) refer to the tunnel headers as the outer headers, while the encapsulated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) data is normally referred to as the inner headers.  Below is the list of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) calls to access the given headers:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) IPIP/SIT Tunnel::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84)              Outer                  Inner
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85)   MAC        skb_mac_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86)   Network    skb_network_header     skb_inner_network_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87)   Transport  skb_transport_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) UDP/GRE Tunnel::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91)              Outer                  Inner
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92)   MAC        skb_mac_header         skb_inner_mac_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93)   Network    skb_network_header     skb_inner_network_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94)   Transport  skb_transport_header   skb_inner_transport_header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 
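As a rough illustration of the UDP tunnel layout above, the following
userspace sketch computes where each header would start in a frame.  It is
an assumption-laden model, not kernel code: it takes VXLAN over IPv4 as the
example UDP tunnel, assumes headers without options, and the HeaderOffsets
class only borrows its field names from the accessor names in the table.

```python
from dataclasses import dataclass

ETH_HLEN = 14    # Ethernet header
IPV4_HLEN = 20   # IPv4 header without options
UDP_HLEN = 8     # UDP header
VXLAN_HLEN = 8   # VXLAN header (the example tunnel assumed here)


@dataclass
class HeaderOffsets:
    mac_header: int
    network_header: int
    transport_header: int
    inner_mac_header: int
    inner_network_header: int
    inner_transport_header: int


def vxlan_offsets() -> HeaderOffsets:
    """Byte offsets of outer and inner headers, counted from the frame start."""
    mac = 0
    network = mac + ETH_HLEN
    transport = network + IPV4_HLEN
    # The inner frame starts after the outer UDP and VXLAN headers.
    inner_mac = transport + UDP_HLEN + VXLAN_HLEN
    inner_network = inner_mac + ETH_HLEN
    inner_transport = inner_network + IPV4_HLEN
    return HeaderOffsets(mac, network, transport,
                         inner_mac, inner_network, inner_transport)


print(vxlan_offsets())
```

With these assumed header sizes the outer headers start at offsets 0, 14
and 34, and the inner headers at 50, 64 and 84.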
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) In addition to the above tunnel types, there are also SKB_GSO_GRE_CSUM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) and SKB_GSO_UDP_TUNNEL_CSUM.  These two additional tunnel types
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) indicate that a non-zero checksum is also requested in the outer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) header.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) Finally, there is SKB_GSO_TUNNEL_REMCSUM, which indicates that a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) tunnel header has requested a remote checksum offload.  In this case the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) inner headers will be left with a partial checksum and only the outer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) header checksum will be computed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) Generic Segmentation Offload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) ============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) Generic segmentation offload is a pure software offload that is meant to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) deal with cases where device drivers cannot perform the offloads described
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) above.  In GSO, the data of a given skbuff is split across multiple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) skbuffs, each resized to match the MSS provided via
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) skb_shinfo()->gso_size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) Before enabling any hardware segmentation offload, a corresponding software
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) offload is required in GSO.  Otherwise a frame could be re-routed between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) devices and end up on one that is unable to transmit it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) 
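The need for a software fallback can be sketched as a simple feature check.
The model below is hypothetical: the GsoType flags and the
needs_software_gso function are illustrative names with made-up bit values,
loosely mirroring the SKB_GSO_* types, and are not the kernel's
definitions.

```python
from enum import Flag, auto


class GsoType(Flag):
    # Illustrative subset of segmentation types; values are arbitrary.
    TCPV4 = auto()
    TCPV6 = auto()
    GRE = auto()
    UDP_TUNNEL = auto()


def needs_software_gso(requested: GsoType, device_supports: GsoType) -> bool:
    """True if the frame asks for any segmentation type the device lacks."""
    return bool(requested & ~device_supports)


nic = GsoType.TCPV4 | GsoType.TCPV6
print(needs_software_gso(GsoType.TCPV4, nic))                       # False
print(needs_software_gso(GsoType.TCPV4 | GsoType.UDP_TUNNEL, nic))  # True
```

In the second case the frame was built for a tunnel-capable device but ends
up on one without that capability, so it can only be transmitted if
software GSO segments it first.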
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) Generic Receive Offload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) Generic receive offload is the complement to GSO.  Ideally any frame
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) assembled by GRO should be segmented to create an identical sequence of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) frames using GSO, and any sequence of frames segmented by GSO should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) able to be reassembled back to the original by GRO.  The only exception to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) this is the IPv4 ID when the DF bit is set for a given IP header.  If the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) IPv4 ID values are not sequentially incrementing, they will be altered to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) be so when a frame assembled via GRO is segmented via GSO.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 
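The round-trip property and its IPv4 ID exception can be shown with a toy
model.  This is a simplified sketch, not how the kernel represents frames:
each frame is reduced to an (ip_id, payload) pair, and gro_merge and
gso_segment are hypothetical helper names.

```python
def gro_merge(frames):
    """Coalesce (ip_id, payload) frames into (first_ip_id, payload, gso_size)."""
    first_id = frames[0][0]
    gso_size = len(frames[0][1])  # size of the full-sized segments
    payload = b"".join(data for _, data in frames)
    return first_id, payload, gso_size


def gso_segment(first_id, payload, gso_size):
    """Split the payload again, assigning sequentially incrementing IP IDs."""
    return [((first_id + i) & 0xFFFF, payload[off:off + gso_size])
            for i, off in enumerate(range(0, len(payload), gso_size))]


original = [(10, b"aaaa"), (11, b"bbbb"), (12, b"cc")]
print(gso_segment(*gro_merge(original)) == original)  # True

# Frames whose IDs do not increment come back with incrementing IDs.
same_id = [(10, b"aaaa"), (10, b"bbbb"), (10, b"cc")]
print([ip_id for ip_id, _ in gso_segment(*gro_merge(same_id))])  # [10, 11, 12]
```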
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) Partial Generic Segmentation Offload
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) ====================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) Partial generic segmentation offload is a hybrid between TSO and GSO.  It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) takes advantage of certain traits of TCP and tunnels so that, instead of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) having to rewrite the packet headers for each segment, only the inner-most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) transport header and possibly the outer-most network header need to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) updated.  This allows devices that do not support tunnel offloads or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) tunnel offloads with checksum to still make use of segmentation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) With the partial offload, all headers excluding the inner transport header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) are updated so that they contain the correct values for the case where the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) header is simply duplicated.  The one exception to this is the outer IPv4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) ID field.  It is up to the device drivers to guarantee that the IPv4 ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) field is incremented in the case that a given header does not have the DF
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) bit set.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) SCTP acceleration with GSO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) ==========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) SCTP - despite the lack of hardware support - can still take advantage of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) GSO to pass one large packet through the network stack, rather than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) multiple small packets.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) This requires a different approach from other offloads, as SCTP packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) cannot be just segmented to (P)MTU.  Rather, the chunks must be contained in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) IP segments, padding respected.  So unlike regular GSO, SCTP can't just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) generate a big skb, set gso_size to the fragmentation point and deliver it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) to the IP layer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) Instead, the SCTP protocol layer builds an skb with the segments correctly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) padded and stored as chained skbs, and skb_segment() splits based on those.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) To signal this, gso_size is set to the special value GSO_BY_FRAGS.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) Therefore, any code in the core networking stack must be aware of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) possibility that gso_size will be GSO_BY_FRAGS and handle that case
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) appropriately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) There are some helpers to make this easier:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) - skb_is_gso(skb) && skb_is_gso_sctp(skb) is the best way to see if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)   an skb is an SCTP GSO skb.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) - For size checks, the skb_gso_validate_*_len family of helpers correctly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178)   considers GSO_BY_FRAGS.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) - For manipulating packets, skb_increase_gso_size and skb_decrease_gso_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)   will check for GSO_BY_FRAGS and WARN if asked to manipulate these skbs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) This also affects drivers with the NETIF_F_FRAGLIST and NETIF_F_GSO_SCTP bits
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) set. Note also that NETIF_F_GSO_SCTP is included in NETIF_F_GSO_SOFTWARE.
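
Why code must special-case GSO_BY_FRAGS can be seen in a small userspace
model.  This is a sketch of the idea only and does not reproduce the
kernel helpers: segment_lengths and validate_len are hypothetical names,
the per-fragment lengths are passed in as a plain list, and the sentinel
value 0xFFFF is assumed here to match the kernel's definition.

```python
GSO_BY_FRAGS = 0xFFFF  # sentinel; assumed to match the kernel's value


def segment_lengths(total_len, gso_size, frag_lens=()):
    """Return the length of each segment that would be produced."""
    if gso_size == GSO_BY_FRAGS:
        # SCTP case: boundaries come from the chained skbs, not from
        # arithmetic on gso_size.
        return list(frag_lens)
    full, rest = divmod(total_len, gso_size)
    return [gso_size] * full + ([rest] if rest else [])


def validate_len(total_len, gso_size, max_len, frag_lens=()):
    """True if every resulting segment fits within max_len."""
    return all(n <= max_len
               for n in segment_lengths(total_len, gso_size, frag_lens))


frags = [1200, 1400, 400]
print(segment_lengths(3000, GSO_BY_FRAGS, frags))     # [1200, 1400, 400]
print(validate_len(3000, GSO_BY_FRAGS, 1400, frags))  # True
print(validate_len(3000, GSO_BY_FRAGS, 1300, frags))  # False
```

Treating the sentinel as an ordinary size would make a naive check divide
the packet into 65535-byte pieces, which is why size checks need to handle
this case explicitly.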