Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) Open vSwitch datapath developer documentation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) The Open vSwitch kernel module allows flexible userspace control over
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) flow-level packet processing on selected network devices.  It can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) used to implement a plain Ethernet switch, network device bonding,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) VLAN processing, network access control, flow-based network control,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) and so on.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) The kernel module implements multiple "datapaths" (analogous to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) bridges), each of which can have multiple "vports" (analogous to ports
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) within a bridge).  Each datapath also has associated with it a "flow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) table" that userspace populates with "flows" that map from keys based
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) on packet headers and metadata to sets of actions.  The most common
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) action forwards the packet to another vport; other actions are also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) implemented.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) When a packet arrives on a vport, the kernel module processes it by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) extracting its flow key and looking it up in the flow table.  If there
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) is a matching flow, it executes the associated actions.  If there is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) no match, it queues the packet to userspace for processing (as part of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) its processing, userspace will likely set up a flow to handle further
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) packets of the same type entirely in-kernel).
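
The receive path described above can be modelled in a few lines. This is a toy sketch, not the kernel implementation: the `Datapath` class, its `flow_table` dictionary, the `upcalls` list, and the action strings are all hypothetical names chosen for illustration.

```python
class Datapath:
    """Toy model of one datapath: a flow table plus a queue to userspace."""

    def __init__(self):
        self.flow_table = {}   # flow key -> list of actions (set by userspace)
        self.upcalls = []      # (flow key, packet) pairs queued for userspace

    def receive(self, flow_key, packet):
        actions = self.flow_table.get(flow_key)
        if actions is None:
            # No matching flow: queue the packet, with its parsed key,
            # for userspace to process.
            self.upcalls.append((flow_key, packet))
            return None
        # Matching flow: apply every associated action in the datapath.
        return [(action, packet) for action in actions]


dp = Datapath()
key = ("in_port(1)", "eth_type(0x0800)")

first = dp.receive(key, b"pkt1")      # miss: queued to userspace
dp.flow_table[key] = ["output(2)"]    # userspace sets up a flow
second = dp.receive(key, b"pkt2")     # hit: handled without an upcall
```

After the flow is installed, later packets with the same key no longer reach the `upcalls` queue.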
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) Flow key compatibility
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) Network protocols evolve over time.  New protocols become important
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) and existing protocols lose their prominence.  For the Open vSwitch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) kernel module to remain relevant, it must be possible for newer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) versions to parse additional protocols as part of the flow key.  It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) might even be desirable, someday, to drop support for parsing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) protocols that have become obsolete.  Therefore, the Netlink interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) to Open vSwitch is designed to allow carefully written userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) applications to work with any version of the flow key, past or future.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) To support this forward and backward compatibility, whenever the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) kernel module passes a packet to userspace, it also passes along the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) flow key that it parsed from the packet.  Userspace then extracts its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) own notion of a flow key from the packet and compares it against the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) kernel-provided version:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47)     - If userspace's notion of the flow key for the packet matches the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48)       kernel's, then nothing special is necessary.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50)     - If the kernel's flow key includes more fields than the userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51)       version of the flow key, for example if the kernel decoded IPv6
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52)       headers but userspace stopped at the Ethernet type (because it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53)       does not understand IPv6), then again nothing special is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54)       necessary.  Userspace can still set up a flow in the usual way,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55)       as long as it uses the kernel-provided flow key to do it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57)     - If the userspace flow key includes more fields than the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58)       kernel's, for example if userspace decoded an IPv6 header but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59)       the kernel stopped at the Ethernet type, then userspace can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60)       forward the packet manually, without setting up a flow in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61)       kernel.  This case is bad for performance because every packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62)       that the kernel considers part of the flow must go to userspace,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63)       but the forwarding behavior is correct.  (If userspace can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64)       determine that the values of the extra fields would not affect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65)       forwarding behavior, then it could set up a flow anyway.)
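
The three cases above reduce to a comparison between the sets of fields each side parsed. The sketch below assumes a flow key is modelled as a dictionary from attribute name to value; the function name and the returned strings are hypothetical. It compares field names only, and it treats a key with fields missing on both sides like the third case, which is an assumption the text does not cover.

```python
def plan_for_upcall(kernel_key, user_key):
    """Decide how userspace should handle a packet the kernel passed up."""
    kernel_fields = set(kernel_key)
    user_fields = set(user_key)
    if user_fields <= kernel_fields:
        # Cases 1 and 2: keys agree, or the kernel parsed more fields.
        # A flow can be set up, provided the kernel-provided key is used.
        return "install-flow-with-kernel-key"
    # Case 3: userspace parsed fields the kernel did not. Forward manually,
    # unless the extra fields are known not to affect forwarding.
    return "forward-in-userspace"


same = plan_for_upcall({"eth": 1, "eth_type": 0x0800},
                       {"eth": 1, "eth_type": 0x0800})
kernel_more = plan_for_upcall({"eth": 1, "eth_type": 0x86DD, "ipv6": 2},
                              {"eth": 1, "eth_type": 0x86DD})
user_more = plan_for_upcall({"eth": 1, "eth_type": 0x86DD},
                            {"eth": 1, "eth_type": 0x86DD, "ipv6": 2})
```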
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) How flow keys evolve over time is important to making this work, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) the following sections go into detail.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) Flow key format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) A flow key is passed over a Netlink socket as a sequence of Netlink
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) attributes.  Some attributes represent packet metadata, defined as any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) information about a packet that cannot be extracted from the packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) itself, e.g. the vport on which the packet was received.  Most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) attributes, however, are extracted from headers within the packet,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) e.g. source and destination addresses from Ethernet, IP, or TCP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) headers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) The <linux/openvswitch.h> header file defines the exact format of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) flow key attributes.  For informal explanatory purposes here, we write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) them as comma-separated strings, with parentheses indicating arguments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) and nesting.  For example, the following could represent a flow key
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) corresponding to a TCP packet that arrived on vport 1::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88)     in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89)     eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=6, tos=0,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90)     frag=no), tcp(src=49163, dst=80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) Often we ellipsize arguments not important to the discussion, e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94)     in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) Wildcarded flow key format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) A wildcarded flow is described with two sequences of Netlink attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) passed over the Netlink socket: a flow key, exactly as described above, and an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) optional corresponding flow mask.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) A wildcarded flow can represent a group of exact match flows. Each '1' bit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) in the mask specifies an exact match with the corresponding bit in the flow key.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) A '0' bit specifies a don't care bit, which will match either a '1' or '0' bit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) of an incoming packet. Using wildcarded flows can improve the flow setup rate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) by reducing the number of new flows that the user space program must process.
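
The mask semantics above amount to a bitwise comparison restricted to the '1' bits of the mask. The sketch below applies it to a single 32-bit field (an IPv4 destination address) represented as an integer. The function name is hypothetical, and a real flow mask covers every field of the key, not just one.

```python
import ipaddress


def masked_match(packet_bits, key_bits, mask_bits):
    """True if the packet agrees with the key on every '1' bit of the mask."""
    return (packet_bits & mask_bits) == (key_bits & mask_bits)


def ip(text):
    """Convert dotted-quad text to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(text))


key = ip("172.18.0.52")
mask = 0xFFFFFF00            # exact match on the top 24 bits, don't care below

hit = masked_match(ip("172.18.0.99"), key, mask)    # differs only in wildcarded bits
miss = masked_match(ip("172.18.1.52"), key, mask)   # differs in an exact-match bit
exact = masked_match(ip("172.18.0.99"), key, 0xFFFFFFFF)  # all-ones mask
```

With an all-ones mask the wildcarded flow behaves like an exact match flow, so the first packet no longer matches.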
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) Support for the mask Netlink attribute is optional for both the kernel and user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) space program. The kernel can ignore the mask attribute, installing an exact
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) match flow, or reduce the number of don't care bits to fewer than the user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) space program specified. In this case, variations in bits
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) that the kernel does not implement will simply result in additional flow setups.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) The kernel module will also work with user space programs that neither support
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) nor supply flow mask attributes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) Since the kernel may ignore or modify wildcard bits, it can be difficult for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) the userspace program to know exactly what matches are installed. There are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) two possible approaches: reactively install flows as they miss the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) flow table (and therefore not attempt to determine wildcard changes at all)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) or use the kernel's response messages to determine the installed wildcards.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) When interacting with userspace, the kernel should maintain the match portion
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) of the key exactly as originally installed. This provides a handle to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) identify the flow for all future operations. However, when reporting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) mask of an installed flow, the mask should include any restrictions imposed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) by the kernel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) The behavior when using overlapping wildcarded flows is undefined. It is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) responsibility of the user space program to ensure that any incoming packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) can match at most one flow, wildcarded or not. The current implementation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) performs best-effort detection of overlapping wildcarded flows and may reject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) some but not all of them. However, this behavior may change in future versions.
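
A user space program can check for overlap itself before installing a flow. Two wildcarded flows overlap exactly when their keys agree on every bit that both masks require to match, because a packet can then be built that satisfies both. The sketch below shows that test on integer-encoded keys and masks. The function name is hypothetical, and this is not necessarily how the kernel's best-effort detection works.

```python
def flows_overlap(key_a, mask_a, key_b, mask_b):
    """True if some packet could match both wildcarded flows."""
    common = mask_a & mask_b            # bits both flows match exactly
    return ((key_a ^ key_b) & common) == 0


# 10.0.0.0/8 and 10.1.0.0/16: the second is a subset of the first.
nested = flows_overlap(0x0A000000, 0xFF000000, 0x0A010000, 0xFFFF0000)

# 10.0.0.0/8 and 11.0.0.0/8: they differ in a bit both masks require.
disjoint = flows_overlap(0x0A000000, 0xFF000000, 0x0B000000, 0xFF000000)
```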
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) Unique flow identifiers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) An alternative to using the original match portion of a key as the handle for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) flow identification is a unique flow identifier, or "UFID". UFIDs are optional
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) for both the kernel and user space program.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) User space programs that support UFID are expected to provide it during flow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) setup in addition to the flow, then refer to the flow using the UFID for all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) future operations. The kernel is not required to index flows by the original
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) flow key if a UFID is specified.
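
A flow table indexed by UFID might look like the sketch below. It treats the UFID as an opaque byte string chosen by userspace. The 16-byte length, the class, and its method names are assumptions for illustration, not the kernel's data structures.

```python
class UfidFlowTable:
    """Toy flow table that uses the UFID, not the flow key, as the handle."""

    def __init__(self):
        self.by_ufid = {}   # UFID bytes -> (flow key, actions)

    def install(self, ufid, flow_key, actions):
        # Userspace supplies the UFID in addition to the flow at setup time.
        self.by_ufid[ufid] = (flow_key, actions)

    def get(self, ufid):
        return self.by_ufid.get(ufid)

    def delete(self, ufid):
        # Later operations refer to the flow by UFID alone.
        return self.by_ufid.pop(ufid, None) is not None


table = UfidFlowTable()
ufid = bytes(range(16))     # hypothetical 16-byte identifier
table.install(ufid, "in_port(1), eth_type(0x0800)", ["output(2)"])
found = table.get(ufid)
removed = table.delete(ufid)
```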
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) Basic rule for evolving flow keys
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) ---------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) Some care is needed to really maintain forward and backward
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) compatibility for applications that follow the rules listed under
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) "Flow key compatibility" above.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) The basic rule is obvious::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)     ==================================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)     New network protocol support must only supplement existing flow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)     key attributes.  It must not change the meaning of already defined
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)     flow key attributes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)     ==================================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) This rule does have less-obvious consequences so it is worth working
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) through a few examples.  Suppose, for example, that the kernel module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) did not already implement VLAN parsing.  Instead, it just interpreted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) packet.  The flow key for any packet with an 802.1Q header would look
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) essentially like this, ignoring metadata::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)     eth(...), eth_type(0x8100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) Naively, to add VLAN support, it makes sense to add a new "vlan" flow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) key attribute to contain the VLAN tag, then continue to decode the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) encapsulated headers beyond the VLAN tag using the existing field
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) definitions.  With this change, a TCP packet in VLAN 10 would have a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) flow key much like this::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)     eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) But this change would negatively affect a userspace application that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) has not been updated to understand the new "vlan" flow key attribute.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) The application could, following the flow compatibility rules above,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) ignore the "vlan" attribute that it does not understand and therefore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) assume that the flow contained IP packets.  This is a bad assumption
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) (the flow only contains IP packets if one parses and skips over the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 802.1Q header) and it could cause the application's behavior to change
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) across kernel versions even though it follows the compatibility rules.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) The solution is to use a set of nested attributes.  This is, for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) example, why 802.1Q support uses nested attributes.  A TCP packet in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) VLAN 10 is actually expressed as::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)     eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196)     ip(proto=6, ...), tcp(...)))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) Notice how the "eth_type", "ip", and "tcp" flow key attributes are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) nested inside the "encap" attribute.  Thus, an application that does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) not understand the "vlan" key will not see any of those attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) and therefore will not misinterpret them.  (Also, the outer eth_type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) is still 0x8100, not changed to 0x0800.)
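
The difference between the naive and nested layouts shows up when modelling an application that predates VLAN support. In the sketch below a flow key is a list of (name, value) pairs, with "encap" holding a nested list. The `visible` helper and the `KNOWN` set are hypothetical and stand in for an old application that ignores top-level attributes it does not understand.

```python
KNOWN = {"eth", "eth_type", "ip", "tcp"}   # attributes the old application knows


def visible(flow_key, known=KNOWN):
    """Return the top-level attributes an old application would interpret."""
    return {name: value for name, value in flow_key if name in known}


naive = [("eth", "..."), ("vlan", "vid=10, pcp=0"), ("eth_type", 0x0800),
         ("ip", "proto=6"), ("tcp", "...")]

nested = [("eth", "..."), ("eth_type", 0x8100), ("vlan", "vid=10, pcp=0"),
          ("encap", [("eth_type", 0x0800), ("ip", "proto=6"), ("tcp", "...")])]

naive_view = visible(naive)     # wrongly looks like a plain IP/TCP flow
nested_view = visible(nested)   # only eth and the outer eth_type remain
```

With the naive layout the old application sees an Ethertype of 0x0800 plus "ip" and "tcp" attributes. With the nested layout it sees only an Ethertype of 0x8100, which matches what an older kernel would have reported.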
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) Handling malformed packets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) Don't drop packets in the kernel for malformed protocol headers, bad
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) checksums, etc.  This would prevent userspace from implementing a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) simple Ethernet switch that forwards every packet.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) Instead, in such a case, include an attribute with "empty" content.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) It doesn't matter if the empty content could be valid protocol values,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) as long as those values are rarely seen in practice, because userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) can always have all packets with those values sent to userspace and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) handle them individually.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) For example, consider a packet that contains an IP header that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) indicates protocol 6 for TCP, but which is truncated just after the IP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) header, so that the TCP header is missing.  The flow key for this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) packet would include a tcp attribute with all-zero src and dst, like
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) this::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223)     eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) As another example, consider a packet with an Ethernet type of 0x8100,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) indicating that a VLAN TCI should follow, but which is truncated just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) after the Ethernet type.  The flow key for this packet would include
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) an all-zero-bits vlan and an empty encap attribute, like this::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)     eth(...), eth_type(0x8100), vlan(0), encap()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) Unlike a TCP packet with source and destination ports 0, an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) attribute expressly to allow this situation to be distinguished.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) Thus, the flow key in this second example unambiguously indicates a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) missing or malformed VLAN TCI.
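
The truncated-TCP example can be reproduced with a small parser that emits an all-zero tcp attribute instead of dropping the packet. This is a simplified sketch: the `extract_key` function and its dictionary output are hypothetical, it handles only untagged Ethernet, IPv4, and TCP ports, and it does not validate the IPv4 header length field.

```python
import struct

ETH_HLEN = 14        # destination MAC, source MAC, Ethernet type
IPV4_MIN_HLEN = 20


def extract_key(frame):
    """Build a partial flow key; never reject a frame for being malformed."""
    key = {}
    if len(frame) < ETH_HLEN:
        return key
    (eth_type,) = struct.unpack_from("!H", frame, 12)
    key["eth_type"] = eth_type
    if eth_type != 0x0800 or len(frame) < ETH_HLEN + IPV4_MIN_HLEN:
        return key
    header_len = (frame[ETH_HLEN] & 0x0F) * 4   # IHL in 32-bit words
    proto = frame[ETH_HLEN + 9]
    key["ip"] = {"proto": proto}
    if proto == 6:
        l4 = ETH_HLEN + header_len
        if len(frame) >= l4 + 4:
            src, dst = struct.unpack_from("!HH", frame, l4)
        else:
            src, dst = 0, 0      # TCP header missing: "empty" content
        key["tcp"] = {"src": src, "dst": dst}
    return key


eth = bytes(12) + b"\x08\x00"
ipv4 = bytearray(IPV4_MIN_HLEN)
ipv4[0] = 0x45               # version 4, header length 5 words
ipv4[9] = 6                  # protocol 6 (TCP)

complete = extract_key(eth + bytes(ipv4) + struct.pack("!HH", 49163, 80))
truncated = extract_key(eth + bytes(ipv4))
```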
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) Other rules
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) The other rules for flow keys are much less subtle:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)     - Duplicate attributes are not allowed at a given nesting level.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246)     - Ordering of attributes is not significant.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248)     - When the kernel sends a given flow key to userspace, it always
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)       composes it the same way.  This allows userspace to hash and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)       compare entire flow keys that it may not be able to fully
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251)       interpret.
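
The first two rules can be checked mechanically. The sketch below uses the same hypothetical model as the informal notation, a list of (name, value) pairs in which a nested attribute holds another list, and converts it to a dictionary. Conversion rejects duplicates within one nesting level, and comparing the resulting dictionaries ignores attribute order.

```python
def to_dict(attrs):
    """Convert a list of (name, value) pairs, rejecting duplicates per level."""
    out = {}
    for name, value in attrs:
        if name in out:
            raise ValueError(f"duplicate attribute {name!r} at one nesting level")
        # Nested attributes (such as encap) form their own nesting level.
        out[name] = to_dict(value) if isinstance(value, list) else value
    return out


a = [("eth_type", 0x8100), ("vlan", 10), ("encap", [("eth_type", 0x0800)])]
b = [("vlan", 10), ("encap", [("eth_type", 0x0800)]), ("eth_type", 0x8100)]

same = to_dict(a) == to_dict(b)   # ordering is not significant
```

Here "eth_type" appears twice in each key, but at different nesting levels, so it is accepted.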