Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) ==================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) IP over InfiniBand
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ==================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5)   The ib_ipoib driver is an implementation of the IP over InfiniBand
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6)   protocol as specified by RFC 4391 and 4392, issued by the IETF ipoib
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7)   working group.  It is a "native" implementation in the sense of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8)   setting the interface type to ARPHRD_INFINIBAND and the hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9)   address length to 20 (earlier proprietary implementations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10)   masqueraded to the kernel as ethernet interfaces).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) Partitions and P_Keys
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15)   When the IPoIB driver is loaded, it creates one interface for each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16)   port using the P_Key at index 0.  To create an interface with a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17)   different P_Key, write the desired P_Key into the main interface's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18)   /sys/class/net/<intf name>/create_child file.  For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20)     echo 0x8001 > /sys/class/net/ib0/create_child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22)   This will create an interface named ib0.8001 with P_Key 0x8001.  To
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23)   remove a subinterface, use the "delete_child" file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25)     echo 0x8001 > /sys/class/net/ib0/delete_child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27)   The P_Key for any interface is given by the "pkey" file, and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28)   main interface for a subinterface is in "parent".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30)   Child interfaces can also be created and deleted using IPoIB's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31)   rtnl_link_ops; children created either way behave the same.
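
The sysfs steps above can be wrapped in a few shell functions. This is a minimal sketch, not part of the driver: the function names and the SYSFS_NET override are hypothetical and exist only so the helpers can be tried against a scratch directory. The child name format (parent, a dot, four hex digits) is inferred from the ib0.8001 example above.

```shell
#!/bin/sh
# Hypothetical override; on a real system this stays /sys/class/net.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Expected child name for a parent and P_Key, following the ib0.8001
# example in the text (assumption: four lowercase hex digits, no 0x).
ipoib_child_name() {
    printf '%s.%04x\n' "$1" "$(($2))"
}

# Create a child interface: ipoib_create_child ib0 0x8001
ipoib_create_child() {
    echo "$2" > "$SYSFS_NET/$1/create_child"
}

# Remove a child interface: ipoib_delete_child ib0 0x8001
ipoib_delete_child() {
    echo "$2" > "$SYSFS_NET/$1/delete_child"
}

# Print the P_Key of an interface and, for a child, its parent.
ipoib_show_pkey() {
    printf 'pkey=%s\n' "$(cat "$SYSFS_NET/$1/pkey")"
    if [ -e "$SYSFS_NET/$1/parent" ]; then
        printf 'parent=%s\n' "$(cat "$SYSFS_NET/$1/parent")"
    fi
}
```

Run as root, ipoib_create_child ib0 0x8001 followed by ipoib_show_pkey ib0.8001 would report the key and parent of the new child. For the rtnl_link_ops path, recent iproute2 versions accept a command of the form "ip link add link ib0 name ib0.8001 type ipoib pkey 0x8001"; check ip-link(8) on your system before relying on it.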
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) Datagram vs Connected modes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36)   The IPoIB driver supports two modes of operation: datagram and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37)   connected.  The mode is set and read through an interface's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38)   /sys/class/net/<intf name>/mode file.
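
Reading and switching the mode can be sketched as follows. The helper names and the SYSFS_NET override are hypothetical; the sketch assumes the mode file takes the literal strings "datagram" and "connected", matching the mode names used in this document.

```shell
#!/bin/sh
# Hypothetical override; on a real system this stays /sys/class/net.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# Print the current mode of an IPoIB interface.
ipoib_get_mode() {
    cat "$SYSFS_NET/$1/mode"
}

# Switch an interface between the two modes, rejecting anything else
# before it reaches sysfs.
ipoib_set_mode() {
    case "$2" in
        datagram|connected)
            echo "$2" > "$SYSFS_NET/$1/mode"
            ;;
        *)
            echo "mode must be 'datagram' or 'connected'" >&2
            return 1
            ;;
    esac
}
```

For example, ipoib_set_mode ib0 connected followed by ipoib_get_mode ib0 should print the new mode if the hardware and driver build support connected mode.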
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40)   In datagram mode, the IB UD (Unreliable Datagram) transport is used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41)   and so the interface MTU is equal to the IB L2 MTU minus the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42)   IPoIB encapsulation header (4 bytes).  For example, in a typical IB
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43)   fabric with a 2K MTU, the IPoIB MTU will be 2048 - 4 = 2044 bytes.
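
The datagram-mode arithmetic above is simple enough to script. In this sketch the function name and the IPOIB_ENCAP_LEN variable are illustrative; only the 4-byte header size comes from the text.

```shell
#!/bin/sh
# Size of the IPoIB encapsulation header in bytes, as stated above.
IPOIB_ENCAP_LEN=4

# Datagram-mode interface MTU for a given IB L2 MTU in bytes.
ipoib_ud_mtu() {
    echo "$(( $1 - IPOIB_ENCAP_LEN ))"
}
```

With a 2K fabric, ipoib_ud_mtu 2048 prints 2044, matching the example in the paragraph above.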
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45)   In connected mode, the IB RC (Reliable Connected) transport is used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46)   Connected mode takes advantage of the connected nature of the IB
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47)   transport and allows an MTU up to the maximal IP packet size of 64K,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48)   which reduces the number of IP packets needed for handling large UDP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49)   datagrams, TCP segments, etc., and increases performance for large
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50)   messages.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52)   In connected mode, the interface's UD QP is still used for multicast
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53)   and communication with peers that don't support connected mode. In
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54)   this case, RX emulation of ICMP PMTU packets is used to cause the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55)   networking stack to use the smaller UD MTU for these neighbours.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) Stateless offloads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) ==================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60)   If the IB HW supports IPoIB stateless offloads, IPoIB advertises
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61)   TCP/IP checksum and/or Large Send (LSO) offloading capability to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62)   network stack.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64)   Large Receive (LRO) offloading is also implemented and may be turned
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65)   on/off using ethtool calls.  Currently LRO is supported only for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66)   checksum offload capable devices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68)   Stateless offloads are supported only in datagram mode.
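
The ethtool calls mentioned above might look like the following sketch. The wrapper names and the ETHTOOL override are hypothetical; "-k" (show features) and "-K <dev> lro on|off" are standard ethtool options, but whether a given device accepts the change depends on the hardware.

```shell
#!/bin/sh
# Hypothetical override so the wrappers can be dry-run with a stub.
ETHTOOL="${ETHTOOL:-ethtool}"

# List the offload features currently advertised for an interface.
ipoib_show_offloads() {
    "$ETHTOOL" -k "$1"
}

# Turn LRO on or off: ipoib_set_lro ib0 on
ipoib_set_lro() {
    case "$2" in
        on|off)
            "$ETHTOOL" -K "$1" lro "$2"
            ;;
        *)
            echo "usage: ipoib_set_lro <intf> on|off" >&2
            return 1
            ;;
    esac
}
```

Because stateless offloads apply only in datagram mode, it is worth checking the interface's mode file before expecting ipoib_set_lro to have any effect.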
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) Interrupt moderation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73)   If the underlying IB device supports CQ event moderation, one can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74)   use ethtool to set interrupt mitigation parameters and thus reduce
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75)   the overhead incurred by handling interrupts.  The main code path of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76)   IPoIB doesn't use events for TX completion signaling, so only RX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77)   moderation is supported.
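
A sketch of setting RX moderation through ethtool follows. The wrapper names and the ETHTOOL override are hypothetical, and the values in the usage note are placeholders rather than tuning advice. "-c" and "-C" with rx-usecs and rx-frames are standard ethtool coalescing options.

```shell
#!/bin/sh
# Hypothetical override so the wrappers can be dry-run with a stub.
ETHTOOL="${ETHTOOL:-ethtool}"

# Show the current coalescing settings for an interface.
ipoib_show_moderation() {
    "$ETHTOOL" -c "$1"
}

# Set RX moderation: ipoib_set_rx_moderation <intf> <usecs> <frames>
# Only the RX parameters are passed, since TX moderation is not supported.
ipoib_set_rx_moderation() {
    "$ETHTOOL" -C "$1" rx-usecs "$2" rx-frames "$3"
}
```

For instance, ipoib_set_rx_moderation ib0 16 8 asks the device to wait up to 16 microseconds or 8 frames before raising an RX interrupt; the request fails if the underlying IB device lacks CQ event moderation.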
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) Debugging Information
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82)   By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83)   to 'y', tracing messages are compiled into the driver.  They are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84)   turned on by setting the module parameters debug_level and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85)   mcast_debug_level to 1.  These parameters can be controlled at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86)   runtime through files in /sys/module/ib_ipoib/parameters/.
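
Toggling the tracing messages at runtime can be sketched like this. The function name and the IPOIB_PARAMS override are hypothetical; the sketch assumes the parameters appear under the module's parameters/ directory in sysfs, which is where the kernel exposes writable module parameters.

```shell
#!/bin/sh
# Hypothetical override; on a real system the default path applies.
IPOIB_PARAMS="${IPOIB_PARAMS:-/sys/module/ib_ipoib/parameters}"

# Set both tracing parameters to the same level (0 = off, 1 = on).
# Fails early if the driver was built without CONFIG_INFINIBAND_IPOIB_DEBUG,
# in which case the parameter files are absent.
ipoib_set_debug() {
    for param in debug_level mcast_debug_level; do
        if [ ! -e "$IPOIB_PARAMS/$param" ]; then
            echo "$param not found under $IPOIB_PARAMS" >&2
            return 1
        fi
        echo "$1" > "$IPOIB_PARAMS/$param"
    done
}
```

Running ipoib_set_debug 1 as root enables the messages, and ipoib_set_debug 0 turns them off again.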
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88)   CONFIG_INFINIBAND_IPOIB_DEBUG also enables files in the debugfs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89)   virtual filesystem.  By mounting this filesystem, for example with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91)     mount -t debugfs none /sys/kernel/debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93)   it is possible to get statistics about multicast groups from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94)   files /sys/kernel/debug/ipoib/ib0_mcg and so on.
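
Once debugfs is mounted, the per-interface multicast files can be enumerated with a small helper. This is only a sketch: the function name and the DEBUGFS override are hypothetical, and the "<intf>_mcg" naming is taken from the ib0_mcg example above.

```shell
#!/bin/sh
# Hypothetical override; on a real system this stays /sys/kernel/debug.
DEBUGFS="${DEBUGFS:-/sys/kernel/debug}"

# Print the path of every multicast group statistics file that exists.
# Prints nothing if debugfs is not mounted or the driver lacks debug support.
ipoib_mcg_files() {
    for file in "$DEBUGFS"/ipoib/*_mcg; do
        if [ -e "$file" ]; then
            echo "$file"
        fi
    done
    return 0
}
```

Each path printed by ipoib_mcg_files can then be read with cat to inspect the multicast groups of the corresponding interface.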
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96)   The performance impact of this option is negligible, so it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97)   is safe to enable this option with debug_level set to 0 for normal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98)   operation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)   CONFIG_INFINIBAND_IPOIB_DEBUG_DATA enables even more debug output in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)   the data path when data_debug_level is set to 1.  However, even with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)   the output disabled, enabling this configuration option will affect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)   performance, because it adds tests to the fast path.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) References
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)   Transmission of IP over InfiniBand (IPoIB) (RFC 4391)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)     http://ietf.org/rfc/rfc4391.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)   IP over InfiniBand (IPoIB) Architecture (RFC 4392)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)     http://ietf.org/rfc/rfc4392.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)   IP over InfiniBand: Connected Mode (RFC 4755)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)     http://ietf.org/rfc/rfc4755.txt