.. SPDX-License-Identifier: GPL-2.0

===========================================
HOWTO for multiqueue network device support
===========================================

Section 1: Base driver requirements for implementing multiqueue support
=======================================================================

Intro: Kernel support for multiqueue devices
--------------------------------------------

Kernel support for multiqueue devices is always present.

Base drivers are required to use the alloc_etherdev_mq() or
alloc_netdev_mq() functions to allocate the subqueues for the device. The
underlying kernel API takes care of allocating and freeing the subqueue
memory, and of recording in the netdev where the queues reside in memory.

The base driver also needs to manage each queue individually, much as it
manages the global netdev->queue_lock today. Base drivers should therefore
use the netif_{start|stop|wake}_subqueue() functions to manage each queue
while the device is operational. netdev->queue_lock is still used when the
device comes online or when it is completely shut down (unregister_netdev(),
etc.).

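The per-queue flow control described above can be illustrated with a small
userspace model. This is a sketch only, not driver code: struct model_dev,
model_xmit(), model_tx_complete(), TXQ_COUNT and RING_SIZE are hypothetical
names invented for this example, and the ring accounting is an assumption
about how a typical driver tracks free descriptors. The comments mark the
points where a real driver would call netif_stop_subqueue() and
netif_wake_subqueue().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TXQ_COUNT 4 /* hypothetical number of Tx subqueues */
#define RING_SIZE 2 /* hypothetical descriptors per Tx ring */

struct tx_ring {
	unsigned int in_flight; /* descriptors handed to the "hardware" */
	bool stopped;           /* models the stopped state of one subqueue */
};

struct model_dev {
	struct tx_ring ring[TXQ_COUNT];
};

/* Model of the transmit path: returns false if the subqueue is stopped. */
static bool model_xmit(struct model_dev *dev, uint16_t queue_mapping)
{
	assert(queue_mapping < TXQ_COUNT);
	struct tx_ring *r = &dev->ring[queue_mapping];

	if (r->stopped)
		return false;

	r->in_flight++;
	if (r->in_flight == RING_SIZE)
		r->stopped = true; /* a driver would call netif_stop_subqueue() here */
	return true;
}

/* Model of Tx completion: frees one descriptor and restarts the subqueue. */
static void model_tx_complete(struct model_dev *dev, uint16_t queue_mapping)
{
	assert(queue_mapping < TXQ_COUNT);
	struct tx_ring *r = &dev->ring[queue_mapping];

	assert(r->in_flight > 0);
	r->in_flight--;
	if (r->stopped)
		r->stopped = false; /* a driver would call netif_wake_subqueue() here */
}
```

Filling ring 1 stops only subqueue 1. A call to model_xmit() for queue 0 still
succeeds, which is the reason for managing each subqueue separately.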

Section 2: Qdisc support for multiqueue devices
===============================================

Currently two qdiscs are optimized for multiqueue devices. The first is the
default pfifo_fast qdisc, which supports one qdisc per hardware queue. The
second is a round-robin qdisc, sch_multiq, which also supports multiple
hardware queues. The qdisc is responsible for classifying the skbs and then
directing them to bands and queues based on the value in skb->queue_mapping.
Use this field in the base driver to determine which queue to send the skb to.

sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking. It cycles through the bands and verifies that the hardware queue
associated with a band is not stopped before dequeuing a packet from it.
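
The dequeue rule above can be modelled in plain C. The following is a
userspace sketch under stated assumptions, not the sch_multiq source: packets
are integer IDs instead of skbs, each band is a fixed-size FIFO, and
struct mq_sched, mq_init(), mq_enqueue(), mq_dequeue(), MQ_BANDS and
MQ_BAND_CAP are hypothetical names. It shows how skipping bands whose hardware
queue is stopped lets the remaining bands keep draining.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MQ_BANDS    4 /* assumed: one band per Tx queue */
#define MQ_BAND_CAP 8 /* hypothetical per-band FIFO depth */

struct mq_band {
	int pkts[MQ_BAND_CAP]; /* packet IDs stand in for skbs */
	size_t head, len;
	bool stopped;          /* models a stopped hardware queue */
};

struct mq_sched {
	struct mq_band band[MQ_BANDS];
	size_t curband;        /* band visited most recently */
};

static void mq_init(struct mq_sched *q)
{
	memset(q, 0, sizeof(*q));
	q->curband = MQ_BANDS - 1; /* so the first dequeue starts at band 0 */
}

/* Queue a packet on the band selected by queue_mapping; false if full. */
static bool mq_enqueue(struct mq_sched *q, size_t queue_mapping, int pkt)
{
	assert(queue_mapping < MQ_BANDS);
	struct mq_band *b = &q->band[queue_mapping];

	if (b->len == MQ_BAND_CAP)
		return false;
	b->pkts[(b->head + b->len) % MQ_BAND_CAP] = pkt;
	b->len++;
	return true;
}

/* Round-robin over the bands, skipping stopped or empty ones.
 * Returns a packet ID, or -1 if nothing can be sent right now.
 */
static int mq_dequeue(struct mq_sched *q)
{
	for (size_t i = 0; i < MQ_BANDS; i++) {
		q->curband = (q->curband + 1) % MQ_BANDS;
		struct mq_band *b = &q->band[q->curband];

		if (b->stopped || b->len == 0)
			continue;

		int pkt = b->pkts[b->head];
		b->head = (b->head + 1) % MQ_BAND_CAP;
		b->len--;
		return pkt;
	}
	return -1;
}
```

With packets queued on bands 0, 1 and 2 and band 1 marked stopped, successive
mq_dequeue() calls keep returning packets from bands 0 and 2 and leave band 1
untouched until its stopped flag is cleared.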

On qdisc load, the number of bands is based on the number of queues on the
hardware. Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.


Section 3: Brief howto using MULTIQ for multiqueue devices
==========================================================

The userspace command 'tc', part of the iproute2 package, is used to configure
qdiscs. To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command::

    # tc qdisc add dev eth0 root handle 1: multiq

The qdisc will allocate a number of bands equal to the number of queues that
the device reports, and bring the qdisc online. Assuming eth0 has 4 Tx
queues, the band mapping would look like::

    band 0 => queue 0
    band 1 => queue 1
    band 2 => queue 2
    band 3 => queue 3

Traffic will begin flowing through each queue based on either the
simple_tx_hash function or netdev->select_queue(), if you have defined it.
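
As a rough illustration of the two selection paths, the userspace sketch below
prefers a driver-supplied callback and otherwise scales a flow hash into the
range of Tx queues. Everything here is an assumption made for illustration:
struct flow_key, flow_hash(), pick_tx_queue() and always_last_queue() are
hypothetical names, FNV-1a is a stand-in and not the hash the kernel uses, and
falling back to queue 0 for an out-of-range callback result is a choice of
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow description; a real stack reads this from the skb. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Optional driver hook, modelled after the select_queue() idea. */
typedef uint16_t (*select_queue_fn)(const struct flow_key *key,
				    uint16_t num_queues);

/* Mix the four bytes of v into an FNV-1a hash state. */
static uint32_t fnv1a_mix(uint32_t h, uint32_t v)
{
	for (int i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xffu;
		h *= 16777619u;
	}
	return h;
}

/* Stand-in flow hash (FNV-1a); not the kernel's hash function. */
static uint32_t flow_hash(const struct flow_key *key)
{
	uint32_t h = 2166136261u;

	h = fnv1a_mix(h, key->saddr);
	h = fnv1a_mix(h, key->daddr);
	h = fnv1a_mix(h, ((uint32_t)key->sport << 16) | key->dport);
	return h;
}

/* Example hook: always choose the highest-numbered queue. */
static uint16_t always_last_queue(const struct flow_key *key,
				  uint16_t num_queues)
{
	(void)key;
	return (uint16_t)(num_queues - 1);
}

/* Use the hook when present, otherwise map the hash onto the queue range. */
static uint16_t pick_tx_queue(const struct flow_key *key, uint16_t num_queues,
			      select_queue_fn select_queue)
{
	assert(num_queues > 0);

	if (select_queue) {
		uint16_t q = select_queue(key, num_queues);
		return q < num_queues ? q : 0; /* sketch's choice: clamp to 0 */
	}
	/* Scale the 32-bit hash into [0, num_queues) without a modulo. */
	return (uint16_t)(((uint64_t)flow_hash(key) * num_queues) >> 32);
}
```

Because the hash depends only on the flow key, all packets of one flow map to
the same queue, while a hook such as always_last_queue() overrides the hash
entirely.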

The behavior of tc filters remains the same. However, a new tc action,
skbedit, has been added. If you wanted to route all traffic to a specific
host, for example 192.168.0.3, through a specific queue, you could use this
action and establish a filter such as::

    tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
        match ip dst 192.168.0.3 \
        action skbedit queue_mapping 3

:Author: Alexander Duyck <alexander.h.duyck@intel.com>
:Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>