Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM), in the networking context, refers to
changing the interrupt moderation configuration of a channel in order to
optimize packet processing. The mechanism includes an algorithm which decides
if and how to change the moderation parameters of a channel, usually by
analysing runtime data sampled from the system. Net DIM is such a mechanism. In
each iteration of the algorithm, it analyses a given data sample, compares it
to the previous sample and, if required, changes some of the interrupt
moderation configuration fields. A data sample is composed of the data
bandwidth, the number of packets and the number of events. The time between
samples is also measured. Net DIM compares the current and the previous data and
returns an adjusted interrupt moderation configuration object. In some cases,
the algorithm might decide not to change anything. The configuration fields are
the minimum duration (in microseconds) allowed between events and the maximum
number of packets wanted per event. The Net DIM algorithm gives priority to
increasing bandwidth over reducing the interrupt rate.
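
To make the role of the measured time between samples concrete, the following
is a minimal user-space sketch of how two raw samples could be turned into
rates. All names here (``sketch_sample``, ``sketch_rates``,
``sketch_calc_rates()``) are hypothetical and are not part of the kernel API.
The per-second units are also an assumption of this sketch.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical stand-in for one raw data sample (cumulative counters). */
  struct sketch_sample {
	uint64_t time_us;	/* timestamp of the sample, in microseconds */
	uint64_t bytes;		/* bytes seen so far */
	uint64_t packets;	/* packets seen so far */
	uint64_t events;	/* events (interrupts) seen so far */
  };

  /* Hypothetical rates derived from two consecutive samples. */
  struct sketch_rates {
	uint64_t bytes_per_sec;
	uint64_t packets_per_sec;
	uint64_t events_per_sec;
  };

  /*
   * Derive rates from the counter deltas and the time between samples.
   * Returns false if no time has elapsed. Counter wraparound and
   * multiplication overflow are ignored in this sketch.
   */
  static bool sketch_calc_rates(const struct sketch_sample *prev,
				const struct sketch_sample *cur,
				struct sketch_rates *out)
  {
	uint64_t delta_us = cur->time_us - prev->time_us;

	if (delta_us == 0)
		return false;

	out->bytes_per_sec = (cur->bytes - prev->bytes) * 1000000ULL / delta_us;
	out->packets_per_sec = (cur->packets - prev->packets) * 1000000ULL / delta_us;
	out->events_per_sec = (cur->events - prev->events) * 1000000ULL / delta_us;
	return true;
  }

In this sketch the caller only supplies cumulative counters and timestamps. The
division by the elapsed time is what makes samples taken at irregular intervals
comparable.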


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculates a new data sample.
#. Compares it to the previous sample.
#. Makes a decision: suggests interrupt moderation configuration fields.
#. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and the previous data and rates the result of the last step.
A step is rated "better" if bandwidth increased and "worse" if bandwidth
decreased. If there is no change in bandwidth, the packet rate is compared in a
similar fashion: an increase is "better" and a decrease is "worse". If the
packet rate did not change either, the interrupt rate is compared. Here the
algorithm tries to optimize for a lower interrupt rate, so an increase in the
interrupt rate is considered "worse" and a decrease is considered "better".
Step #2 has an optimization for avoiding false results: it only considers a
difference between samples as valid if it is greater than a certain percentage.
Also, since Net DIM does not measure anything by itself, it assumes the data
provided by the driver is valid.
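
The comparison order described above can be sketched in user-space C as
follows. This is an illustration only: the names are hypothetical, and the 10
percent threshold is an assumed value chosen for the sketch, since the text
only states that "a certain percentage" is required.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  enum sketch_result { SKETCH_WORSE, SKETCH_SAME, SKETCH_BETTER };

  /* Hypothetical per-sample rates; not the kernel's data structure. */
  struct sketch_rates {
	uint64_t bytes_per_sec;
	uint64_t packets_per_sec;
	uint64_t events_per_sec;
  };

  /* Assumed threshold: smaller differences are treated as noise. */
  #define SKETCH_MIN_DIFF_PERCENT 10

  /* True if cur differs from prev by more than the threshold percentage. */
  static bool sketch_significant(uint64_t cur, uint64_t prev)
  {
	uint64_t diff = cur > prev ? cur - prev : prev - cur;

	if (prev == 0)
		return cur != 0;
	return diff * 100 > prev * SKETCH_MIN_DIFF_PERCENT;
  }

  /* Bandwidth first, then packet rate, then interrupt (event) rate. */
  static enum sketch_result sketch_compare(const struct sketch_rates *cur,
					   const struct sketch_rates *prev)
  {
	if (sketch_significant(cur->bytes_per_sec, prev->bytes_per_sec))
		return cur->bytes_per_sec > prev->bytes_per_sec ?
		       SKETCH_BETTER : SKETCH_WORSE;

	if (sketch_significant(cur->packets_per_sec, prev->packets_per_sec))
		return cur->packets_per_sec > prev->packets_per_sec ?
		       SKETCH_BETTER : SKETCH_WORSE;

	/* For the interrupt rate, lower is better. */
	if (sketch_significant(cur->events_per_sec, prev->events_per_sec))
		return cur->events_per_sec < prev->events_per_sec ?
		       SKETCH_BETTER : SKETCH_WORSE;

	return SKETCH_SAME;
  }

Note how the interrupt rate is only consulted when neither bandwidth nor packet
rate changed significantly, which reflects the priority of bandwidth over
interrupt rate.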

Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: whether it is going left (reducing moderation), going right
(increasing moderation) or standing still. Another optimization is that if a
decision to stay still is made multiple times, the interval between iterations
of the algorithm increases in order to reduce calculation overhead. Also, after
"parking" on the leftmost or rightmost decision, the algorithm may decide to
verify this decision by taking a step in the other direction. This is done in
order to avoid getting stuck in a "deep sleep" scenario. Once a decision is
made, an interrupt moderation configuration is selected from the predefined
profiles.
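
The direction-based decision can be illustrated with a heavily simplified
user-space sketch. Everything in it is hypothetical: the names, the profile
values and the rule used to leave the parked state are assumptions made for
illustration, and the sketch leaves out the growing iteration interval and the
verification step taken after parking.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  enum sketch_result { SKETCH_WORSE, SKETCH_SAME, SKETCH_BETTER };
  enum sketch_dir { SKETCH_PARKED, SKETCH_GOING_LEFT, SKETCH_GOING_RIGHT };

  /* One moderation configuration: min usec between events, max packets. */
  struct sketch_moder {
	uint16_t usec;
	uint16_t pkts;
  };

  /* Illustrative profile table, ordered from least to most moderation. */
  static const struct sketch_moder sketch_profiles[] = {
	{ 1, 8 }, { 8, 32 }, { 32, 64 }, { 64, 128 }, { 128, 256 },
  };
  #define SKETCH_NUM_PROFILES \
	(sizeof(sketch_profiles) / sizeof(sketch_profiles[0]))

  struct sketch_state {
	enum sketch_dir dir;
	size_t profile_ix;
  };

  /* Update the state from a comparison result and return the profile. */
  static const struct sketch_moder *sketch_decide(struct sketch_state *st,
						  enum sketch_result res)
  {
	if (res == SKETCH_SAME) {
		/* Nothing changed: stand still on the current profile. */
		st->dir = SKETCH_PARKED;
		return &sketch_profiles[st->profile_ix];
	}

	if (st->dir == SKETCH_PARKED)
		/* Arbitrary choice of this sketch: go left unless leftmost. */
		st->dir = st->profile_ix ? SKETCH_GOING_LEFT : SKETCH_GOING_RIGHT;
	else if (res == SKETCH_WORSE)
		/* The last step made things worse: turn around. */
		st->dir = st->dir == SKETCH_GOING_LEFT ?
			  SKETCH_GOING_RIGHT : SKETCH_GOING_LEFT;

	/* Take one step in the chosen direction, parking at the edges. */
	if (st->dir == SKETCH_GOING_LEFT) {
		if (st->profile_ix > 0)
			st->profile_ix--;
		else
			st->dir = SKETCH_PARKED;
	} else {
		if (st->profile_ix + 1 < SKETCH_NUM_PROFILES)
			st->profile_ix++;
		else
			st->dir = SKETCH_PARKED;
	}
	return &sketch_profiles[st->profile_ix];
  }

A "better" result keeps the current direction, a "worse" result reverses it,
and reaching either end of the table parks the algorithm in this sketch.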

The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data were supplied
to it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim(). This function is the entry
point to the Net DIM algorithm and has to be called every time the driver would
like to check whether it should change the interrupt moderation parameters. The
driver should provide two data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous
data samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next step. The sample should include
bytes, packets and interrupts, measured by the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
byte, packet and interrupt counts. There is no need to perform any
calculations; just include the raw data.

The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
will be scheduled and run in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.


Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
	/* Get struct dim from struct work_struct */
	struct dim *dim = container_of(work, struct dim,
				       work);
	/* Do interrupt moderation related stuff */
	...

	/* Signal net DIM work is done and it should move to next iteration */
	dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
	...
	/* A struct to hold the currently measured data */
	struct dim_sample dim_sample = {};
	...
	/* Initialize the data sample struct with the current raw counters */
	dim_update_sample(my_entity->events,
			  my_entity->packets,
			  my_entity->bytes,
			  &dim_sample);
	/* Call net DIM */
	net_dim(&my_entity->dim, dim_sample);
	...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
	...
	/* Initialize struct work_struct with my driver's callback function */
	INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
	...
  }

Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal: