^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ======================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Net DIM - Generic Network Dynamic Interrupt Moderation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ======================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) :Author: Tal Gilboa <talgi@mellanox.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) .. contents:: :depth: 2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) Assumptions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) This document assumes the reader has basic knowledge of network drivers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) and of interrupt moderation in general.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) In networking, Dynamic Interrupt Moderation (DIM) refers to changing the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) interrupt moderation configuration of a channel in order to optimize packet
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) processing. The mechanism includes an algorithm which decides if and how to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) change moderation parameters for a channel, usually by performing an analysis on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) runtime data sampled from the system. Net DIM is such a mechanism. In each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) iteration of the algorithm, it analyses a given sample of the data, compares it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) to the previous sample and, if required, decides to change some of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) interrupt moderation configuration fields. The data sample is composed of data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) bandwidth, the number of packets and the number of events. The time between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) samples is also measured. Net DIM compares the current and the previous data and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) returns an adjusted interrupt moderation configuration object. In some cases,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) the algorithm might decide not to change anything. The configuration fields are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) the minimum duration (in microseconds) allowed between events and the maximum
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) number of packets wanted per event. The Net DIM algorithm gives priority to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) increasing bandwidth over reducing the interrupt rate.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) Net DIM Algorithm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) Each iteration of the Net DIM algorithm follows these steps:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) #. Calculates a new data sample.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) #. Compares it to the previous sample.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) #. Makes a decision: suggests interrupt moderation configuration fields.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) #. Schedules a work function, which applies the suggested configuration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) The first two steps are straightforward: both the new and the previous data are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) supplied by the driver registered to Net DIM. The previous data is the new data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) supplied to the previous iteration. The comparison step checks the difference
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) between the new and previous data and rates the outcome of the last step taken.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) A step is rated "better" if bandwidth increases and "worse" if bandwidth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) decreases. If there is no change in bandwidth, the packet rate is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) compared in a similar fashion: an increase is "better" and a decrease is "worse".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) If there is no change in the packet rate either, the interrupt rate is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) compared. Here the algorithm tries to optimize for a lower interrupt rate, so an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) increase in the interrupt rate is considered "worse" and a decrease is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) considered "better". Step #2 has an optimization for avoiding false results: it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) only considers a difference between samples as valid if it is greater than a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) certain percentage. Also, since Net DIM does not measure anything by itself, it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) assumes the data provided by the driver is valid.
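
The comparison order described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: the names
(``struct rate_stats``, ``compare_rates()``) are hypothetical, and the
10 percent threshold is an assumed value standing in for the "certain
percentage" mentioned in the text. It is not the kernel implementation.

.. code-block:: c

  #include <stdlib.h>

  /* Hypothetical result type for the comparison step. */
  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

  /* Hypothetical per-sample rates (already normalized by time). */
  struct rate_stats {
          long bytes_per_ms;
          long packets_per_ms;
          long events_per_ms;
  };

  /* Assumed threshold: smaller differences are treated as noise. */
  #define SIGNIFICANT_PERCENT 10

  static int is_significant(long val, long ref)
  {
          if (ref == 0)
                  return val != 0;
          return 100 * labs(val - ref) / ref > SIGNIFICANT_PERCENT;
  }

  /* Bandwidth first, then packet rate, then interrupt rate. */
  static enum cmp_result compare_rates(const struct rate_stats *curr,
                                       const struct rate_stats *prev)
  {
          if (is_significant(curr->bytes_per_ms, prev->bytes_per_ms))
                  return curr->bytes_per_ms > prev->bytes_per_ms ?
                          CMP_BETTER : CMP_WORSE;

          if (is_significant(curr->packets_per_ms, prev->packets_per_ms))
                  return curr->packets_per_ms > prev->packets_per_ms ?
                          CMP_BETTER : CMP_WORSE;

          /* For interrupts, lower is better. */
          if (is_significant(curr->events_per_ms, prev->events_per_ms))
                  return curr->events_per_ms < prev->events_per_ms ?
                          CMP_BETTER : CMP_WORSE;

          return CMP_SAME;
  }

Note that the interrupt rate is only consulted when neither bandwidth nor
packet rate changed significantly, which matches the priority given to
bandwidth in the text.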
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) Step #3 decides on the suggested configuration based on the result from step #2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) and the internal state of the algorithm. The states reflect the "direction" of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) the algorithm: whether it is going left (reducing moderation), right (increasing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) moderation) or standing still. Another optimization is that if a decision
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) to stay still is made multiple times, the interval between iterations of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) algorithm increases in order to reduce calculation overhead. Also, after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) "parking" on the leftmost or rightmost decision, the algorithm may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) decide to verify this decision by taking a step in the other direction. This is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) done in order to avoid getting stuck in a "deep sleep" scenario. Once a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) decision is made, an interrupt moderation configuration is selected from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) the predefined profiles.
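
The idea of a "direction" over an ordered set of profiles can be illustrated
with a deliberately simplified walk. Everything here is hypothetical: the
names, the table size of five profiles and the exact transition rules are
assumptions made for the sketch. It omits the growing interval between
iterations and the verification step after parking, and it is not the
kernel's state machine.

.. code-block:: c

  /* Hypothetical direction states and comparison results. */
  enum direction { GOING_LEFT, GOING_RIGHT, PARKED };
  enum step_result { RESULT_SAME, RESULT_BETTER, RESULT_WORSE };

  /* Assumed size of the ordered profile table (index 0 = least moderation). */
  #define NUM_PROFILES 5

  struct walker {
          int profile_ix;
          enum direction dir;
  };

  static void walker_step(struct walker *w, enum step_result res)
  {
          int next;

          if (w->dir == PARKED) {
                  if (res == RESULT_SAME)
                          return;         /* nothing changed: stay parked */
                  /* Traffic changed: resume searching. */
                  w->dir = (res == RESULT_BETTER) ? GOING_RIGHT : GOING_LEFT;
          } else if (res == RESULT_SAME) {
                  w->dir = PARKED;        /* last step made no difference */
                  return;
          } else if (res == RESULT_WORSE) {
                  /* Last step hurt: turn around. */
                  w->dir = (w->dir == GOING_LEFT) ? GOING_RIGHT : GOING_LEFT;
          }

          next = w->profile_ix + (w->dir == GOING_RIGHT ? 1 : -1);
          if (next < 0 || next >= NUM_PROFILES) {
                  w->dir = PARKED;        /* reached the edge of the table */
                  return;
          }
          w->profile_ix = next;
  }

A caller would feed the result of each comparison into ``walker_step()`` and
then apply the profile found at ``profile_ix``.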
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) The last step is to notify the registered driver that it should apply the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) suggested configuration. This is done by scheduling a work function, defined by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) the Net DIM API and provided by the registered driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) As described above, Net DIM itself does not actively interact with the system. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) would have trouble making the correct decisions if the wrong data were supplied to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) it and it would be useless if the work function did not apply the suggested
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) configuration. This does, however, allow the registered driver some room for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) manoeuvre as it may provide partial data or ignore the algorithm's suggestion
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) under some conditions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) Registering a Network Device to DIM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) ===================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) The Net DIM API exposes the main function net_dim(). This function is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) entry point to the Net DIM algorithm and has to be called every time the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) driver would like to check if it should change interrupt moderation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) parameters. The driver should provide two data structures:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) :c:type:`struct dim <dim>` and :c:type:`struct dim_sample <dim_sample>`.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) :c:type:`struct dim <dim>` describes the state of DIM for a specific object
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) (RX queue, TX queue, other queues, etc.). This includes the currently
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) selected profile, previous data samples, the callback function provided by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) the driver and more.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) :c:type:`struct dim_sample <dim_sample>` describes a data sample,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) which will be compared to the data sample stored in :c:type:`struct dim <dim>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) in order to decide on the algorithm's next step. The sample should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) include the number of bytes, packets and interrupts, as measured by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) the driver.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) In order to use Net DIM from a networking driver, the driver needs to call the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) main net_dim() function. The recommended method is to call net_dim() on each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) interrupt. Since Net DIM has built-in moderation and might decide to skip
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) iterations under certain conditions, there is no need to moderate the net_dim()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) calls as well. As mentioned above, the driver needs to provide an object of type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) :c:type:`struct dim <dim>` to the net_dim() function call. Each entity using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) Net DIM is advised to hold a :c:type:`struct dim <dim>` as part of its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) data structure and use it as the main Net DIM API object.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The :c:type:`struct dim_sample <dim_sample>` should hold the latest
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) byte, packet and interrupt counts. There is no need to perform any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) calculations; just include the raw data.
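
To show why raw counters are sufficient, the following userspace sketch
derives per-millisecond rates from two raw samples and the time between them.
The struct layouts, field widths and the helper name ``calc_rates()`` are
assumptions made for the illustration and are not taken from the kernel
headers.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical raw sample: free-running counters plus a timestamp. */
  struct raw_sample {
          uint64_t time_us;
          uint32_t bytes;
          uint32_t packets;
          uint16_t events;
  };

  /* Hypothetical derived rates. */
  struct sample_rates {
          uint32_t bytes_per_ms;
          uint32_t packets_per_ms;
          uint32_t events_per_ms;
  };

  /* Returns 0 on success, -1 if no time has passed between the samples. */
  static int calc_rates(const struct raw_sample *start,
                        const struct raw_sample *end,
                        struct sample_rates *out)
  {
          uint64_t delta_us = end->time_us - start->time_us;
          /* Unsigned subtraction stays correct across counter wrap-around. */
          uint32_t nbytes = end->bytes - start->bytes;
          uint32_t npkts = end->packets - start->packets;
          uint16_t nevents = (uint16_t)(end->events - start->events);

          if (delta_us == 0)
                  return -1;

          out->bytes_per_ms = (uint32_t)((uint64_t)nbytes * 1000 / delta_us);
          out->packets_per_ms = (uint32_t)((uint64_t)npkts * 1000 / delta_us);
          out->events_per_ms = (uint32_t)((uint64_t)nevents * 1000 / delta_us);
          return 0;
  }

Because only differences between consecutive samples matter, the driver can
pass its running counters unchanged on every call.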
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) The net_dim() call itself does not return anything. Instead, Net DIM relies on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) the driver to provide a callback function, which is called when the algorithm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) decides to make a change in the interrupt moderation parameters. This callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) will be scheduled and run in a separate thread in order not to add overhead to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) the data flow. After the work is done, the Net DIM algorithm needs to be set to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) the proper state in order to move to the next iteration.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) Example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) =======
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) The following code demonstrates how to register a driver to Net DIM. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) example is not complete, but it should make the outline of the usage clear.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) .. code-block:: c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)   #include <linux/dim.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)   /* Callback for net DIM to schedule on a decision to change moderation */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)   void my_driver_do_dim_work(struct work_struct *work)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)   {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)           /* Get struct dim from struct work_struct */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)           struct dim *dim = container_of(work, struct dim,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)                                          work);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)           /* Do interrupt moderation related stuff */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)           /* Signal net DIM work is done and it should move to next iteration */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)           dim->state = DIM_START_MEASURE;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)   }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)   /* My driver's interrupt handler */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)   int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)   {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)           /* A struct to hold current measured data */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)           struct dim_sample dim_sample;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)           /* Initialize the data sample struct with the current data */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)           dim_update_sample(my_entity->events,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)                             my_entity->packets,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)                             my_entity->bytes,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)                             &dim_sample);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)           /* Call net DIM */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)           net_dim(&my_entity->dim, dim_sample);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)   }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)   /* My entity's initialization function (my_entity was already allocated) */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164)   int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)   {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)           /* Initialize struct work_struct with my driver's callback function */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)           INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)           ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)   }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) Dynamic Interrupt Moderation (DIM) library API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) ==============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) .. kernel-doc:: include/linux/dim.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)     :internal: