=====================================================
High resolution timers and dynamic ticks design notes
=====================================================

Further information can be found in the paper accompanying the OLS 2006 talk
"hrtimers and beyond". The paper is part of the OLS 2006 Proceedings Volume 1,
which can be found on the OLS website:
https://www.kernel.org/doc/ols/2006/ols2006v1-pages-333-346.pdf

The slides for this talk are available from:
http://www.cs.columbia.edu/~nahum/w6998/papers/ols2006-hrtimers-slides.pdf

The slides contain five figures (pages 2, 15, 18, 20, 22), which illustrate the
changes in the time(r) related Linux subsystems. Figure #1 (p. 2) shows the
design of the Linux time(r) system before hrtimers and other building blocks
got merged into mainline.

Note: the paper and the slides talk about "clock event source", while we have
since switched to the name "clock event devices".

The design contains the following basic building blocks:

- hrtimer base infrastructure
- timeofday and clock source management
- clock event management
- high resolution timer functionality
- dynamic ticks


hrtimer base infrastructure
---------------------------

The hrtimer base infrastructure was merged into the 2.6.16 kernel. Details of
the base implementation are covered in Documentation/timers/hrtimers.rst. See
also figure #2 (OLS slides p. 15).

The main differences from the timer wheel, which holds the armed timer_list
type timers, are:

       - time ordered enqueueing into an rb-tree
       - independent of ticks (the processing is based on nanoseconds)


timeofday and clock source management
-------------------------------------

John Stultz's Generic Time Of Day (GTOD) framework moves a large portion of
code out of the architecture-specific areas into a generic management
framework, as illustrated in figure #3 (OLS slides p. 18). The architecture
specific portion is reduced to the low level hardware details of the clock
sources, which are registered in the framework and selected on a quality based
decision. The low level code provides hardware setup and readout routines and
initializes data structures, which are used by the generic time keeping code to
convert the clock ticks to nanosecond based time values. All other time keeping
related functionality is moved into the generic code. The GTOD base patch got
merged into the 2.6.18 kernel.
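The conversion of clock ticks to nanoseconds and the quality based selection
described above can be illustrated with a small userspace sketch. It assumes
the common multiply-and-shift approach, which avoids a division on every
readout. All names (toy_clocksource, toy_calc_mult, toy_cyc2ns, toy_select)
are made up for this example and are not the kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a clock source description. */
struct toy_clocksource {
	uint64_t freq_hz;	/* counter frequency in Hz */
	uint32_t mult;		/* scaled nanoseconds-per-cycle factor */
	uint32_t shift;		/* scaling: ns = (cycles * mult) >> shift */
	int rating;		/* quality value used for selection */
};

/* Compute mult so that (cycles * mult) >> shift yields nanoseconds. */
uint32_t toy_calc_mult(uint64_t freq_hz, uint32_t shift)
{
	/* Round to nearest; assumes the result fits into 32 bits. */
	uint64_t tmp = (UINT64_C(1000000000) << shift) + freq_hz / 2;

	return (uint32_t)(tmp / freq_hz);
}

/* Convert a raw cycle count to nanoseconds without a division. */
uint64_t toy_cyc2ns(const struct toy_clocksource *cs, uint64_t cycles)
{
	return (cycles * cs->mult) >> cs->shift;
}

/* Pick the registered clock source with the highest rating. */
const struct toy_clocksource *
toy_select(const struct toy_clocksource *list, int n)
{
	const struct toy_clocksource *best = NULL;

	for (int i = 0; i < n; i++)
		if (!best || list[i].rating > best->rating)
			best = &list[i];
	return best;
}
```

In this sketch a 1 MHz counter with a shift of 20 gets a mult of 1048576000,
so five cycles convert to 5000 nanoseconds. The product of cycles and mult
can overflow 64 bits for large cycle counts, which the sketch does not
handle.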

Further information about the Generic Time Of Day framework is available in the
OLS 2005 Proceedings Volume 1:

	http://www.linuxsymposium.org/2005/linuxsymposium_procv1.pdf

The paper "We Are Not Getting Any Younger: A New Approach to Time and
Timers" was written by J. Stultz, D.V. Hart, & N. Aravamudan.

Figure #3 (OLS slides p. 18) illustrates the transformation.


clock event management
----------------------

While clock sources provide read access to the monotonically increasing time
value, clock event devices are used to schedule the next event
interrupt(s). The next event is currently defined to be periodic, with its
period defined at compile time. The setup and selection of the event device
for various event driven functionalities is hardwired into the architecture
dependent code. This results in duplicated code across all architectures and
makes it extremely difficult to change the configuration of the system to use
event interrupt devices other than those already built into the
architecture. Another implication of the current design is that it is necessary
to touch all the architecture-specific implementations in order to provide new
functionality like high resolution timers or dynamic ticks.

The clock events subsystem tries to address this problem by providing a generic
solution to manage clock event devices and their usage for the various clock
event driven kernel functionalities. The goal of the clock event subsystem is
to reduce the clock event related architecture dependent code to the pure
hardware related handling and to allow easy addition and utilization of new
clock event devices. It also minimizes the duplicated code across the
architectures as it provides generic functionality down to the interrupt
service handler, which is almost inherently hardware dependent.

Clock event devices are registered either by the architecture dependent boot
code or at module insertion time. Each clock event device fills a data
structure with clock-specific property parameters and callback functions. The
clock event management decides, by using the specified property parameters, the
set of system functions a clock event device will be used to support. This
includes the distinction of per-CPU and per-system global event devices.

System-level global event devices are used for the Linux periodic tick. Per-CPU
event devices are used to provide local CPU functionality such as process
accounting, profiling, and high resolution timers.

The management layer assigns one or more of the following functions to a clock
event device:

      - system global periodic tick (jiffies update)
      - CPU local update_process_times
      - CPU local profiling
      - CPU local next event interrupt (non periodic mode)

The clock event device delegates the selection of those timer interrupt related
functions completely to the management layer. The management layer stores a
function pointer in the device description structure, which has to be called
from the hardware level handler. This removes a lot of duplicated code from the
architecture specific timer interrupt handlers and hands the control over the
clock event devices and the assignment of timer interrupt related functionality
to the core code.
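The division of labour described above can be sketched in userspace C. The
hardware specific code fills in a device description, the management layer
stores a handler in it, and the hardware level interrupt handler only calls
that pointer. The structure layout, the feature flags and all toy_* names
are assumptions made for this illustration and do not reproduce the kernel's
actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_FEAT_PERIODIC	0x1	/* device can fire periodically */
#define TOY_FEAT_ONESHOT	0x2	/* device can be programmed per event */

/* Hypothetical device description filled in by hardware specific code. */
struct toy_clock_event_device {
	const char *name;
	unsigned int features;
	/* Hardware callback: program the next interrupt, delta in ns. */
	int (*set_next_event)(uint64_t delta_ns,
			      struct toy_clock_event_device *dev);
	/* Stored by the management layer, called from the interrupt. */
	void (*event_handler)(struct toy_clock_event_device *dev);
};

unsigned long toy_jiffies;

/* Core handler the management layer assigns for the periodic tick. */
void toy_periodic_handler(struct toy_clock_event_device *dev)
{
	(void)dev;
	toy_jiffies++;
}

/* Management layer: decide from the properties what the device does. */
void toy_register_device(struct toy_clock_event_device *dev)
{
	if (dev->features & TOY_FEAT_PERIODIC)
		dev->event_handler = toy_periodic_handler;
}

/* Hardware level interrupt handler: it only calls the stored pointer. */
void toy_timer_interrupt(struct toy_clock_event_device *dev)
{
	if (dev->event_handler)
		dev->event_handler(dev);
}
```

Because the interrupt handler never hardcodes what a tick does, the core
code could later swap toy_periodic_handler for a different handler without
any change to the hardware specific part.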

The clock event layer API is rather small. Aside from the clock event device
registration interface, it provides functions to schedule the next event
interrupt, a clock event device notification service, and support for suspend
and resume.

The framework adds about 700 lines of code which results in a 2KB increase of
the kernel binary size. The conversion of i386 removes about 100 lines of
code. The binary size decrease is in the range of 400 bytes. We believe that
the increase of flexibility and the avoidance of duplicated code across
architectures justifies the slight increase of the binary size.

The conversion of an architecture has no functional impact, but allows the
high resolution and dynamic tick functionalities to be used without any change
to the clock event device and timer interrupt code. After the conversion,
high resolution timers and dynamic ticks are enabled simply by adding the
kernel/time/Kconfig file to the architecture specific Kconfig and adding the
dynamic tick specific calls to the idle routine (a total of 3 lines added to
the idle function and the Kconfig file).

Figure #4 (OLS slides p. 20) illustrates the transformation.


high resolution timer functionality
-----------------------------------

During system boot it is not possible to use the high resolution timer
functionality; making it possible would be difficult and would serve no
useful purpose. The clock event device framework, the clock source framework
(GTOD) and hrtimers themselves have to be initialized, and appropriate clock
sources and clock event devices have to be registered, before the high
resolution functionality can work. Up to the point where hrtimers are
initialized, the system works in the usual low resolution periodic mode. The
clock source and the clock event device layers provide notification functions
which inform hrtimers about availability of new hardware. hrtimers validates
the usability of the registered clock sources and clock event devices before
switching to high resolution mode. This also ensures that a kernel which is
configured for high resolution timers can run on a system which lacks the
necessary hardware support.

The high resolution timer code does not support SMP machines which have only
global clock event devices. The support of such hardware would involve IPI
calls when an interrupt happens. The overhead would be much larger than the
benefit. This is the reason why we currently disable high resolution and
dynamic ticks on i386 SMP systems which stop the local APIC in C3 power
state. A workaround is available as an idea, but the problem has not been
tackled yet.

The time ordered insertion of timers provides all the infrastructure to decide
whether the event device has to be reprogrammed when a timer is added. The
decision is made per timer base and synchronized across per-CPU timer bases in
a support function. The design allows the system to utilize separate per-CPU
clock event devices for the per-CPU timer bases, but currently only one
reprogrammable clock event device per CPU is utilized.
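The reprogramming decision falls out of the time ordered insertion: the event
device only needs a new expiry when the inserted timer ends up first in the
queue. The following userspace sketch shows this under simplifying
assumptions. A sorted singly linked list stands in for the rb-tree, there is
only one timer base, and the toy_* names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_timer {
	uint64_t expires_ns;
	struct toy_timer *next;
};

/* One timer base; a sorted list stands in for the rb-tree. */
struct toy_timer_base {
	struct toy_timer *first;	/* earliest expiring timer */
	uint64_t programmed_ns;		/* expiry the event device is armed for */
};

/*
 * Insert in expiry order. Returns true when the new timer became the
 * earliest one, i.e. when the clock event device must be reprogrammed.
 */
bool toy_enqueue(struct toy_timer_base *base, struct toy_timer *timer)
{
	struct toy_timer **link = &base->first;

	/* Skip all timers which expire before or together with the new one. */
	while (*link && (*link)->expires_ns <= timer->expires_ns)
		link = &(*link)->next;

	timer->next = *link;
	*link = timer;

	if (base->first != timer)
		return false;	/* an earlier timer is already programmed */

	base->programmed_ns = timer->expires_ns;
	return true;
}
```

Only the third of the timers 5000, 9000 and 1000 (added in that order to a
base that already holds none) displaces an existing head; the first one also
returns true because the base was empty.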

When the timer interrupt happens, the next event interrupt handler is called
from the clock event distribution code and moves expired timers from the
red-black tree to a separate doubly linked list and invokes the softirq
handler. An additional mode field in the hrtimer structure allows the system to
execute callback functions directly from the next event interrupt handler. This
is restricted to code which can safely be executed in the hard interrupt
context. This applies, for example, to the common case of a wakeup function as
used by nanosleep. The advantage of executing the handler in the interrupt
context is the avoidance of up to two context switches - from the interrupted
context to the softirq and to the task which is woken up by the expired
timer.
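The expiry path described above can be modelled in a few lines of userspace
C. This is only a sketch: the queue is a sorted singly linked list instead of
a red-black tree, the softirq is an ordinary function call, and every toy_*
name, including the mode values, is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_cb_mode { TOY_CB_SOFTIRQ, TOY_CB_HARDIRQ };

struct toy_hrtimer {
	uint64_t expires_ns;
	enum toy_cb_mode mode;		/* where the callback may run */
	void (*function)(struct toy_hrtimer *t);
	struct toy_hrtimer *next;
};

struct toy_base {
	struct toy_hrtimer *queue;	/* sorted by expiry, earliest first */
	struct toy_hrtimer *pending;	/* expired, waiting for the softirq */
};

int toy_fired;

/* Example callback which only counts its invocations. */
void toy_count_cb(struct toy_hrtimer *t)
{
	(void)t;
	toy_fired++;
}

/* Simulated next event interrupt handler. */
void toy_hrtimer_interrupt(struct toy_base *base, uint64_t now_ns)
{
	struct toy_hrtimer **tail = &base->pending;

	while (*tail)
		tail = &(*tail)->next;

	while (base->queue && base->queue->expires_ns <= now_ns) {
		struct toy_hrtimer *t = base->queue;

		base->queue = t->next;
		t->next = NULL;

		if (t->mode == TOY_CB_HARDIRQ) {
			t->function(t);	/* safe in hard interrupt context */
		} else {
			*tail = t;	/* defer the callback to the softirq */
			tail = &t->next;
		}
	}
}

/* Simulated softirq: run the deferred callbacks in expiry order. */
void toy_run_softirq(struct toy_base *base)
{
	while (base->pending) {
		struct toy_hrtimer *t = base->pending;

		base->pending = t->next;
		t->next = NULL;
		t->function(t);
	}
}
```

A timer marked TOY_CB_HARDIRQ has fired by the time toy_hrtimer_interrupt()
returns, while a TOY_CB_SOFTIRQ timer fires only after toy_run_softirq() has
run.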

Once a system has switched to high resolution mode, the periodic tick is
switched off. This disables the per system global periodic clock event device -
e.g. the PIT on i386 SMP systems.

The periodic tick functionality is provided by a per-CPU hrtimer. The callback
function is executed in the next event interrupt context and updates jiffies
and calls update_process_times and profiling. The implementation of the hrtimer
based periodic tick is designed to be extended with dynamic tick functionality.
This allows a single clock event device to schedule both high resolution
timer and periodic events (jiffies tick, profiling, process accounting) on UP
systems. This has been proven to work with the PIT on i386 and the Incrementer
on PPC.
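A timer based periodic tick boils down to a callback that accounts the
elapsed periods and re-arms itself at the next period boundary. The sketch
below shows that arithmetic in userspace C. The 4 ms period (a 250 Hz tick)
and the toy_* names are assumptions chosen for the example.

```c
#include <stdint.h>

#define TOY_TICK_NS	UINT64_C(4000000)	/* assumed 250 Hz tick */

struct toy_tick {
	uint64_t expires_ns;	/* next tick expiry */
	uint64_t jiffies;	/* ticks accounted so far */
};

/*
 * Callback of the tick emulating timer: account every period which has
 * elapsed and move the expiry to the first period boundary after now_ns.
 * Returns the number of periods accounted.
 */
uint64_t toy_tick_handler(struct toy_tick *tick, uint64_t now_ns)
{
	uint64_t periods;

	if (now_ns < tick->expires_ns)
		return 0;	/* fired early, nothing to account */

	periods = (now_ns - tick->expires_ns) / TOY_TICK_NS + 1;
	tick->expires_ns += periods * TOY_TICK_NS;
	tick->jiffies += periods;
	return periods;
}
```

Accounting whole periods instead of adding one tick per call keeps jiffies
correct even when the callback runs late, which is the property the dynamic
tick extension relies on.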

The softirq for running the hrtimer queues and executing the callbacks has been
separated from the tick bound timer softirq to allow accurate delivery of high
resolution timer signals which are used by itimer and POSIX interval
timers. The execution of this softirq can still be delayed by other softirqs,
but the overall latencies have been significantly improved by this separation.

Figure #5 (OLS slides p. 22) illustrates the transformation.


dynamic ticks
-------------

Dynamic ticks are the logical consequence of the hrtimer based periodic tick
replacement (sched_tick). The functionality of the sched_tick hrtimer is
extended by three functions:

- hrtimer_stop_sched_tick
- hrtimer_restart_sched_tick
- hrtimer_update_jiffies

hrtimer_stop_sched_tick() is called when a CPU goes into idle state. The code
evaluates the next scheduled timer event (from both hrtimers and the timer
wheel) and, if the next event is further away than the next tick, it
reprograms the sched_tick to this future event, to allow longer idle sleeps
without needless interruptions by the periodic tick. The function is also
called when an interrupt that does not cause a reschedule happens during the
idle period. The call is necessary as the interrupt handler might have armed a
new timer whose expiry time is before the time which was identified as the
nearest event in the previous call to hrtimer_stop_sched_tick().
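The decision taken on idle entry can be sketched as a comparison between the
next periodic tick and the earliest pending timer event. The userspace
example below is a simplified illustration only. The toy_* names, the
explicit parameters and the state structure are assumptions and do not match
the kernel function's real signature.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_idle_state {
	bool tick_stopped;	/* periodic tick currently suppressed? */
	uint64_t programmed_ns;	/* expiry the event device is armed for */
};

/*
 * Called on idle entry and after an idle interrupt which did not cause
 * a reschedule. Returns true if the periodic tick is skipped.
 */
bool toy_stop_sched_tick(struct toy_idle_state *st, uint64_t next_tick_ns,
			 uint64_t next_hrtimer_ns, uint64_t next_wheel_ns)
{
	/* Earliest event from both hrtimers and the timer wheel. */
	uint64_t next_event = next_hrtimer_ns < next_wheel_ns ?
			      next_hrtimer_ns : next_wheel_ns;

	if (next_event <= next_tick_ns) {
		/* Something is due before the next tick: keep ticking. */
		st->tick_stopped = false;
		st->programmed_ns = next_tick_ns;
		return false;
	}

	/* Sleep through the ticks until the next real event. */
	st->tick_stopped = true;
	st->programmed_ns = next_event;
	return true;
}
```

Calling the function again after an idle interrupt matters because a newly
armed timer may now be the earliest event, in which case the device is
reprogrammed to the earlier expiry.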

hrtimer_restart_sched_tick() is called when the CPU leaves the idle state before
it calls schedule(). hrtimer_restart_sched_tick() resumes the periodic tick,
which is kept active until the next call to hrtimer_stop_sched_tick().

hrtimer_update_jiffies() is called from irq_enter() when an interrupt happens
in the idle period to make sure that jiffies are up to date and the interrupt
handler does not have to deal with a possibly stale jiffies value.

The dynamic tick feature provides statistical values which are exported to
userspace via /proc/stat and can be made available for enhanced power
management control.

The implementation leaves room for further development like full tickless
systems, where the time slice is controlled by the scheduler, variable
frequency profiling, and a complete removal of jiffies in the future.


Aside from the current initial submission of i386 support, the patchset has
already been extended to x86_64 and ARM. Initial (work in progress) support is
also available for MIPS and PowerPC.

	  Thomas, Ingo