.. SPDX-License-Identifier: GPL-2.0

================================
Review Checklist for RCU Patches
================================


This document contains a checklist for producing and reviewing patches
that make use of RCU.  Violating any of the rules listed below will
result in the same sorts of problems that leaving out a locking primitive
would cause.  This list is based on experiences reviewing such patches
over a rather long period of time, but improvements are always welcome!

0.  Is RCU being applied to a read-mostly situation?  If the data
    structure is updated more than about 10% of the time, then you
    should strongly consider some other approach, unless detailed
    performance measurements show that RCU is nonetheless the right
    tool for the job.  Yes, RCU does reduce read-side overhead by
    increasing write-side overhead, which is exactly why normal uses
    of RCU will do much more reading than updating.

    Another exception is where performance is not an issue, and RCU
    provides a simpler implementation.  An example of this situation
    is the dynamic NMI code in the Linux 2.6 kernel, at least on
    architectures where NMIs are rare.

    Yet another exception is where the low real-time latency of RCU's
    read-side primitives is critically important.

    One final exception is where RCU readers are used to prevent
    the ABA problem (https://en.wikipedia.org/wiki/ABA_problem)
    for lockless updates.  This does result in the mildly
    counter-intuitive situation where rcu_read_lock() and
    rcu_read_unlock() are used to protect updates; however, this
    approach provides the same potential simplifications that garbage
    collectors do.

1.  Does the update code have proper mutual exclusion?

    RCU does allow *readers* to run (almost) naked, but *writers* must
    still use some sort of mutual exclusion, such as:

    a.  locking,
    b.  atomic operations, or
    c.  restricting updates to a single task.

    If you choose #b, be prepared to describe how you have handled
    memory barriers on weakly ordered machines (pretty much all of
    them -- even x86 allows later loads to be reordered to precede
    earlier stores), and be prepared to explain why this added
    complexity is worthwhile.  If you choose #c, be prepared to
    explain how this single task does not become a major bottleneck on
    large multiprocessor machines (for example, if the task is updating
    information relating to itself that other tasks can read, there
    by definition can be no bottleneck).  Note that the definition
    of "large" has changed significantly:  Eight CPUs was "large"
    in the year 2000, but a hundred CPUs was unremarkable in 2017.
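
    For example, a minimal sketch of option (a), using a hypothetical
    RCU-protected pointer and spinlock (all names here are
    illustrative, not from any particular subsystem)::

        struct foo {
            int a;
        };

        static struct foo __rcu *global_foo;
        static DEFINE_SPINLOCK(foo_lock);

        /* Updater: foo_lock serializes concurrent updates. */
        void foo_update_a(int new_a)
        {
            struct foo *new_fp, *old_fp;

            new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
            if (!new_fp)
                return;
            new_fp->a = new_a;

            spin_lock(&foo_lock);
            old_fp = rcu_dereference_protected(global_foo,
                            lockdep_is_held(&foo_lock));
            rcu_assign_pointer(global_foo, new_fp);
            spin_unlock(&foo_lock);

            synchronize_rcu();  /* Wait for pre-existing readers. */
            kfree(old_fp);
        }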

2.  Do the RCU read-side critical sections make proper use of
    rcu_read_lock() and friends?  These primitives are needed
    to prevent grace periods from ending prematurely, which
    could result in data being unceremoniously freed out from
    under your read-side code, which can greatly increase the
    actuarial risk of your kernel.

    As a rough rule of thumb, any dereference of an RCU-protected
    pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
    rcu_read_lock_sched(), or by the appropriate update-side lock.
    Disabling of preemption can serve as rcu_read_lock_sched(), but
    is less readable and prevents lockdep from detecting locking issues.

    Letting RCU-protected pointers "leak" out of an RCU read-side
    critical section is every bit as bad as letting them leak out
    from under a lock.  Unless, of course, you have arranged some
    other means of protection, such as a lock or a reference count,
    *before* letting them out of the RCU read-side critical section.
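
    For example, a reader might look like this (global_foo is a
    hypothetical RCU-protected pointer to a structure with an
    integer field ``a``)::

        int foo_get_a(void)
        {
            struct foo *fp;
            int ret = -1;

            rcu_read_lock();
            fp = rcu_dereference(global_foo);
            if (fp)
                ret = fp->a;    /* Use fp only inside the critical section. */
            rcu_read_unlock();
            return ret;         /* fp itself must not leak past this point. */
        }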

3.  Does the update code tolerate concurrent accesses?

    The whole point of RCU is to permit readers to run without
    any locks or atomic operations.  This means that readers will
    be running while updates are in progress.  There are a number
    of ways to handle this concurrency, depending on the situation:

    a.  Use the RCU variants of the list and hlist update
        primitives to add, remove, and replace elements on
        an RCU-protected list.  Alternatively, use the other
        RCU-protected data structures that have been added to
        the Linux kernel.

        This is almost always the best approach.

    b.  Proceed as in (a) above, but also maintain per-element
        locks (that are acquired by both readers and writers)
        that guard per-element state.  Of course, fields that
        the readers refrain from accessing can be guarded by
        some other lock acquired only by updaters, if desired.

        This also works quite well.

    c.  Make updates appear atomic to readers.  For example,
        pointer updates to properly aligned fields will
        appear atomic, as will individual atomic primitives.
        Sequences of operations performed under a lock will *not*
        appear to be atomic to RCU readers, nor will sequences
        of multiple atomic primitives.

        This can work, but is starting to get a bit tricky.

    d.  Carefully order the updates and the reads so that
        readers see valid data at all phases of the update.
        This is often more difficult than it sounds, especially
        given modern CPUs' tendency to reorder memory references.
        One must usually liberally sprinkle memory barriers
        (smp_wmb(), smp_rmb(), smp_mb()) through the code,
        making it difficult to understand and to test.

        It is usually better to group the changing data into
        a separate structure, so that the change may be made
        to appear atomic by updating a pointer to reference
        a new structure containing updated values.
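
    As an illustration of this last approach, suppose that two fields
    must always be seen in a consistent state.  A hedged sketch, with
    illustrative names throughout::

        struct pair {
            struct rcu_head rh;
            int x;
            int y;      /* Must always be consistent with ->x. */
        };

        static struct pair __rcu *cur_pair;
        static DEFINE_SPINLOCK(pair_lock);

        void pair_update(int x, int y)
        {
            struct pair *newp, *oldp;

            newp = kmalloc(sizeof(*newp), GFP_KERNEL);
            if (!newp)
                return;
            newp->x = x;
            newp->y = y;

            spin_lock(&pair_lock);
            oldp = rcu_dereference_protected(cur_pair,
                            lockdep_is_held(&pair_lock));
            rcu_assign_pointer(cur_pair, newp);
            spin_unlock(&pair_lock);

            if (oldp)
                kfree_rcu(oldp, rh);
        }

    Readers then see either the old pair or the new one, but never a
    mixture of the two.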

4.  Weakly ordered CPUs pose special challenges.  Almost all CPUs
    are weakly ordered -- even x86 CPUs allow later loads to be
    reordered to precede earlier stores.  RCU code must take all of
    the following measures to prevent memory-corruption problems:

    a.  Readers must maintain proper ordering of their memory
        accesses.  The rcu_dereference() primitive ensures that
        the CPU picks up the pointer before it picks up the data
        that the pointer points to.  This really is necessary
        on Alpha CPUs.  If you don't believe me, see:

            http://www.openvms.compaq.com/wizard/wiz_2637.html

        The rcu_dereference() primitive is also an excellent
        documentation aid, letting the person reading the
        code know exactly which pointers are protected by RCU.
        Please note that compilers can also reorder code, and
        they are becoming increasingly aggressive about doing
        just that.  The rcu_dereference() primitive therefore also
        prevents destructive compiler optimizations.  However,
        with a bit of devious creativity, it is possible to
        mishandle the return value from rcu_dereference().
        Please see rcu_dereference.txt in this directory for
        more information.

        The rcu_dereference() primitive is used by the
        various "_rcu()" list-traversal primitives, such
        as list_for_each_entry_rcu().  Note that it is
        perfectly legal (if redundant) for update-side code to
        use rcu_dereference() and the "_rcu()" list-traversal
        primitives.  This is particularly useful in code that
        is common to readers and updaters.  However, lockdep
        will complain if you invoke rcu_dereference() outside
        of an RCU read-side critical section.  See lockdep.txt
        to learn what to do about this.

        Of course, neither rcu_dereference() nor the "_rcu()"
        list-traversal primitives can substitute for a good
        concurrency design coordinating among multiple updaters.

    b.  If the list macros are being used, the list_add_tail_rcu()
        and list_add_rcu() primitives must be used in order
        to prevent weakly ordered machines from misordering
        structure initialization and pointer planting.
        Similarly, if the hlist macros are being used, the
        hlist_add_head_rcu() primitive is required.

    c.  If the list macros are being used, the list_del_rcu()
        primitive must be used to keep list_del()'s pointer
        poisoning from inflicting toxic effects on concurrent
        readers.  Similarly, if the hlist macros are being used,
        the hlist_del_rcu() primitive is required.

        The list_replace_rcu() and hlist_replace_rcu() primitives
        may be used to replace an old structure with a new one
        in their respective types of RCU-protected lists.

    d.  Rules similar to (4b) and (4c) apply to the "hlist_nulls"
        type of RCU-protected linked lists.

    e.  Updates must ensure that initialization of a given
        structure happens before pointers to that structure are
        publicized.  Use the rcu_assign_pointer() primitive
        when publicizing a pointer to a structure that can
        be traversed by an RCU read-side critical section.
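
    For example, adding to and removing from a hypothetical
    RCU-protected list might look as follows (the structure and
    function names are illustrative)::

        struct entry {
            struct list_head list;
            struct rcu_head rh;
            int key;
        };

        static LIST_HEAD(mylist);
        static DEFINE_SPINLOCK(mylist_lock);

        void entry_add(struct entry *e)     /* e fully initialized by caller. */
        {
            spin_lock(&mylist_lock);
            list_add_rcu(&e->list, &mylist);    /* Orders init before publication. */
            spin_unlock(&mylist_lock);
        }

        void entry_del(struct entry *e)
        {
            spin_lock(&mylist_lock);
            list_del_rcu(&e->list);     /* No pointer poisoning visible to readers. */
            spin_unlock(&mylist_lock);
            kfree_rcu(e, rh);           /* Free only after a grace period. */
        }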

5.  If call_rcu() or call_srcu() is used, the callback function will
    be called from softirq context.  In particular, it cannot block.
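
    For example (a sketch; the structure and callback names are
    illustrative)::

        struct foo {
            struct rcu_head rcu;
            /* ... */
        };

        static void foo_reclaim(struct rcu_head *head)
        {
            struct foo *fp = container_of(head, struct foo, rcu);

            /* Softirq context: no sleeping, no GFP_KERNEL allocations. */
            kfree(fp);
        }

        /* Updater, after unlinking fp from all RCU-visible paths: */
        call_rcu(&fp->rcu, foo_reclaim);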

6.  Since synchronize_rcu() can block, it cannot be called
    from any sort of irq context.  The same rule applies
    for synchronize_srcu(), synchronize_rcu_expedited(), and
    synchronize_srcu_expedited().

    The expedited forms of these primitives have the same semantics
    as the non-expedited forms, but expediting is both expensive and
    (with the exception of synchronize_srcu_expedited()) unfriendly
    to real-time workloads.  Use of the expedited primitives should
    be restricted to rare configuration-change operations that would
    not normally be undertaken while a real-time workload is running.
    However, real-time workloads can use the rcupdate.rcu_normal kernel
    boot parameter to completely disable expedited grace periods,
    though this might have performance implications.

    In particular, if you find yourself invoking one of the expedited
    primitives repeatedly in a loop, please do everyone a favor:
    Restructure your code so that it batches the updates, allowing
    a single non-expedited primitive to cover the entire batch.
    This will very likely be faster than the loop containing the
    expedited primitive, and will be much, much easier on the rest
    of the system, especially on real-time workloads running on
    the rest of the system.
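
    For example, given hypothetical remove_element() and
    free_element() helpers and a private list of doomed elements,
    instead of one expedited grace period per element::

        list_for_each_entry(e, &doomed, list) {
            remove_element(e);              /* Unlink from RCU-visible state. */
            synchronize_rcu_expedited();    /* One grace period per element! */
            free_element(e);
        }

    batch the removals under a single non-expedited grace period::

        list_for_each_entry(e, &doomed, list)
            remove_element(e);
        synchronize_rcu();                  /* One grace period for the batch. */
        list_for_each_entry_safe(e, tmp, &doomed, list)
            free_element(e);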

7.  As of v4.20, a given kernel implements only one RCU flavor,
    which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
    If the updater uses call_rcu() or synchronize_rcu(),
    then the corresponding readers may use rcu_read_lock() and
    rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
    or any pair of primitives that disables and re-enables preemption,
    for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
    If the updater uses synchronize_srcu() or call_srcu(),
    then the corresponding readers must use srcu_read_lock() and
    srcu_read_unlock(), and with the same srcu_struct.  The rules for
    the expedited primitives are the same as for their non-expedited
    counterparts.  Mixing things up will result in confusion and
    broken kernels, and has even resulted in an exploitable security
    issue.

    One exception to this rule: rcu_read_lock() and rcu_read_unlock()
    may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
    in cases where local bottom halves are already known to be
    disabled, for example, in irq or softirq context.  Commenting
    such cases is a must, of course!  And the jury is still out on
    whether the increased speed is worth it.

8.  Although synchronize_rcu() is slower than is call_rcu(), it
    usually results in simpler code.  So, unless update performance is
    critically important, the updaters cannot block, or the latency of
    synchronize_rcu() is visible from userspace, synchronize_rcu()
    should be used in preference to call_rcu().  Furthermore,
    kfree_rcu() usually results in even simpler code than does
    synchronize_rcu() without synchronize_rcu()'s multi-millisecond
    latency.  So please take advantage of kfree_rcu()'s "fire and
    forget" memory-freeing capabilities where it applies.
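
    For example, assuming a structure containing a struct rcu_head
    field named ``rh``, the blocking and fire-and-forget forms of the
    same removal look like this::

        /* Simple, but blocks for a full grace period: */
        list_del_rcu(&p->list);
        synchronize_rcu();
        kfree(p);

        /* Fire and forget, no blocking: */
        list_del_rcu(&p->list);
        kfree_rcu(p, rh);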

    An especially important property of the synchronize_rcu()
    primitive is that it automatically self-limits: if grace periods
    are delayed for whatever reason, then the synchronize_rcu()
    primitive will correspondingly delay updates.  In contrast,
    code using call_rcu() should explicitly limit update rate in
    cases where grace periods are delayed, as failing to do so can
    result in excessive realtime latencies or even OOM conditions.

    Ways of gaining this self-limiting property when using call_rcu()
    include:

    a.  Keeping a count of the number of data-structure elements
        used by the RCU-protected data structure, including
        those waiting for a grace period to elapse.  Enforce a
        limit on this number, stalling updates as needed to allow
        previously deferred frees to complete.  Alternatively,
        limit only the number awaiting deferred free rather than
        the total number of elements.

        One way to stall the updates is to acquire the update-side
        mutex.  (Don't try this with a spinlock -- other CPUs
        spinning on the lock could prevent the grace period
        from ever ending.)  Another way to stall the updates
        is for the updates to use a wrapper function around
        the memory allocator, so that this wrapper function
        simulates OOM when there is too much memory awaiting an
        RCU grace period.  There are of course many other
        variations on this theme.

    b.  Limiting update rate.  For example, if updates occur only
        once per hour, then no explicit rate limiting is
        required, unless your system is already badly broken.
        Older versions of the dcache subsystem take this approach,
        guarding updates with a global lock, limiting their rate.

    c.  Trusted update -- if updates can only be done manually by
        superuser or some other trusted user, then it might not
        be necessary to automatically limit them.  The theory
        here is that superuser already has lots of ways to crash
        the machine.

    d.  Periodically invoke synchronize_rcu(), permitting a limited
        number of updates per grace period.
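
    A crude sketch combining (a) and (d), with an illustrative limit
    and illustrative names throughout (callable only from contexts
    that may block)::

        static atomic_t nr_deferred;    /* Elements awaiting a grace period. */
        #define NR_DEFERRED_MAX 1000    /* Illustrative; tune for your workload. */

        static void entry_reclaim(struct rcu_head *rh)
        {
            atomic_dec(&nr_deferred);
            kfree(container_of(rh, struct entry, rh));
        }

        void entry_free_limited(struct entry *e)
        {
            if (atomic_inc_return(&nr_deferred) > NR_DEFERRED_MAX)
                synchronize_rcu();      /* Stall this update until
                                         * deferred frees complete. */
            call_rcu(&e->rh, entry_reclaim);
        }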

    The same cautions apply to call_srcu() and kfree_rcu().

    Note that although these primitives do take action to avoid memory
    exhaustion when any given CPU has too many callbacks, a determined
    user could still exhaust memory.  This is especially the case
    if a system with a large number of CPUs has been configured to
    offload all of its RCU callbacks onto a single CPU, or if the
    system has relatively little free memory.

9.  All RCU list-traversal primitives, which include
    rcu_dereference(), list_for_each_entry_rcu(), and
    list_for_each_safe_rcu(), must be either within an RCU read-side
    critical section or must be protected by appropriate update-side
    locks.  RCU read-side critical sections are delimited by
    rcu_read_lock() and rcu_read_unlock(), or by similar primitives
    such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which
    case the matching rcu_dereference() primitive must be used in
    order to keep lockdep happy, in this case, rcu_dereference_bh().

    The reason that it is permissible to use RCU list-traversal
    primitives when the update-side lock is held is that doing so
    can be quite helpful in reducing code bloat when common code is
    shared between readers and updaters.  Additional primitives
    are provided for this case, as discussed in lockdep.txt.
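
    For example, given a hypothetical pointer global_foo protected by
    RCU on the read side and by foo_lock on the update side::

        /* Code shared by readers and updaters: */
        fp = rcu_dereference_check(global_foo,
                                   lockdep_is_held(&foo_lock));

        /* Update-side-only code, never called by readers: */
        fp = rcu_dereference_protected(global_foo,
                                       lockdep_is_held(&foo_lock));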

10. Conversely, if you are in an RCU read-side critical section,
    and you don't hold the appropriate update-side lock, you *must*
    use the "_rcu()" variants of the list macros.  Failing to do so
    will break Alpha, cause aggressive compilers to generate bad code,
    and confuse people trying to read your code.

11. Any lock acquired by an RCU callback must be acquired elsewhere
    with softirq disabled, e.g., via spin_lock_irqsave(),
    spin_lock_bh(), etc.  Failing to disable softirq on a given
    acquisition of that lock will result in deadlock as soon as
    the RCU softirq handler happens to run your RCU callback while
    interrupting that acquisition's critical section.
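
    For example, if a hypothetical lock is shared between process
    context and an RCU callback::

        static DEFINE_SPINLOCK(shared_lock);

        void process_context_work(void)
        {
            spin_lock_bh(&shared_lock);     /* Plain spin_lock() here could
                                             * deadlock against the callback. */
            /* ... */
            spin_unlock_bh(&shared_lock);
        }

        static void my_rcu_callback(struct rcu_head *rh)
        {
            spin_lock(&shared_lock);        /* Already in softirq context. */
            /* ... */
            spin_unlock(&shared_lock);
        }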

12. RCU callbacks can be and are executed in parallel.  In many cases,
    the callback code is simply a wrapper around kfree(), so that this
    is not an issue (or, more accurately, to the extent that it is
    an issue, the memory-allocator locking handles it).  However,
    if the callbacks do manipulate a shared data structure, they
    must use whatever locking or other synchronization is required
    to safely access and/or modify that data structure.

    Do not assume that RCU callbacks will be executed on the same
    CPU that executed the corresponding call_rcu() or call_srcu().
    For example, if a given CPU goes offline while having an RCU
    callback pending, then that RCU callback will execute on some
    surviving CPU.  (If this was not the case, a self-spawning RCU
    callback would prevent the victim CPU from ever going offline.)
    Furthermore, CPUs designated by rcu_nocbs= might well *always*
    have their RCU callbacks executed on some other CPUs; in fact,
    for some real-time workloads, this is the whole point of using
    the rcu_nocbs= kernel boot parameter.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 13. Unlike other forms of RCU, it -is- permissible to block in an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) SRCU read-side critical section (demarked by srcu_read_lock()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) Please note that if you don't need to sleep in read-side critical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) sections, you should be using RCU rather than SRCU, because RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) is almost always faster and easier to use than is SRCU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) Also unlike other forms of RCU, explicit initialization and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) cleanup is required either at build time via DEFINE_SRCU()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) and cleanup_srcu_struct(). These last two are passed a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) "struct srcu_struct" that defines the scope of a given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) SRCU domain. Once initialized, the srcu_struct is passed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) synchronize_srcu_expedited(), and call_srcu(). A given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) synchronize_srcu() waits only for SRCU read-side critical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) sections governed by srcu_read_lock() and srcu_read_unlock()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) calls that have been passed the same srcu_struct. This property
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) is what makes sleeping read-side critical sections tolerable --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) a given subsystem delays only its own updates, not those of other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) subsystems using SRCU. Therefore, SRCU is less prone to OOM the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) system than RCU would be if RCU's read-side critical sections
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) were permitted to sleep.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) The ability to sleep in read-side critical sections does not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) come for free. First, corresponding srcu_read_lock() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) srcu_read_unlock() calls must be passed the same srcu_struct.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) Second, grace-period-detection overhead is amortized only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) over those updates sharing a given srcu_struct, rather than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) being globally amortized as they are for other forms of RCU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) Therefore, SRCU should be used in preference to rw_semaphore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) only in extremely read-intensive situations, or in situations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) requiring SRCU's read-side deadlock immunity or low read-side
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) realtime latency. You should also consider percpu_rw_semaphore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) when you need lightweight readers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) SRCU's expedited primitive (synchronize_srcu_expedited())
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386) never sends IPIs to other CPUs, so it is easier on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) real-time workloads than is synchronize_rcu_expedited().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) Note that rcu_assign_pointer() relates to SRCU just as it does to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390) other forms of RCU, but instead of rcu_dereference() you should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) use srcu_dereference() in order to avoid lockdep splats.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) 14. The whole point of call_rcu(), synchronize_rcu(), and friends
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) is to wait until all pre-existing readers have finished before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) carrying out some otherwise-destructive operation. It is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) therefore critically important to -first- remove any path
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) that readers can follow that could be affected by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) destructive operation, and -only- -then- invoke call_rcu(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) synchronize_rcu(), or friends.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) Because these primitives only wait for pre-existing readers, it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) is the caller's responsibility to guarantee that any subsequent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) readers will execute safely.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) 15. The various RCU read-side primitives do -not- necessarily contain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) memory barriers. You should therefore plan for the CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) and the compiler to freely reorder code into and out of RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) read-side critical sections. It is the responsibility of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) RCU update-side primitives to deal with this.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) For SRCU readers, you can use smp_mb__after_srcu_read_unlock()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) immediately after an srcu_read_unlock() to get a full barrier.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) 16. Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) __rcu sparse checks to validate your RCU code. These can help
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) find problems as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) CONFIG_PROVE_LOCKING:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) check that accesses to RCU-protected data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) structures are carried out under the proper RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) read-side critical section, while holding the right
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) combination of locks, or whatever other conditions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) are appropriate.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) CONFIG_DEBUG_OBJECTS_RCU_HEAD:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) check that you don't pass the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) same object to call_rcu() (or friends) before an RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) grace period has elapsed since the last time that you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) passed that same object to call_rcu() (or friends).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) __rcu sparse checks:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) tag the pointer to the RCU-protected data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433) structure with __rcu, and sparse will warn you if you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) access that pointer without the services of one of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435) variants of rcu_dereference().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) These debugging aids can help you find problems that are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) otherwise extremely difficult to spot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) 17. If you register a callback using call_rcu() or call_srcu(), and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) pass in a function defined within a loadable module, then it in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) necessary to wait for all pending callbacks to be invoked after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) the last invocation and before unloading that module. Note that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) it is absolutely -not- sufficient to wait for a grace period!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) The current (say) synchronize_rcu() implementation is -not-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) guaranteed to wait for callbacks registered on other CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) Or even on the current CPU if that CPU recently went offline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) and came back online.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450) You instead need to use one of the barrier functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452) - call_rcu() -> rcu_barrier()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) - call_srcu() -> srcu_barrier()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) However, these barrier functions are absolutely -not- guaranteed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) to wait for a grace period. In fact, if there are no call_rcu()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) callbacks waiting anywhere in the system, rcu_barrier() is within
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) its rights to return immediately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460) So if you need to wait for both an RCU grace period and for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) all pre-existing call_rcu() callbacks, you will need to execute
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462) both rcu_barrier() and synchronize_rcu(), if necessary, using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463) something like workqueues to to execute them concurrently.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465) See rcubarrier.txt for more information.