.. SPDX-License-Identifier: GPL-2.0

================================
Review Checklist for RCU Patches
================================


This document contains a checklist for producing and reviewing patches
that make use of RCU.  Violating any of the rules listed below will
result in the same sorts of problems that leaving out a locking primitive
would cause.  This list is based on experiences reviewing such patches
over a rather long period of time, but improvements are always welcome!

0.	Is RCU being applied to a read-mostly situation?  If the data
	structure is updated more than about 10% of the time, then you
	should strongly consider some other approach, unless detailed
	performance measurements show that RCU is nonetheless the right
	tool for the job.  Yes, RCU does reduce read-side overhead by
	increasing write-side overhead, which is exactly why normal uses
	of RCU will do much more reading than updating.

	Another exception is where performance is not an issue, and RCU
	provides a simpler implementation.  An example of this situation
	is the dynamic NMI code in the Linux 2.6 kernel, at least on
	architectures where NMIs are rare.

	Yet another exception is where the low real-time latency of RCU's
	read-side primitives is critically important.

	One final exception is where RCU readers are used to prevent
	the ABA problem (https://en.wikipedia.org/wiki/ABA_problem)
	for lockless updates.  This does result in the mildly
	counter-intuitive situation where rcu_read_lock() and
	rcu_read_unlock() are used to protect updates; however, this
	approach provides the same potential simplifications that garbage
	collectors do.

1.	Does the update code have proper mutual exclusion?

	RCU does allow -readers- to run (almost) naked, but -writers- must
	still use some sort of mutual exclusion, such as:

	a.	locking,
	b.	atomic operations, or
	c.	restricting updates to a single task.

	If you choose #b, be prepared to describe how you have handled
	memory barriers on weakly ordered machines (pretty much all of
	them -- even x86 allows later loads to be reordered to precede
	earlier stores), and be prepared to explain why this added
	complexity is worthwhile.  If you choose #c, be prepared to
	explain how this single task does not become a major bottleneck on
	big multiprocessor machines (for example, if the task is updating
	information relating to itself that other tasks can read, there
	by definition can be no bottleneck).  Note that the definition
	of "large" has changed significantly:  Eight CPUs was "large"
	in the year 2000, but a hundred CPUs was unremarkable in 2017.
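
	For example, here is a minimal sketch of option (a), in which
	a spinlock serializes updaters while readers run lockless (the
	my_lock, my_ptr, and struct foo names are hypothetical)::

		struct foo {
			int a;
		};

		static DEFINE_SPINLOCK(my_lock);
		static struct foo __rcu *my_ptr;

		/* Updaters serialize with each other via my_lock. */
		void update_foo(int new_a)
		{
			struct foo *newp = kmalloc(sizeof(*newp), GFP_KERNEL);
			struct foo *oldp;

			if (!newp)
				return;
			newp->a = new_a;
			spin_lock(&my_lock);
			oldp = rcu_dereference_protected(my_ptr,
					lockdep_is_held(&my_lock));
			rcu_assign_pointer(my_ptr, newp);
			spin_unlock(&my_lock);
			synchronize_rcu();	/* Wait for pre-existing readers. */
			kfree(oldp);
		}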

2.	Do the RCU read-side critical sections make proper use of
	rcu_read_lock() and friends?  These primitives are needed
	to prevent grace periods from ending prematurely, which
	could result in data being unceremoniously freed out from
	under your read-side code, which can greatly increase the
	actuarial risk of your kernel.

	As a rough rule of thumb, any dereference of an RCU-protected
	pointer must be covered by rcu_read_lock(), rcu_read_lock_bh(),
	rcu_read_lock_sched(), or by the appropriate update-side lock.
	Disabling of preemption can serve as rcu_read_lock_sched(), but
	is less readable and prevents lockdep from detecting locking issues.

	Letting RCU-protected pointers "leak" out of an RCU read-side
	critical section is every bit as bad as letting them leak out
	from under a lock.  Unless, of course, you have arranged some
	other means of protection, such as a lock or a reference count
	-before- letting them out of the RCU read-side critical section.
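
	As an illustration, here is a minimal read-side sketch reusing
	the hypothetical my_ptr from the earlier example: the pointer
	is dereferenced and used only inside the critical section, and
	only the needed value escapes::

		int read_foo_a(void)
		{
			struct foo *p;
			int a = -1;

			rcu_read_lock();
			p = rcu_dereference(my_ptr);
			if (p)
				a = p->a;	/* Use p only inside the critical section. */
			rcu_read_unlock();
			/* Using p here would be a bug. */
			return a;
		}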

3.	Does the update code tolerate concurrent accesses?

	The whole point of RCU is to permit readers to run without
	any locks or atomic operations.  This means that readers will
	be running while updates are in progress.  There are a number
	of ways to handle this concurrency, depending on the situation:

	a.	Use the RCU variants of the list and hlist update
		primitives to add, remove, and replace elements on
		an RCU-protected list.  Alternatively, use the other
		RCU-protected data structures that have been added to
		the Linux kernel.

		This is almost always the best approach.

	b.	Proceed as in (a) above, but also maintain per-element
		locks (that are acquired by both readers and writers)
		that guard per-element state.  Of course, fields that
		the readers refrain from accessing can be guarded by
		some other lock acquired only by updaters, if desired.

		This works quite well, also.

	c.	Make updates appear atomic to readers.  For example,
		pointer updates to properly aligned fields will
		appear atomic, as will individual atomic primitives.
		Sequences of operations performed under a lock will -not-
		appear to be atomic to RCU readers, nor will sequences
		of multiple atomic primitives.

		This can work, but is starting to get a bit tricky.

	d.	Carefully order the updates and the reads so that
		readers see valid data at all phases of the update.
		This is often more difficult than it sounds, especially
		given modern CPUs' tendency to reorder memory references.
		One must usually liberally sprinkle memory barriers
		(smp_wmb(), smp_rmb(), smp_mb()) through the code,
		making it difficult to understand and to test.

		It is usually better to group the changing data into
		a separate structure, so that the change may be made
		to appear atomic by updating a pointer to reference
		a new structure containing updated values.
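
	The following sketch illustrates this copy/update/replace
	approach with hypothetical names; grouping the two fields into
	one structure lets both change atomically from the readers'
	viewpoint, and, as rule 1 requires, an update-side mutex is
	assumed to serialize writers::

		struct config {
			int threshold;
			int timeout;
		};

		static struct config __rcu *cur_config;
		static DEFINE_MUTEX(config_mutex);

		int set_config(int threshold, int timeout)
		{
			struct config *newc, *oldc;

			newc = kmalloc(sizeof(*newc), GFP_KERNEL);
			if (!newc)
				return -ENOMEM;
			newc->threshold = threshold;
			newc->timeout = timeout;

			mutex_lock(&config_mutex);
			oldc = rcu_dereference_protected(cur_config,
					lockdep_is_held(&config_mutex));
			/* Readers see the old or the new config, never a mix. */
			rcu_assign_pointer(cur_config, newc);
			mutex_unlock(&config_mutex);

			synchronize_rcu();
			kfree(oldc);
			return 0;
		}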

4.	Weakly ordered CPUs pose special challenges.  Almost all CPUs
	are weakly ordered -- even x86 CPUs allow later loads to be
	reordered to precede earlier stores.  RCU code must take all of
	the following measures to prevent memory-corruption problems:

	a.	Readers must maintain proper ordering of their memory
		accesses.  The rcu_dereference() primitive ensures that
		the CPU picks up the pointer before it picks up the data
		that the pointer points to.  This really is necessary
		on Alpha CPUs.  If you don't believe me, see:

			http://www.openvms.compaq.com/wizard/wiz_2637.html

		The rcu_dereference() primitive is also an excellent
		documentation aid, letting the person reading the
		code know exactly which pointers are protected by RCU.
		Please note that compilers can also reorder code, and
		they are becoming increasingly aggressive about doing
		just that.  The rcu_dereference() primitive therefore also
		prevents destructive compiler optimizations.  However,
		with a bit of devious creativity, it is possible to
		mishandle the return value from rcu_dereference().
		Please see rcu_dereference.txt in this directory for
		more information.

		The rcu_dereference() primitive is used by the
		various "_rcu()" list-traversal primitives, such
		as the list_for_each_entry_rcu().  Note that it is
		perfectly legal (if redundant) for update-side code to
		use rcu_dereference() and the "_rcu()" list-traversal
		primitives.  This is particularly useful in code that
		is common to readers and updaters.  However, lockdep
		will complain if you use rcu_dereference() outside
		of an RCU read-side critical section.  See lockdep.txt
		to learn what to do about this.

		Of course, neither rcu_dereference() nor the "_rcu()"
		list-traversal primitives can substitute for a good
		concurrency design coordinating among multiple updaters.

	b.	If the list macros are being used, the list_add_tail_rcu()
		and list_add_rcu() primitives must be used in order
		to prevent weakly ordered machines from misordering
		structure initialization and pointer planting.
		Similarly, if the hlist macros are being used, the
		hlist_add_head_rcu() primitive is required.

	c.	If the list macros are being used, the list_del_rcu()
		primitive must be used to keep list_del()'s pointer
		poisoning from inflicting toxic effects on concurrent
		readers.  Similarly, if the hlist macros are being used,
		the hlist_del_rcu() primitive is required.

		The list_replace_rcu() and hlist_replace_rcu() primitives
		may be used to replace an old structure with a new one
		in their respective types of RCU-protected lists.

	d.	Rules similar to (4b) and (4c) apply to the "hlist_nulls"
		type of RCU-protected linked lists.

	e.	Updates must ensure that initialization of a given
		structure happens before pointers to that structure are
		publicized.  Use the rcu_assign_pointer() primitive
		when publicizing a pointer to a structure that can
		be traversed by an RCU read-side critical section.
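
	The sketch below combines rules (4b), (4c), and (4e) for a
	hypothetical RCU-protected list: the element is fully
	initialized before list_add_rcu() publishes it, and removal
	uses list_del_rcu() followed by a grace period before freeing::

		struct item {
			struct list_head link;
			int key;
		};

		static LIST_HEAD(item_list);
		static DEFINE_SPINLOCK(item_lock);

		void add_item(struct item *it, int key)
		{
			it->key = key;		/* Initialize first... */
			spin_lock(&item_lock);
			list_add_rcu(&it->link, &item_list);	/* ...then publish. */
			spin_unlock(&item_lock);
		}

		void del_item(struct item *it)
		{
			spin_lock(&item_lock);
			list_del_rcu(&it->link); /* No poisoning visible to readers. */
			spin_unlock(&item_lock);
			synchronize_rcu();
			kfree(it);
		}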

5.	If call_rcu() or call_srcu() is used, the callback function will
	be called from softirq context.  In particular, it cannot block.
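
	For example, a typical non-blocking callback simply frees the
	enclosing structure; this sketch assumes a hypothetical
	structure with an embedded rcu_head::

		struct item {
			struct rcu_head rcu;
			int key;
		};

		static void item_free_cb(struct rcu_head *head)
		{
			struct item *it = container_of(head, struct item, rcu);

			kfree(it);	/* OK: kfree() does not block. */
		}

		/* After unlinking "it" from all reader-visible paths: */
		/*	call_rcu(&it->rcu, item_free_cb);               */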

6.	Since synchronize_rcu() can block, it cannot be called
	from any sort of irq context.  The same rule applies
	for synchronize_srcu(), synchronize_rcu_expedited(), and
	synchronize_srcu_expedited().

	The expedited forms of these primitives have the same semantics
	as the non-expedited forms, but expediting is both expensive and
	(with the exception of synchronize_srcu_expedited()) unfriendly
	to real-time workloads.  Use of the expedited primitives should
	be restricted to rare configuration-change operations that would
	not normally be undertaken while a real-time workload is running.
	However, real-time workloads can use the rcupdate.rcu_normal
	kernel boot parameter to completely disable expedited grace
	periods, though this might have performance implications.

	In particular, if you find yourself invoking one of the expedited
	primitives repeatedly in a loop, please do everyone a favor:
	Restructure your code so that it batches the updates, allowing
	a single non-expedited primitive to cover the entire batch.
	This will very likely be faster than the loop containing the
	expedited primitive, and will be much easier on the rest of
	the system, especially on any real-time workloads running there.
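
	For example, instead of paying one grace period per removed
	element, removals can be batched under a single
	synchronize_rcu().  This sketch extends the earlier hypothetical
	struct item with a second list_head used only for updater-private
	queuing, because reusing ->link would disturb concurrent readers::

		struct item {
			struct list_head link;		/* On the RCU-protected list. */
			struct list_head pending;	/* Updater-private, post-removal. */
			int key;
		};

		void del_items(int bad_key)
		{
			struct item *it, *tmp;
			LIST_HEAD(doomed);

			spin_lock(&item_lock);
			list_for_each_entry_safe(it, tmp, &item_list, link) {
				if (it->key == bad_key) {
					list_del_rcu(&it->link);
					list_add(&it->pending, &doomed);
				}
			}
			spin_unlock(&item_lock);

			synchronize_rcu();	/* One grace period covers the batch. */

			list_for_each_entry_safe(it, tmp, &doomed, pending)
				kfree(it);
		}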

7.	As of v4.20, a given kernel implements only one RCU flavor,
	which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
	If the updater uses call_rcu() or synchronize_rcu(),
	then the corresponding readers may use rcu_read_lock() and
	rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
	or any pair of primitives that disables and re-enables preemption,
	for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
	If the updater uses synchronize_srcu() or call_srcu(),
	then the corresponding readers must use srcu_read_lock() and
	srcu_read_unlock(), and with the same srcu_struct.  The rules for
	the expedited primitives are the same as for their non-expedited
	counterparts.  Mixing things up will result in confusion and
	broken kernels, and has even resulted in an exploitable security
	issue.

	One exception to this rule: rcu_read_lock() and rcu_read_unlock()
	may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
	in cases where local bottom halves are already known to be
	disabled, for example, in irq or softirq context.  Commenting
	such cases is a must, of course!  And the jury is still out on
	whether the increased speed is worth it.

8.	Although synchronize_rcu() is slower than is call_rcu(), it
	usually results in simpler code.  So, unless update performance is
	critically important, the updaters cannot block, or the latency of
	synchronize_rcu() is visible from userspace, synchronize_rcu()
	should be used in preference to call_rcu().  Furthermore,
	kfree_rcu() usually results in even simpler code than does
	synchronize_rcu() without synchronize_rcu()'s multi-millisecond
	latency.  So please take advantage of kfree_rcu()'s "fire and
	forget" memory-freeing capabilities where it applies.
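
	For example, given a structure with an embedded rcu_head, the
	del_item() sketch from rule 4 could drop its explicit grace-period
	wait entirely (the struct and field names remain hypothetical)::

		struct item {
			struct list_head link;
			struct rcu_head rcu;
			int key;
		};

		void del_item(struct item *it)
		{
			spin_lock(&item_lock);
			list_del_rcu(&it->link);
			spin_unlock(&item_lock);
			kfree_rcu(it, rcu);	/* Fire and forget: no blocking. */
		}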

	An especially important property of the synchronize_rcu()
	primitive is that it automatically self-limits: if grace periods
	are delayed for whatever reason, then the synchronize_rcu()
	primitive will correspondingly delay updates.  In contrast,
	code using call_rcu() should explicitly limit update rate in
	cases where grace periods are delayed, as failing to do so can
	result in excessive realtime latencies or even OOM conditions.

	Ways of gaining this self-limiting property when using call_rcu()
	include:

	a.	Keeping a count of the number of data-structure elements
		used by the RCU-protected data structure, including
		those waiting for a grace period to elapse.  Enforce a
		limit on this number, stalling updates as needed to allow
		previously deferred frees to complete.  Alternatively,
		limit only the number awaiting deferred free rather than
		the total number of elements.

		One way to stall the updates is to acquire the update-side
		mutex.  (Don't try this with a spinlock -- other CPUs
		spinning on the lock could prevent the grace period
		from ever ending.)  Another way to stall the updates
		is for the updates to use a wrapper function around
		the memory allocator, so that this wrapper function
		simulates OOM when there is too much memory awaiting an
		RCU grace period.  There are of course many other
		variations on this theme.

	b.	Limiting update rate.  For example, if updates occur only
		once per hour, then no explicit rate limiting is
		required, unless your system is already badly broken.
		Older versions of the dcache subsystem take this approach,
		guarding updates with a global lock, limiting their rate.

	c.	Trusted update -- if updates can only be done manually by
		superuser or some other trusted user, then it might not
		be necessary to automatically limit them.  The theory
		here is that superuser already has lots of ways to crash
		the machine.

	d.	Periodically invoke synchronize_rcu(), permitting a limited
		number of updates per grace period.
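
	As an illustration of option (a), the following sketch counts
	elements awaiting deferred free and stalls updaters above a
	hypothetical limit, reusing the struct item from rule 8::

		#define MAX_PENDING_FREES	1000

		static atomic_t pending_frees = ATOMIC_INIT(0);

		static void item_free_cb(struct rcu_head *head)
		{
			kfree(container_of(head, struct item, rcu));
			atomic_dec(&pending_frees);
		}

		void del_item_limited(struct item *it)
		{
			/* Stall updates while too many frees are deferred. */
			while (atomic_read(&pending_frees) > MAX_PENDING_FREES)
				synchronize_rcu();

			spin_lock(&item_lock);
			list_del_rcu(&it->link);
			spin_unlock(&item_lock);
			atomic_inc(&pending_frees);
			call_rcu(&it->rcu, item_free_cb);
		}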

	The same cautions apply to call_srcu() and kfree_rcu().

	Note that although these primitives do take action to avoid memory
	exhaustion when any given CPU has too many callbacks, a determined
	user could still exhaust memory.  This is especially the case
	if a system with a large number of CPUs has been configured to
	offload all of its RCU callbacks onto a single CPU, or if the
	system has relatively little free memory.

9.	All RCU list-traversal primitives, which include
	rcu_dereference(), list_for_each_entry_rcu(), and
	list_for_each_safe_rcu(), must be either within an RCU read-side
	critical section or must be protected by appropriate update-side
	locks.  RCU read-side critical sections are delimited by
	rcu_read_lock() and rcu_read_unlock(), or by similar primitives
	such as rcu_read_lock_bh() and rcu_read_unlock_bh(), in which
	case the matching rcu_dereference() primitive must be used in
	order to keep lockdep happy, in this case, rcu_dereference_bh().
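
	For example, the matched _bh primitives pair up as in this
	short sketch (pointer and helper names hypothetical)::

		rcu_read_lock_bh();
		p = rcu_dereference_bh(my_bh_ptr);
		if (p)
			do_something_with(p);
		rcu_read_unlock_bh();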

	The reason that it is permissible to use RCU list-traversal
	primitives when the update-side lock is held is that doing so
	can be quite helpful in reducing code bloat when common code is
	shared between readers and updaters.  Additional primitives
	are provided for this case, as discussed in lockdep.txt.

10.	Conversely, if you are in an RCU read-side critical section,
	and you don't hold the appropriate update-side lock, you -must-
	use the "_rcu()" variants of the list macros.  Failing to do so
	will break Alpha, cause aggressive compilers to generate bad code,
	and confuse people trying to read your code.

11.	Any lock acquired by an RCU callback must be acquired elsewhere
	with softirq disabled, e.g., via spin_lock_irqsave(),
	spin_lock_bh(), etc.  Failing to disable softirq on a given
	acquisition of that lock will result in deadlock as soon as
	the RCU softirq handler happens to run your RCU callback while
	interrupting that acquisition's critical section.
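
	In other words, if an RCU callback takes a lock, every other
	acquisition of that lock must block softirq, as in this sketch
	(hypothetical lock and counter, struct item as in rule 8)::

		static DEFINE_SPINLOCK(stats_lock);
		static unsigned long nr_freed;

		/* Process context: softirq must be blocked across the lock. */
		void bump_stats(void)
		{
			spin_lock_bh(&stats_lock);
			nr_freed++;
			spin_unlock_bh(&stats_lock);
		}

		/* RCU callback: already in softirq, plain spin_lock() suffices. */
		static void my_rcu_cb(struct rcu_head *head)
		{
			spin_lock(&stats_lock);
			nr_freed++;
			spin_unlock(&stats_lock);
			kfree(container_of(head, struct item, rcu));
		}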

12.	RCU callbacks can be and are executed in parallel.  In many cases,
	the callback code is a simple wrapper around kfree(), so that this
	is not an issue (or, more accurately, to the extent that it is
	an issue, the memory-allocator locking handles it).  However,
	if the callbacks do manipulate a shared data structure, they
	must use whatever locking or other synchronization is required
	to safely access and/or modify that data structure.

	Do not assume that RCU callbacks will be executed on the same
	CPU that executed the corresponding call_rcu() or call_srcu().
	For example, if a given CPU goes offline while having an RCU
	callback pending, then that RCU callback will execute on some
	surviving CPU.  (If this was not the case, a self-spawning RCU
	callback would prevent the victim CPU from ever going offline.)
	Furthermore, CPUs designated by rcu_nocbs= might well -always-
	have their RCU callbacks executed on some other CPUs.  In fact,
	for some real-time workloads, this is the whole point of using
	the rcu_nocbs= kernel boot parameter.

13.	Unlike other forms of RCU, it -is- permissible to block in an
	SRCU read-side critical section (demarked by srcu_read_lock()
	and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
	Please note that if you don't need to sleep in read-side critical
	sections, you should be using RCU rather than SRCU, because RCU
	is almost always faster and easier to use than is SRCU.

	Also unlike other forms of RCU, explicit initialization and
	cleanup is required either at build time via DEFINE_SRCU()
	or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
	and cleanup_srcu_struct().  These last two are passed a
	"struct srcu_struct" that defines the scope of a given
	SRCU domain.  Once initialized, the srcu_struct is passed
	to srcu_read_lock(), srcu_read_unlock(), synchronize_srcu(),
	synchronize_srcu_expedited(), and call_srcu().  A given
	synchronize_srcu() waits only for SRCU read-side critical
	sections governed by srcu_read_lock() and srcu_read_unlock()
	calls that have been passed the same srcu_struct.  This property
	is what makes sleeping read-side critical sections tolerable --
	a given subsystem delays only its own updates, not those of other
	subsystems using SRCU.  Therefore, SRCU is less prone to OOM the
	system than RCU would be if RCU's read-side critical sections
	were permitted to sleep.

	The ability to sleep in read-side critical sections does not
	come for free.  First, corresponding srcu_read_lock() and
	srcu_read_unlock() calls must be passed the same srcu_struct.
	Second, grace-period-detection overhead is amortized only
	over those updates sharing a given srcu_struct, rather than
	being globally amortized as they are for other forms of RCU.
	Therefore, SRCU should be used in preference to rw_semaphore
	only in extremely read-intensive situations, or in situations
	requiring SRCU's read-side deadlock immunity or low read-side
	realtime latency.  You should also consider percpu_rw_semaphore
	when you need lightweight readers.

	SRCU's expedited primitive (synchronize_srcu_expedited())
	never sends IPIs to other CPUs, so it is easier on
	real-time workloads than is synchronize_rcu_expedited().

	Note that rcu_assign_pointer() relates to SRCU just as it does to
	other forms of RCU, but instead of rcu_dereference() you should
	use srcu_dereference() in order to avoid lockdep splats.
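
	Putting these pieces together, a minimal SRCU sketch might look
	like this, reusing the hypothetical struct config from rule 3
	(all other names are hypothetical as well)::

		DEFINE_STATIC_SRCU(my_srcu);

		static struct config __rcu *srcu_cfg;
		static DEFINE_MUTEX(cfg_mutex);

		void reader(void)
		{
			struct config *cfg;
			int idx;

			idx = srcu_read_lock(&my_srcu);
			cfg = srcu_dereference(srcu_cfg, &my_srcu);
			if (cfg)
				msleep(10);	/* Sleeping is legal under SRCU. */
			srcu_read_unlock(&my_srcu, idx);
		}

		void update(struct config *newc)
		{
			struct config *oldc;

			mutex_lock(&cfg_mutex);
			oldc = rcu_dereference_protected(srcu_cfg,
					lockdep_is_held(&cfg_mutex));
			rcu_assign_pointer(srcu_cfg, newc);
			mutex_unlock(&cfg_mutex);
			synchronize_srcu(&my_srcu); /* Waits only for my_srcu readers. */
			kfree(oldc);
		}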

14.	The whole point of call_rcu(), synchronize_rcu(), and friends
	is to wait until all pre-existing readers have finished before
	carrying out some otherwise-destructive operation.  It is
	therefore critically important to -first- remove any path
	that readers can follow that could be affected by the
	destructive operation, and -only- -then- invoke call_rcu(),
	synchronize_rcu(), or friends.

	Because these primitives only wait for pre-existing readers, it
	is the caller's responsibility to guarantee that any subsequent
	readers will execute safely.

15.	The various RCU read-side primitives do -not- necessarily contain
	memory barriers.  You should therefore plan for the CPU
	and the compiler to freely reorder code into and out of RCU
	read-side critical sections.  It is the responsibility of the
	RCU update-side primitives to deal with this.

	For SRCU readers, you can use smp_mb__after_srcu_read_unlock()
	immediately after an srcu_read_unlock() to get a full barrier.

16.	Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
	__rcu sparse checks to validate your RCU code.  These can help
	find problems as follows:

	CONFIG_PROVE_LOCKING:
		check that accesses to RCU-protected data
		structures are carried out under the proper RCU
		read-side critical section, while holding the right
		combination of locks, or whatever other conditions
		are appropriate.

	CONFIG_DEBUG_OBJECTS_RCU_HEAD:
		check that you don't pass the
		same object to call_rcu() (or friends) before an RCU
		grace period has elapsed since the last time that you
		passed that same object to call_rcu() (or friends).

	__rcu sparse checks:
		tag the pointer to the RCU-protected data
		structure with __rcu, and sparse will warn you if you
		access that pointer without the services of one of the
		variants of rcu_dereference().

	These debugging aids can help you find problems that are
	otherwise extremely difficult to spot.

17.	If you register a callback using call_rcu() or call_srcu(), and
	pass in a function defined within a loadable module, then it is
	necessary to wait for all pending callbacks to be invoked after
	the last invocation and before unloading that module.  Note that
	it is absolutely -not- sufficient to wait for a grace period!
	The current (say) synchronize_rcu() implementation is -not-
	guaranteed to wait for callbacks registered on other CPUs, or
	even on the current CPU if that CPU recently went offline and
	came back online.

	You instead need to use one of the barrier functions:

	-	call_rcu() -> rcu_barrier()
	-	call_srcu() -> srcu_barrier()

	However, these barrier functions are absolutely -not- guaranteed
	to wait for a grace period.  In fact, if there are no call_rcu()
	callbacks waiting anywhere in the system, rcu_barrier() is within
	its rights to return immediately.

	So if you need to wait for both an RCU grace period and for
	all pre-existing call_rcu() callbacks, you will need to execute
	both rcu_barrier() and synchronize_rcu(), if necessary, using
	something like workqueues to execute them concurrently.
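
	A minimal module-exit sketch might therefore look like this
	(remove_all_items() is a hypothetical stand-in for whatever
	unlinks this module's data and stops new call_rcu() invocations)::

		static void __exit my_module_exit(void)
		{
			/* 1. Stop queueing new callbacks and unlink data. */
			remove_all_items();

			/* 2. Wait for all previously queued callbacks, which
			 *    might reference this module's code, to finish.
			 */
			rcu_barrier();
		}
		module_exit(my_module_exit);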

	See rcubarrier.txt for more information.