			 ============================
			 LINUX KERNEL MEMORY BARRIERS
			 ============================

By: David Howells <dhowells@redhat.com>
    Paul E. McKenney <paulmck@linux.ibm.com>
    Will Deacon <will.deacon@arm.com>
    Peter Zijlstra <peterz@infradead.org>

==========
DISCLAIMER
==========

This document is not a specification; it is intentionally (for the sake of
brevity) and unintentionally (due to being human) incomplete. This document is
meant as a guide to using the various memory barriers provided by Linux, but
in case of any doubt (and there are many) please ask.  Some doubts may be
resolved by referring to the formal memory consistency model and related
documentation at tools/memory-model/.  Nevertheless, even this memory
model should be viewed as the collective opinion of its maintainers rather
than as an infallible oracle.

To repeat, this document is not a specification of what Linux expects from
hardware.

The purpose of this document is twofold:

 (1) to specify the minimum functionality that one can rely on for any
     particular barrier, and

 (2) to provide a guide as to how to use the barriers that are available.

Note that an architecture can provide more than the minimum requirement
for any particular barrier, but if the architecture provides less than
that, that architecture is incorrect.

Note also that it is possible that a barrier may be a no-op for an
architecture because the way that arch works renders an explicit barrier
unnecessary in that case.


========
CONTENTS
========

 (*) Abstract memory access model.

     - Device operations.
     - Guarantees.

 (*) What are memory barriers?

     - Varieties of memory barrier.
     - What may not be assumed about memory barriers?
     - Data dependency barriers (historical).
     - Control dependencies.
     - SMP barrier pairing.
     - Examples of memory barrier sequences.
     - Read memory barriers vs load speculation.
     - Multicopy atomicity.

 (*) Explicit kernel barriers.

     - Compiler barrier.
     - CPU memory barriers.

 (*) Implicit kernel memory barriers.

     - Lock acquisition functions.
     - Interrupt disabling functions.
     - Sleep and wake-up functions.
     - Miscellaneous functions.

 (*) Inter-CPU acquiring barrier effects.

     - Acquires vs memory accesses.

 (*) Where are memory barriers needed?

     - Interprocessor interaction.
     - Atomic operations.
     - Accessing devices.
     - Interrupts.

 (*) Kernel I/O barrier effects.

 (*) Assumed minimum execution ordering model.

 (*) The effects of the CPU cache.

     - Cache coherency.
     - Cache coherency vs DMA.
     - Cache coherency vs MMIO.

 (*) The things CPUs get up to.

     - And then there's the Alpha.
     - Virtual Machine Guests.

 (*) Example uses.

     - Circular buffers.

 (*) References.


============================
ABSTRACT MEMORY ACCESS MODEL
============================

Consider the following abstract model of the system:

		            :                :
		            :                :
		            :                :
		+-------+   :   +--------+   :   +-------+
		|       |   :   |        |   :   |       |
		|       |   :   |        |   :   |       |
		| CPU 1 |<----->| Memory |<----->| CPU 2 |
		|       |   :   |        |   :   |       |
		|       |   :   |        |   :   |       |
		+-------+   :   +--------+   :   +-------+
		    ^       :       ^        :       ^
		    |       :       |        :       |
		    |       :       |        :       |
		    |       :       v        :       |
		    |       :   +--------+   :       |
		    |       :   |        |   :       |
		    |       :   |        |   :       |
		    +---------->| Device |<----------+
		            :   |        |   :
		            :   |        |   :
		            :   +--------+   :
		            :                :

Each CPU executes a program that generates memory access operations.  In the
abstract CPU, memory operation ordering is very relaxed, and a CPU may actually
perform the memory operations in any order it likes, provided program causality
appears to be maintained.  Similarly, the compiler may also arrange the
instructions it emits in any order it likes, provided it doesn't affect the
apparent operation of the program.

So in the above diagram, the effects of the memory operations performed by a
CPU are perceived by the rest of the system as the operations cross the
interface between the CPU and rest of the system (the dotted lines).


For example, consider the following sequence of events:

	CPU 1		CPU 2
	===============	===============
	{ A == 1; B == 2 }
	A = 3;		x = B;
	B = 4;		y = A;

The set of accesses as seen by the memory system in the middle can be arranged
in 24 different combinations:

	STORE A=3,	STORE B=4,	y=LOAD A->3,	x=LOAD B->4
	STORE A=3,	STORE B=4,	x=LOAD B->4,	y=LOAD A->3
	STORE A=3,	y=LOAD A->3,	STORE B=4,	x=LOAD B->4
	STORE A=3,	y=LOAD A->3,	x=LOAD B->2,	STORE B=4
	STORE A=3,	x=LOAD B->2,	STORE B=4,	y=LOAD A->3
	STORE A=3,	x=LOAD B->2,	y=LOAD A->3,	STORE B=4
	STORE B=4,	STORE A=3,	y=LOAD A->3,	x=LOAD B->4
	STORE B=4, ...
	...

and can thus result in four different combinations of values:

	x == 2, y == 1
	x == 2, y == 3
	x == 4, y == 1
	x == 4, y == 3


Furthermore, the stores committed by a CPU to the memory system may not be
perceived by the loads made by another CPU in the same order as the stores were
committed.


As a further example, consider this sequence of events:

	CPU 1		CPU 2
	===============	===============
	{ A == 1, B == 2, C == 3, P == &A, Q == &C }
	B = 4;		Q = P;
	P = &B;		D = *Q;

There is an obvious data dependency here, as the value loaded into D depends on
the address retrieved from P by CPU 2.  At the end of the sequence, any of the
following results are possible:

	(Q == &A) and (D == 1)
	(Q == &B) and (D == 2)
	(Q == &B) and (D == 4)

Note that CPU 2 will never try and load C into D because the CPU will load P
into Q before issuing the load of *Q.


DEVICE OPERATIONS
-----------------

Some devices present their control interfaces as collections of memory
locations, but the order in which the control registers are accessed is very
important.  For instance, imagine an ethernet card with a set of internal
registers that are accessed through an address port register (A) and a data
port register (D).  To read internal register 5, the following code might then
be used:

	*A = 5;
	x = *D;

but this might show up as either of the following two sequences:

	STORE *A = 5, x = LOAD *D
	x = LOAD *D, STORE *A = 5

the second of which will almost certainly result in a malfunction, since it set
the address _after_ attempting to read the register.

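
In real kernel code, memory-mapped device registers are normally accessed
through accessor functions such as readl() and writel(), which among other
things keep accesses to the same peripheral in program order.  Here is a
minimal sketch of the register read above, assuming a hypothetical card whose
address-port and data-port registers sit at offsets ADDR_PORT and DATA_PORT
from an ioremap()ed base (the offsets and function are illustrative only):

	/* Hypothetical register offsets, for illustration only. */
	#define ADDR_PORT	0x00
	#define DATA_PORT	0x04

	static u32 card_read_reg(void __iomem *base, u32 reg)
	{
		writel(reg, base + ADDR_PORT);	/* select internal register */
		return readl(base + DATA_PORT);	/* ordered after the writel() */
	}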


GUARANTEES
----------

There are some minimal guarantees that may be expected of a CPU:

 (*) On any given CPU, dependent memory accesses will be issued in order, with
     respect to itself.  This means that for:

	Q = READ_ONCE(P); D = READ_ONCE(*Q);

     the CPU will issue the following memory operations:

	Q = LOAD P, D = LOAD *Q

     and always in that order.  However, on DEC Alpha, READ_ONCE() also
     emits a memory-barrier instruction, so that a DEC Alpha CPU will
     instead issue the following memory operations:

	Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER

     Whether on DEC Alpha or not, the READ_ONCE() also prevents compiler
     mischief.

 (*) Overlapping loads and stores within a particular CPU will appear to be
     ordered within that CPU.  This means that for:

	a = READ_ONCE(*X); WRITE_ONCE(*X, b);

     the CPU will only issue the following sequence of memory operations:

	a = LOAD *X, STORE *X = b

     And for:

	WRITE_ONCE(*X, c); d = READ_ONCE(*X);

     the CPU will only issue:

	STORE *X = c, d = LOAD *X

     (Loads and stores overlap if they are targeted at overlapping pieces of
     memory).

And there are a number of things that _must_ or _must_not_ be assumed:

 (*) It _must_not_ be assumed that the compiler will do what you want
     with memory references that are not protected by READ_ONCE() and
     WRITE_ONCE().  Without them, the compiler is within its rights to
     do all sorts of "creative" transformations, which are covered in
     the COMPILER BARRIER section.

 (*) It _must_not_ be assumed that independent loads and stores will be issued
     in the order given.  This means that for:

	X = *A; Y = *B; *D = Z;

     we may get any of the following sequences:

	X = LOAD *A,  Y = LOAD *B,  STORE *D = Z
	X = LOAD *A,  STORE *D = Z, Y = LOAD *B
	Y = LOAD *B,  X = LOAD *A,  STORE *D = Z
	Y = LOAD *B,  STORE *D = Z, X = LOAD *A
	STORE *D = Z, X = LOAD *A,  Y = LOAD *B
	STORE *D = Z, Y = LOAD *B,  X = LOAD *A

 (*) It _must_ be assumed that overlapping memory accesses may be merged or
     discarded.  This means that for:

	X = *A; Y = *(A + 4);

     we may get any one of the following sequences:

	X = LOAD *A; Y = LOAD *(A + 4);
	Y = LOAD *(A + 4); X = LOAD *A;
	{X, Y} = LOAD {*A, *(A + 4) };

     And for:

	*A = X; *(A + 4) = Y;

     we may get any of:

	STORE *A = X; STORE *(A + 4) = Y;
	STORE *(A + 4) = Y; STORE *A = X;
	STORE {*A, *(A + 4) } = {X, Y};

And there are anti-guarantees:

 (*) These guarantees do not apply to bitfields, because compilers often
     generate code to modify these using non-atomic read-modify-write
     sequences.  Do not attempt to use bitfields to synchronize parallel
     algorithms.

 (*) Even in cases where bitfields are protected by locks, all fields
     in a given bitfield must be protected by one lock.  If two fields
     in a given bitfield are protected by different locks, the compiler's
     non-atomic read-modify-write sequences can cause an update to one
     field to corrupt the value of an adjacent field (see the sketch
     after this list).

 (*) These guarantees apply only to properly aligned and sized scalar
     variables.  "Properly sized" currently means variables that are
     the same size as "char", "short", "int" and "long".  "Properly
     aligned" means the natural alignment, thus no constraints for
     "char", two-byte alignment for "short", four-byte alignment for
     "int", and either four-byte or eight-byte alignment for "long",
     on 32-bit and 64-bit systems, respectively.  Note that these
     guarantees were introduced into the C11 standard, so beware when
     using older pre-C11 compilers (for example, gcc 4.6).  The portion
     of the standard containing this guarantee is Section 3.14, which
     defines "memory location" as follows:

	memory location
		either an object of scalar type, or a maximal sequence
		of adjacent bit-fields all having nonzero width

		NOTE 1: Two threads of execution can update and access
		separate memory locations without interfering with
		each other.

		NOTE 2: A bit-field and an adjacent non-bit-field member
		are in separate memory locations. The same applies
		to two bit-fields, if one is declared inside a nested
		structure declaration and the other is not, or if the two
		are separated by a zero-length bit-field declaration,
		or if they are separated by a non-bit-field member
		declaration. It is not safe to concurrently update two
		bit-fields in the same structure if all members declared
		between them are also bit-fields, no matter what the
		sizes of those intervening bit-fields happen to be.

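
As an illustration of the bitfield anti-guarantees above, consider this
sketch (the structure and locks are hypothetical, for illustration only):

	struct foo {
		int f1:8;	/* supposedly protected by lock1 */
		int f2:8;	/* supposedly protected by lock2 */
	};

The compiler is entitled to implement "foo.f1 = 1" as a load of the whole
word containing both fields, a modification of the f1 bits, and a store of
the whole word back.  If CPU 1 updates f1 under lock1 while CPU 2 updates f2
under lock2, either CPU's write-back of the shared word can overwrite the
other's update, even though each field is "protected" by its own lock.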

=========================
WHAT ARE MEMORY BARRIERS?
=========================

As can be seen above, independent memory operations are effectively performed
in random order, but this can be a problem for CPU-CPU interaction and for I/O.
What is required is some way of intervening to instruct the compiler and the
CPU to restrict the order.

Memory barriers are such interventions.  They impose a perceived partial
ordering over the memory operations on either side of the barrier.

Such enforcement is important because the CPUs and other devices in a system
can use a variety of tricks to improve performance, including reordering,
deferral and combination of memory operations; speculative loads; speculative
branch prediction and various types of caching.  Memory barriers are used to
override or suppress these tricks, allowing the code to sanely control the
interaction of multiple CPUs and/or devices.


VARIETIES OF MEMORY BARRIER
---------------------------

Memory barriers come in four basic varieties:

 (1) Write (or store) memory barriers.

     A write memory barrier gives a guarantee that all the STORE operations
     specified before the barrier will appear to happen before all the STORE
     operations specified after the barrier with respect to the other
     components of the system.

     A write barrier is a partial ordering on stores only; it is not required
     to have any effect on loads.

     A CPU can be viewed as committing a sequence of store operations to the
     memory system as time progresses.  All stores _before_ a write barrier
     will occur _before_ all the stores after the write barrier.

     [!] Note that write barriers should normally be paired with read or data
     dependency barriers; see the "SMP barrier pairing" subsection.


 (2) Data dependency barriers.

     A data dependency barrier is a weaker form of read barrier.  In the case
     where two loads are performed such that the second depends on the result
     of the first (eg: the first load retrieves the address to which the second
     load will be directed), a data dependency barrier would be required to
     make sure that the target of the second load is updated after the address
     obtained by the first load is accessed.

     A data dependency barrier is a partial ordering on interdependent loads
     only; it is not required to have any effect on stores, independent loads
     or overlapping loads.

     As mentioned in (1), the other CPUs in the system can be viewed as
     committing sequences of stores to the memory system that the CPU being
     considered can then perceive.  A data dependency barrier issued by the CPU
     under consideration guarantees that for any load preceding it, if that
     load touches one of a sequence of stores from another CPU, then by the
     time the barrier completes, the effects of all the stores prior to that
     touched by the load will be perceptible to any loads issued after the data
     dependency barrier.

     See the "Examples of memory barrier sequences" subsection for diagrams
     showing the ordering constraints.

     [!] Note that the first load really has to have a _data_ dependency and
     not a control dependency.  If the address for the second load is dependent
     on the first load, but the dependency is through a conditional rather than
     actually loading the address itself, then it's a _control_ dependency and
     a full read barrier or better is required.  See the "Control dependencies"
     subsection for more information.

     [!] Note that data dependency barriers should normally be paired with
     write barriers; see the "SMP barrier pairing" subsection.


 (3) Read (or load) memory barriers.

     A read barrier is a data dependency barrier plus a guarantee that all the
     LOAD operations specified before the barrier will appear to happen before
     all the LOAD operations specified after the barrier with respect to the
     other components of the system.

     A read barrier is a partial ordering on loads only; it is not required to
     have any effect on stores.

     Read memory barriers imply data dependency barriers, and so can substitute
     for them.

     [!] Note that read barriers should normally be paired with write barriers;
     see the "SMP barrier pairing" subsection.


 (4) General memory barriers.

     A general memory barrier gives a guarantee that all the LOAD and STORE
     operations specified before the barrier will appear to happen before all
     the LOAD and STORE operations specified after the barrier with respect to
     the other components of the system.

     A general memory barrier is a partial ordering over both loads and stores.

     General memory barriers imply both read and write memory barriers, and so
     can substitute for either.

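
As a concrete taste of varieties (1) and (3), consider a minimal
producer/consumer sketch using the kernel's smp_wmb() and smp_rmb() (the
shared variables "data" and "ready" are hypothetical; barrier pairing is
covered in the "SMP barrier pairing" subsection):

	CPU 1				CPU 2
	===============			===============
	WRITE_ONCE(data, 42);
	smp_wmb();
	WRITE_ONCE(ready, 1);
					r1 = READ_ONCE(ready);
					smp_rmb();
					r2 = READ_ONCE(data);

If r1 turns out to be 1, the write barrier on CPU 1 paired with the read
barrier on CPU 2 guarantees that r2 will be 42.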

And a couple of implicit varieties:

 (5) ACQUIRE operations.

     This acts as a one-way permeable barrier.  It guarantees that all memory
     operations after the ACQUIRE operation will appear to happen after the
     ACQUIRE operation with respect to the other components of the system.
     ACQUIRE operations include LOCK operations and both smp_load_acquire()
     and smp_cond_load_acquire() operations.

     Memory operations that occur before an ACQUIRE operation may appear to
     happen after it completes.

     An ACQUIRE operation should almost always be paired with a RELEASE
     operation.


 (6) RELEASE operations.

     This also acts as a one-way permeable barrier.  It guarantees that all
     memory operations before the RELEASE operation will appear to happen
     before the RELEASE operation with respect to the other components of the
     system. RELEASE operations include UNLOCK operations and
     smp_store_release() operations.

     Memory operations that occur after a RELEASE operation may appear to
     happen before it completes.

     The use of ACQUIRE and RELEASE operations generally precludes the need
     for other sorts of memory barrier.  In addition, a RELEASE+ACQUIRE pair is
     -not- guaranteed to act as a full memory barrier.  However, after an
     ACQUIRE on a given variable, all memory accesses preceding any prior
     RELEASE on that same variable are guaranteed to be visible.  In other
     words, within a given variable's critical section, all accesses of all
     previous critical sections for that variable are guaranteed to have
     completed.

     This means that ACQUIRE acts as a minimal "acquire" operation and
     RELEASE acts as a minimal "release" operation.

A subset of the atomic operations described in atomic_t.txt have ACQUIRE and
RELEASE variants in addition to fully-ordered and relaxed (no barrier
semantics) definitions.  For compound atomics performing both a load and a
store, ACQUIRE semantics apply only to the load and RELEASE semantics apply
only to the store portion of the operation.
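
For example, the producer/consumer sketch shown earlier can be rewritten with
an ACQUIRE/RELEASE pair: smp_store_release() gives the store RELEASE
semantics and smp_load_acquire() gives the load ACQUIRE semantics (again,
"data" and "ready" are hypothetical shared variables):

	CPU 1				CPU 2
	===============			===============
	WRITE_ONCE(data, 42);
	smp_store_release(&ready, 1);
					r1 = smp_load_acquire(&ready);
					r2 = READ_ONCE(data);

As before, if r1 turns out to be 1, then r2 is guaranteed to be 42.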

Memory barriers are only required where there's a possibility of interaction
between two CPUs or between a CPU and a device.  If it can be guaranteed that
there won't be any such interaction in any particular piece of code, then
memory barriers are unnecessary in that piece of code.


Note that these are the _minimum_ guarantees.  Different architectures may give
more substantial guarantees, but they may _not_ be relied upon outside of arch
specific code.


WHAT MAY NOT BE ASSUMED ABOUT MEMORY BARRIERS?
----------------------------------------------

There are certain things that the Linux kernel memory barriers do not guarantee:

 (*) There is no guarantee that any of the memory accesses specified before a
     memory barrier will be _complete_ by the completion of a memory barrier
     instruction; the barrier can be considered to draw a line in that CPU's
     access queue that accesses of the appropriate type may not cross.

 (*) There is no guarantee that issuing a memory barrier on one CPU will have
     any direct effect on another CPU or any other hardware in the system.  The
     indirect effect will be the order in which the second CPU sees the effects
     of the first CPU's accesses occur, but see the next point:

 (*) There is no guarantee that a CPU will see the correct order of effects
     from a second CPU's accesses, even _if_ the second CPU uses a memory
     barrier, unless the first CPU _also_ uses a matching memory barrier (see
     the subsection on "SMP Barrier Pairing").

 (*) There is no guarantee that some intervening piece of off-the-CPU
     hardware[*] will not reorder the memory accesses.  CPU cache coherency
     mechanisms should propagate the indirect effects of a memory barrier
     between CPUs, but might not do so in order.

	[*] For information on bus mastering DMA and coherency please read:

	    Documentation/driver-api/pci/pci.rst
	    Documentation/core-api/dma-api-howto.rst
	    Documentation/core-api/dma-api.rst


DATA DEPENDENCY BARRIERS (HISTORICAL)
-------------------------------------

As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for
DEC Alpha, which means that about the only people who need to pay attention
to this section are those working on DEC Alpha architecture-specific code
and those working on READ_ONCE() itself.  For those who need it, and for
those who are interested in the history, here is the story of
data-dependency barriers.

The usage requirements of data dependency barriers are a little subtle, and
it's not always obvious that they're needed.  To illustrate, consider the
following sequence of events:

	CPU 1		      CPU 2
	===============	      ===============
	{ A == 1, B == 2, C == 3, P == &A, Q == &C }
	B = 4;
	<write barrier>
	WRITE_ONCE(P, &B);
			      Q = READ_ONCE(P);
			      D = *Q;

There's a clear data dependency here, and it would seem that by the end of the
sequence, Q must be either &A or &B, and that:

	(Q == &A) implies (D == 1)
	(Q == &B) implies (D == 4)

But!  CPU 2's perception of P may be updated _before_ its perception of B, thus
leading to the following situation:

	(Q == &B) and (D == 2) ????

While this may seem like a failure of coherency or causality maintenance, it
isn't, and this behaviour can be observed on certain real CPUs (such as the DEC
Alpha).

To deal with this, a data dependency barrier or better must be inserted
between the address load and the data load:

	CPU 1		      CPU 2
	===============	      ===============
	{ A == 1, B == 2, C == 3, P == &A, Q == &C }
	B = 4;
	<write barrier>
	WRITE_ONCE(P, &B);
			      Q = READ_ONCE(P);
			      <data dependency barrier>
			      D = *Q;

This enforces the occurrence of one of the two implications, and prevents the
third possibility from arising.


[!] Note that this extremely counterintuitive situation arises most easily on
machines with split caches, so that, for example, one cache bank processes
even-numbered cache lines and the other bank processes odd-numbered cache
lines.  The pointer P might be stored in an odd-numbered cache line, and the
variable B might be stored in an even-numbered cache line.  Then, if the
even-numbered bank of the reading CPU's cache is extremely busy while the
odd-numbered bank is idle, one can see the new value of the pointer P (&B),
but the old value of the variable B (2).


A data-dependency barrier is not required to order dependent writes
because the CPUs that the Linux kernel supports don't do writes
until they are certain (1) that the write will actually happen, (2)
of the location of the write, and (3) of the value to be written.
But please carefully read the "CONTROL DEPENDENCIES" section and the
Documentation/RCU/rcu_dereference.rst file:  The compiler can and does
break dependencies in a great many highly creative ways.

	CPU 1		      CPU 2
	===============	      ===============
	{ A == 1, B == 2, C == 3, P == &A, Q == &C }
	B = 4;
	<write barrier>
	WRITE_ONCE(P, &B);
			      Q = READ_ONCE(P);
			      WRITE_ONCE(*Q, 5);

Therefore, no data-dependency barrier is required to order the read into
Q with the store into *Q.  In other words, this outcome is prohibited,
even without a data-dependency barrier:

	(Q == &B) && (B == 4)

Please note that this pattern should be rare.  After all, the whole point
of dependency ordering is to -prevent- writes to the data structure, along
with the expensive cache misses associated with those writes.  This pattern
can be used to record rare error conditions and the like, and the CPUs'
naturally occurring ordering prevents such records from being lost.


Note well that the ordering provided by a data dependency is local to
the CPU containing it.  See the section on "Multicopy atomicity" for
more information.


The data dependency barrier is very important to the RCU system,
for example.  See rcu_assign_pointer() and rcu_dereference() in
include/linux/rcupdate.h.  This permits the current target of an RCU'd
pointer to be replaced with a new modified target, without the replacement
target appearing to be incompletely initialised.

See also the subsection on "Cache Coherency" for a more thorough example.
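
For instance, here is a minimal sketch of the rcu_assign_pointer()/
rcu_dereference() idiom ("gp" and the structure are hypothetical;
rcu_assign_pointer() supplies the barrier on the write side, and
rcu_dereference() supplies the dependency ordering on the read side;
read-side RCU locking and memory reclamation are omitted for brevity):

	struct foo { int a; };
	struct foo *gp;		/* hypothetical RCU-protected pointer */

	CPU 1 (updater)	      CPU 2 (reader)
	===============	      ===============
	p->a = 42;
	rcu_assign_pointer(gp, p);
			      q = rcu_dereference(gp);
			      if (q != NULL)
				      d = q->a;	/* never sees pre-init garbage */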


CONTROL DEPENDENCIES
--------------------

Control dependencies can be a bit tricky because current compilers do
not understand them.  The purpose of this section is to help you prevent
the compiler's ignorance from breaking your code.

A load-load control dependency requires a full read memory barrier, not
simply a data dependency barrier to make it work correctly.  Consider the
following bit of code:

	q = READ_ONCE(a);
	if (q) {
		<data dependency barrier>  /* BUG: No data dependency!!! */
		p = READ_ONCE(b);
	}

This will not have the desired effect because there is no actual data
dependency, but rather a control dependency that the CPU may short-circuit
by attempting to predict the outcome in advance, so that other CPUs see
the load from b as having happened before the load from a.  In such a
case what's actually required is:

	q = READ_ONCE(a);
	if (q) {
		<read barrier>
		p = READ_ONCE(b);
	}

However, stores are not speculated.  This means that ordering -is- provided
for load-store control dependencies, as in the following example:

	q = READ_ONCE(a);
	if (q) {
		WRITE_ONCE(b, 1);
	}

Control dependencies pair normally with other types of barriers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  700) That said, please note that neither READ_ONCE() nor WRITE_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  701) are optional! Without the READ_ONCE(), the compiler might combine the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  702) load from 'a' with other loads from 'a'.  Without the WRITE_ONCE(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  703) the compiler might combine the store to 'b' with other stores to 'b'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  704) Either can result in highly counterintuitive effects on ordering.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  705) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  706) Worse yet, if the compiler is able to prove (say) that the value of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  707) variable 'a' is always non-zero, it would be well within its rights
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  708) to optimize the original example by eliminating the "if" statement
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  709) as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  710) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  711) 	q = a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  712) 	b = 1;  /* BUG: Compiler and CPU can both reorder!!! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  713) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  714) So don't leave out the READ_ONCE().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  715) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  716) It is tempting to try to enforce ordering on identical stores on both
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  717) branches of the "if" statement as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  718) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  719) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  720) 	if (q) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  721) 		barrier();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  722) 		WRITE_ONCE(b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  723) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  724) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  725) 		barrier();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  726) 		WRITE_ONCE(b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  727) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  728) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  729) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  730) Unfortunately, current compilers will transform this as follows at high
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  731) optimization levels:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  732) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  733) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  734) 	barrier();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  735) 	WRITE_ONCE(b, 1);  /* BUG: No ordering vs. load from a!!! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  736) 	if (q) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  737) 		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  738) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  739) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  740) 		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  741) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  742) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  743) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  744) Now there is no conditional between the load from 'a' and the store to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  745) 'b', which means that the CPU is within its rights to reorder them:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  746) The conditional is absolutely required, and must be present in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  747) assembly code even after all compiler optimizations have been applied.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  748) Therefore, if you need ordering in this example, you need explicit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  749) memory barriers, for example, smp_store_release():
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  750) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  751) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  752) 	if (q) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  753) 		smp_store_release(&b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  754) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  755) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  756) 		smp_store_release(&b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  757) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  758) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  759) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  760) In contrast, without explicit memory barriers, two-legged-if control
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  761) ordering is guaranteed only when the stores differ, for example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  762) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  763) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  764) 	if (q) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  765) 		WRITE_ONCE(b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  766) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  767) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  768) 		WRITE_ONCE(b, 2);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  769) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  770) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  771) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  772) The initial READ_ONCE() is still required to prevent the compiler from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  773) proving the value of 'a'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  774) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  775) In addition, you need to be careful what you do with the local variable 'q',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  776) otherwise the compiler might be able to guess the value and again remove
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  777) the needed conditional.  For example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  778) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  779) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  780) 	if (q % MAX) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  781) 		WRITE_ONCE(b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  782) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  783) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  784) 		WRITE_ONCE(b, 2);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  785) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  786) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  787) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  788) If MAX is defined to be 1, then the compiler knows that (q % MAX) is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  789) equal to zero, in which case the compiler is within its rights to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  790) transform the above code into the following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  791) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  792) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  793) 	WRITE_ONCE(b, 2);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  794) 	do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  795) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  796) Given this transformation, the CPU is not required to respect the ordering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  797) between the load from variable 'a' and the store to variable 'b'.  It is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  798) tempting to add a barrier(), but this does not help.  The conditional
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  799) is gone, and the barrier won't bring it back.  Therefore, if you are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  800) relying on this ordering, you should make sure that MAX is greater than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  801) one, perhaps as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  802) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  803) 	q = READ_ONCE(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  804) 	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  805) 	if (q % MAX) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  806) 		WRITE_ONCE(b, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  807) 		do_something();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  808) 	} else {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  809) 		WRITE_ONCE(b, 2);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  810) 		do_something_else();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  811) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  812) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  813) Please note once again that the stores to 'b' differ.  If they were
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  814) identical, as noted earlier, the compiler could pull this store outside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  815) of the 'if' statement.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  816) 
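To make that hazard concrete, here is a sketch of the transformation an
optimizing compiler would be entitled to perform if both legs stored the
same value (illustrative output, not any particular compiler's):

	q = READ_ONCE(a);
	WRITE_ONCE(b, 1);	/* BUG: hoisted out of the conditional. */
	if (q)
		do_something();
	else
		do_something_else();

As before, nothing then orders the load from 'a' against the store to 'b'.
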
You must also be careful not to rely too much on boolean short-circuit
evaluation.  Consider this example:

	q = READ_ONCE(a);
	if (q || 1 > 0)
		WRITE_ONCE(b, 1);

Because the first condition cannot fault and the second condition is
always true, the compiler can transform this example as follows,
defeating the control dependency:

	q = READ_ONCE(a);
	WRITE_ONCE(b, 1);

This example underscores the need to ensure that the compiler cannot
out-guess your code.  More generally, although READ_ONCE() does force
the compiler to actually emit code for a given load, it does not force
the compiler to use the results.

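For example, in the following sketch the load from 'a' is emitted, but
because its value never feeds a conditional, no control dependency (and
hence no ordering) results:

	q = READ_ONCE(a);	/* Load is emitted... */
	WRITE_ONCE(b, 1);	/* ...but nothing orders this store after it. */
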
In addition, control dependencies apply only to the then-clause and
else-clause of the if-statement in question.  In particular, they do
not necessarily apply to code following the if-statement:

	q = READ_ONCE(a);
	if (q) {
		WRITE_ONCE(b, 1);
	} else {
		WRITE_ONCE(b, 2);
	}
	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from 'a'. */

It is tempting to argue that there in fact is ordering because the
compiler cannot reorder volatile accesses and also cannot reorder
the writes to 'b' with the condition.  Unfortunately for this line
of reasoning, the compiler might compile the two writes to 'b' as
conditional-move instructions, as in this fanciful pseudo-assembly
language:

	ld r1,a
	cmp r1,$0
	cmov,ne r4,$1
	cmov,eq r4,$2
	st r4,b
	st $1,c

A weakly ordered CPU would have no dependency of any sort between the load
from 'a' and the store to 'c'.  The control dependencies would extend
only to the pair of cmov instructions and the store depending on them.
In short, control dependencies apply only to the stores in the then-clause
and else-clause of the if-statement in question (including functions
invoked by those two clauses), not to code following that if-statement.


Note well that the ordering provided by a control dependency is local
to the CPU containing it.  See the section on "Multicopy atomicity"
for more information.


In summary:

  (*) Control dependencies can order prior loads against later stores.
      However, they do -not- guarantee any other sort of ordering:
      Not prior loads against later loads, nor prior stores against
      later anything.  If you need these other forms of ordering,
      use smp_rmb(), smp_wmb(), or, in the case of prior stores and
      later loads, smp_mb().

  (*) If both legs of the "if" statement begin with identical stores to
      the same variable, then those stores must be ordered, either by
      preceding both of them with smp_mb() or by using smp_store_release()
      to carry out the stores.  Please note that it is -not- sufficient
      to use barrier() at the beginning of each leg of the "if" statement
      because, as shown by the example above, optimizing compilers can
      destroy the control dependency while respecting the letter of the
      barrier() law.

  (*) Control dependencies require at least one run-time conditional
      between the prior load and the subsequent store, and this
      conditional must involve the prior load.  If the compiler is able
      to optimize the conditional away, it will have also optimized
      away the ordering.  Careful use of READ_ONCE() and WRITE_ONCE()
      can help to preserve the needed conditional.

  (*) Control dependencies require that the compiler avoid reordering the
      dependency into nonexistence.  Careful use of READ_ONCE() or
      atomic{,64}_read() can help to preserve your control dependency.
      Please see the COMPILER BARRIER section for more information.

  (*) Control dependencies apply only to the then-clause and else-clause
      of the if-statement containing the control dependency, including
      any functions that these two clauses call.  Control dependencies
      do -not- apply to code following the if-statement containing the
      control dependency.

  (*) Control dependencies pair normally with other types of barriers.

  (*) Control dependencies do -not- provide multicopy atomicity.  If you
      need all the CPUs to see a given store at the same time, use smp_mb().

  (*) Compilers do not understand control dependencies.  It is therefore
      your job to ensure that they do not break your code.


SMP BARRIER PAIRING
-------------------

When dealing with CPU-CPU interactions, certain types of memory barrier should
always be paired.  A lack of appropriate pairing is almost certainly an error.

General barriers pair with each other, though they also pair with most
other types of barriers, albeit without multicopy atomicity.  An acquire
barrier pairs with a release barrier, but both may also pair with other
barriers, including of course general barriers.  A write barrier pairs
with a data dependency barrier, a control dependency, an acquire barrier,
a release barrier, a read barrier, or a general barrier.  Similarly a
read barrier, control dependency, or a data dependency barrier pairs
with a write barrier, an acquire barrier, a release barrier, or a
general barrier:

	CPU 1		      CPU 2
	===============	      ===============
	WRITE_ONCE(a, 1);
	<write barrier>
	WRITE_ONCE(b, 2);     x = READ_ONCE(b);
			      <read barrier>
			      y = READ_ONCE(a);

Or:

	CPU 1		      CPU 2
	===============	      ===============================
	a = 1;
	<write barrier>
	WRITE_ONCE(b, &a);    x = READ_ONCE(b);
			      <data dependency barrier>
			      y = *x;

Or even:

	CPU 1		      CPU 2
	===============	      ===============================
	r1 = READ_ONCE(y);
	<general barrier>
	WRITE_ONCE(x, 1);     if (r2 = READ_ONCE(x)) {
			         <implicit control dependency>
			         WRITE_ONCE(y, 1);
			      }

	assert(r1 == 0 || r2 == 0);

Basically, the read barrier always has to be there, even though it can be of
the "weaker" type.

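As a minimal kernel-C sketch of such a pairing (the flag and data
variable names are made up for illustration), smp_wmb() on the producer
pairs with smp_rmb() on the consumer:

	int data;			/* hypothetical payload */
	int flag;			/* hypothetical ready flag */

	/* CPU 1: producer */
	void producer(void)
	{
		WRITE_ONCE(data, 42);
		smp_wmb();		/* pairs with smp_rmb() below */
		WRITE_ONCE(flag, 1);
	}

	/* CPU 2: consumer */
	int consumer(void)
	{
		if (READ_ONCE(flag)) {
			smp_rmb();	/* pairs with smp_wmb() above */
			return READ_ONCE(data);	/* guaranteed to see 42 */
		}
		return -1;
	}
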
[!] Note that the stores before the write barrier would normally be expected to
match the loads after the read barrier or the data dependency barrier, and vice
versa:

	CPU 1                               CPU 2
	===================                 ===================
	WRITE_ONCE(a, 1);    }----   --->{  v = READ_ONCE(c);
	WRITE_ONCE(b, 2);    }    \ /    {  w = READ_ONCE(d);
	<write barrier>            \        <read barrier>
	WRITE_ONCE(c, 3);    }    / \    {  x = READ_ONCE(a);
	WRITE_ONCE(d, 4);    }----   --->{  y = READ_ONCE(b);


EXAMPLES OF MEMORY BARRIER SEQUENCES
------------------------------------

Firstly, write barriers act as partial orderings on store operations.
Consider the following sequence of events:

	CPU 1
	=======================
	STORE A = 1
	STORE B = 2
	STORE C = 3
	<write barrier>
	STORE D = 4
	STORE E = 5

This sequence of events is committed to the memory coherence system in an order
that the rest of the system might perceive as the unordered set of { STORE A,
STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
}:

	+-------+       :      :
	|       |       +------+
	|       |------>| C=3  |     }     /\
	|       |  :    +------+     }-----  \  -----> Events perceptible to
	|       |  :    | A=1  |     }        \/       the rest of the system
	|       |  :    +------+     }
	| CPU 1 |  :    | B=2  |     }
	|       |       +------+     }
	|       |   wwwwwwwwwwwwwwww }   <--- At this point the write barrier
	|       |       +------+     }        requires all stores prior to the
	|       |  :    | E=5  |     }        barrier to be committed before
	|       |  :    +------+     }        further stores may take place
	|       |------>| D=4  |     }
	|       |       +------+
	+-------+       :      :
	                   |
	                   | Sequence in which stores are committed to the
	                   | memory system by CPU 1
	                   V

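Expressed as a kernel-C sketch (variable names as in the diagram above,
with smp_wmb() standing in for the write barrier):

	WRITE_ONCE(A, 1);
	WRITE_ONCE(B, 2);
	WRITE_ONCE(C, 3);
	smp_wmb();		/* { A, B, C } commit before { D, E } */
	WRITE_ONCE(D, 4);
	WRITE_ONCE(E, 5);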

Secondly, data dependency barriers act as partial orderings on data-dependent
loads.  Consider the following sequence of events:

	CPU 1			CPU 2
	=======================	=======================
		{ B = 7; X = 9; Y = 8; C = &Y }
	STORE A = 1
	STORE B = 2
	<write barrier>
	STORE C = &B		LOAD X
	STORE D = 4		LOAD C (gets &B)
				LOAD *C (reads B)

Without intervention, CPU 2 may perceive the events on CPU 1 in some
effectively random order, despite the write barrier issued by CPU 1:

	+-------+       :      :                :       :
	|       |       +------+                +-------+  | Sequence of update
	|       |------>| B=2  |-----       --->| Y->8  |  | of perception on
	|       |  :    +------+     \          +-------+  | CPU 2
	| CPU 1 |  :    | A=1  |      \     --->| C->&Y |  V
	|       |       +------+       |        +-------+
	|       |   wwwwwwwwwwwwwwww   |        :       :
	|       |       +------+       |        :       :
	|       |  :    | C=&B |---    |        :       :       +-------+
	|       |  :    +------+   \   |        +-------+       |       |
	|       |------>| D=4  |    ----------->| C->&B |------>|       |
	|       |       +------+       |        +-------+       |       |
	+-------+       :      :       |        :       :       |       |
	                               |        :       :       |       |
	                               |        :       :       | CPU 2 |
	                               |        +-------+       |       |
	    Apparently incorrect --->  |        | B->7  |------>|       |
	    perception of B (!)        |        +-------+       |       |
	                               |        :       :       |       |
	                               |        +-------+       |       |
	    The load of X holds --->    \       | X->9  |------>|       |
	    up the maintenance           \      +-------+       |       |
	    of coherence of B             ----->| B->2  |       +-------+
	                                        +-------+
	                                        :       :


In the above example, CPU 2 perceives that B is 7, despite the load of *C
(which would be B) coming after the LOAD of C.

If, however, a data dependency barrier were to be placed between the load of C
and the load of *C (ie: B) on CPU 2:

	CPU 1			CPU 2
	=======================	=======================
		{ B = 7; X = 9; Y = 8; C = &Y }
	STORE A = 1
	STORE B = 2
	<write barrier>
	STORE C = &B		LOAD X
	STORE D = 4		LOAD C (gets &B)
				<data dependency barrier>
				LOAD *C (reads B)

then the following will occur:

	+-------+       :      :                :       :
	|       |       +------+                +-------+
	|       |------>| B=2  |-----       --->| Y->8  |
	|       |  :    +------+     \          +-------+
	| CPU 1 |  :    | A=1  |      \     --->| C->&Y |
	|       |       +------+       |        +-------+
	|       |   wwwwwwwwwwwwwwww   |        :       :
	|       |       +------+       |        :       :
	|       |  :    | C=&B |---    |        :       :       +-------+
	|       |  :    +------+   \   |        +-------+       |       |
	|       |------>| D=4  |    ----------->| C->&B |------>|       |
	|       |       +------+       |        +-------+       |       |
	+-------+       :      :       |        :       :       |       |
	                               |        :       :       |       |
	                               |        :       :       | CPU 2 |
	                               |        +-------+       |       |
	                               |        | X->9  |------>|       |
	                               |        +-------+       |       |
	  Makes sure all effects --->   \   ddddddddddddddddd   |       |
	  prior to the store of C        \      +-------+       |       |
	  are perceptible to              ----->| B->2  |------>|       |
	  subsequent loads                      +-------+       |       |
	                                        :       :       +-------+

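In kernel C, this pattern is normally written with READ_ONCE() (or, for
RCU-protected pointers, rcu_dereference()), which on kernels of this
vintage supplies the needed address-dependency ordering; a sketch of
CPU 2's side, assuming C is a pointer to an int as in the diagram:

	int *x;
	int y;

	x = READ_ONCE(C);	/* dependency ordering for the load of *x */
	y = READ_ONCE(*x);	/* reads B; sees 2, not the stale 7 */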

And thirdly, a read barrier acts as a partial order on loads.  Consider the
following sequence of events:

	CPU 1			CPU 2
	=======================	=======================
		{ A = 0, B = 9 }
	STORE A=1
	<write barrier>
	STORE B=2
				LOAD B
				LOAD A

Without intervention, CPU 2 may then choose to perceive the events on CPU 1 in
some effectively random order, despite the write barrier issued by CPU 1:

	+-------+       :      :                :       :
	|       |       +------+                +-------+
	|       |------>| A=1  |------      --->| A->0  |
	|       |       +------+      \         +-------+
	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
	|       |       +------+        |       +-------+
	|       |------>| B=2  |---     |       :       :
	|       |       +------+   \    |       :       :       +-------+
	+-------+       :      :    \   |       +-------+       |       |
	                             ---------->| B->2  |------>|       |
	                                |       +-------+       | CPU 2 |
	                                |       | A->0  |------>|       |
	                                |       +-------+       |       |
	                                |       :       :       +-------+
	                                 \      :       :
	                                  \     +-------+
	                                   ---->| A->1  |
	                                        +-------+
	                                        :       :


If, however, a read barrier were to be placed between the load of B and the
load of A on CPU 2:

	CPU 1			CPU 2
	=======================	=======================
		{ A = 0, B = 9 }
	STORE A=1
	<write barrier>
	STORE B=2
				LOAD B
				<read barrier>
				LOAD A

then the partial ordering imposed by CPU 1 will be perceived correctly by CPU
2:

	+-------+       :      :                :       :
	|       |       +------+                +-------+
	|       |------>| A=1  |------      --->| A->0  |
	|       |       +------+      \         +-------+
	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
	|       |       +------+        |       +-------+
	|       |------>| B=2  |---     |       :       :
	|       |       +------+   \    |       :       :       +-------+
	+-------+       :      :    \   |       +-------+       |       |
	                             ---------->| B->2  |------>|       |
	                                |       +-------+       | CPU 2 |
	                                |       :       :       |       |
	                                |       :       :       |       |
	  At this point the read ---->   \  rrrrrrrrrrrrrrrrr   |       |
	  barrier causes all effects      \     +-------+       |       |
	  prior to the storage of B        ---->| A->1  |------>|       |
	  to be perceptible to CPU 2            +-------+       |       |
	                                        :       :       +-------+


To illustrate this more completely, consider what could happen if the code
contained a load of A either side of the read barrier:

	CPU 1			CPU 2
	=======================	=======================
		{ A = 0, B = 9 }
	STORE A=1
	<write barrier>
	STORE B=2
				LOAD B
				LOAD A [first load of A]
				<read barrier>
				LOAD A [second load of A]

Even though the two loads of A both occur after the load of B, they may
come up with different values:

	+-------+       :      :                :       :
	|       |       +------+                +-------+
	|       |------>| A=1  |------      --->| A->0  |
	|       |       +------+      \         +-------+
	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
	|       |       +------+        |       +-------+
	|       |------>| B=2  |---     |       :       :
	|       |       +------+   \    |       :       :       +-------+
	+-------+       :      :    \   |       +-------+       |       |
	                             ---------->| B->2  |------>|       |
	                                |       +-------+       | CPU 2 |
	                                |       :       :       |       |
	                                |       :       :       |       |
	                                |       +-------+       |       |
	                                |       | A->0  |------>| 1st   |
	                                |       +-------+       |       |
	  At this point the read ---->   \  rrrrrrrrrrrrrrrrr   |       |
	  barrier causes all effects      \     +-------+       |       |
	  prior to the storage of B        ---->| A->1  |------>| 2nd   |
	  to be perceptible to CPU 2            +-------+       |       |
	                                        :       :       +-------+


But it may be that the update to A from CPU 1 becomes perceptible to CPU 2
before the read barrier completes anyway:

	+-------+       :      :                :       :
	|       |       +------+                +-------+
	|       |------>| A=1  |------      --->| A->0  |
	|       |       +------+      \         +-------+
	| CPU 1 |   wwwwwwwwwwwwwwww   \    --->| B->9  |
	|       |       +------+        |       +-------+
	|       |------>| B=2  |---     |       :       :
	|       |       +------+   \    |       :       :       +-------+
	+-------+       :      :    \   |       +-------+       |       |
	                             ---------->| B->2  |------>|       |
	                                |       +-------+       | CPU 2 |
	                                |       :       :       |       |
	                                 \      :       :       |       |
	                                  \     +-------+       |       |
	                                   ---->| A->1  |------>| 1st   |
	                                        +-------+       |       |
	                                    rrrrrrrrrrrrrrrrr   |       |
	                                        +-------+       |       |
	                                        | A->1  |------>| 2nd   |
	                                        +-------+       |       |
	                                        :       :       +-------+


The guarantee is that the second load will always come up with A == 1 if the
load of B came up with B == 2.  No such guarantee exists for the first load of
A; that may come up with either A == 0 or A == 1.

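Sketched in kernel C (again with made-up variable names), this is the
guarantee being asserted:

	/* CPU 2 */
	int b, a1, a2;

	b  = READ_ONCE(B);
	a1 = READ_ONCE(A);	/* may see 0 or 1, barrier or not */
	smp_rmb();
	a2 = READ_ONCE(A);	/* must see 1 if b == 2 */

	BUG_ON(b == 2 && a2 != 1);	/* cannot fire */
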
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1253) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1254) READ MEMORY BARRIERS VS LOAD SPECULATION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1255) ----------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1256) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1257) Many CPUs speculate with loads: that is they see that they will need to load an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1258) item from memory, and they find a time where they're not using the bus for any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1259) other loads, and so do the load in advance - even though they haven't actually
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1260) got to that point in the instruction execution flow yet.  This permits the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1261) actual load instruction to potentially complete immediately because the CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1262) already has the value to hand.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1263) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1264) It may turn out that the CPU didn't actually need the value - perhaps because a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1265) branch circumvented the load - in which case it can discard the value or just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1266) cache it for later use.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1267) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1268) Consider:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1269) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1270) 	CPU 1			CPU 2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1271) 	=======================	=======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1272) 				LOAD B
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1273) 				DIVIDE		} Divide instructions generally
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1274) 				DIVIDE		} take a long time to perform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1275) 				LOAD A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1276) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1277) Which might appear as this:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1278) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1279) 	                                        :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1280) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1281) 	                                    --->| B->2  |------>|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1282) 	                                        +-------+       | CPU 2 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1283) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1284) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1285) 	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1286) 	division speculates on the              +-------+   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1287) 	LOAD of A                               :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1288) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1289) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1290) 	Once the divisions are complete -->     :       :   ~-->|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1291) 	the CPU can then perform the            :       :       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1292) 	LOAD with immediate effect              :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1293) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1294) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1295) Placing a read barrier or a data dependency barrier just before the second
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1296) load:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1297) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1298) 	CPU 1			CPU 2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1299) 	=======================	=======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1300) 				LOAD B
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1301) 				DIVIDE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1302) 				DIVIDE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1303) 				<read barrier>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1304) 				LOAD A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1305) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1306) will force any value speculatively obtained to be reconsidered to an extent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1307) dependent on the type of barrier used.  If there was no change made to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1308) speculated memory location, then the speculated value will just be used:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1309) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1310) 	                                        :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1311) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1312) 	                                    --->| B->2  |------>|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1313) 	                                        +-------+       | CPU 2 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1314) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1315) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1316) 	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1317) 	division speculates on the              +-------+   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1318) 	LOAD of A                               :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1319) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1320) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1321) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1322) 	                                    rrrrrrrrrrrrrrrr~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1323) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1324) 	                                        :       :   ~-->|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1325) 	                                        :       :       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1326) 	                                        :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1327) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1328) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1329) but if there was an update or an invalidation from another CPU pending, then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1330) the speculation will be cancelled and the value reloaded:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1331) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1332) 	                                        :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1333) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1334) 	                                    --->| B->2  |------>|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1335) 	                                        +-------+       | CPU 2 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1336) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1337) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1338) 	The CPU being busy doing a --->     --->| A->0  |~~~~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1339) 	division speculates on the              +-------+   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1340) 	LOAD of A                               :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1341) 	                                        :       :DIVIDE |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1342) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1343) 	                                        :       :   ~   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1344) 	                                    rrrrrrrrrrrrrrrrr   |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1345) 	                                        +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1346) 	The speculation is discarded --->   --->| A->1  |------>|       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1347) 	and an updated value is                 +-------+       |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1348) 	retrieved                               :       :       +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1349) 
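Expressed in kernel C, the sequence above corresponds to a sketch like the
following, where the divisions stand in for any long-latency work and the
variable names are purely illustrative:

	q = READ_ONCE(b);
	/* ... long-latency work, during which the CPU may
	   speculatively load 'a' ... */
	smp_rmb();		/* forces any speculatively-obtained value
				   of 'a' to be reconsidered */
	p = READ_ONCE(a);
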
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1350) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1351) MULTICOPY ATOMICITY
-------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1353) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1354) Multicopy atomicity is a deeply intuitive notion about ordering that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1355) not always provided by real computer systems, namely that a given store
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1356) becomes visible at the same time to all CPUs, or, alternatively, that all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1357) CPUs agree on the order in which all stores become visible.  However,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1358) support of full multicopy atomicity would rule out valuable hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1359) optimizations, so a weaker form called ``other multicopy atomicity''
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1360) instead guarantees only that a given store becomes visible at the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1361) time to all -other- CPUs.  The remainder of this document discusses this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1362) weaker form, but for brevity will call it simply ``multicopy atomicity''.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1363) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1364) The following example demonstrates multicopy atomicity:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1365) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1366) 	CPU 1			CPU 2			CPU 3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1367) 	=======================	=======================	=======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1368) 		{ X = 0, Y = 0 }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1369) 	STORE X=1		r1=LOAD X (reads 1)	LOAD Y (reads 1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1370) 				<general barrier>	<read barrier>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1371) 				STORE Y=r1		LOAD X
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1372) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1373) Suppose that CPU 2's load from X returns 1, which it then stores to Y,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1374) and CPU 3's load from Y returns 1.  This indicates that CPU 1's store
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1375) to X precedes CPU 2's load from X and that CPU 2's store to Y precedes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1376) CPU 3's load from Y.  In addition, the memory barriers guarantee that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1377) CPU 2 executes its load before its store, and CPU 3 loads from Y before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1378) it loads from X.  The question is then "Can CPU 3's load from X return 0?"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1379) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1380) Because CPU 3's load from X in some sense comes after CPU 2's load, it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1381) is natural to expect that CPU 3's load from X must therefore return 1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1382) This expectation follows from multicopy atomicity: if a load executing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1383) on CPU B follows a load from the same variable executing on CPU A (and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1384) CPU A did not originally store the value which it read), then on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1385) multicopy-atomic systems, CPU B's load must return either the same value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1386) that CPU A's load did or some later value.  However, the Linux kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1387) does not require systems to be multicopy atomic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1388) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1389) The use of a general memory barrier in the example above compensates
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1390) for any lack of multicopy atomicity.  In the example, if CPU 2's load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1391) from X returns 1 and CPU 3's load from Y returns 1, then CPU 3's load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1392) from X must indeed also return 1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1393) 
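In kernel C, the barrier-compensated test might be sketched as follows,
keeping this document's convention of leaving r1, r2, and r3 undeclared:

	int x, y;

	void cpu1(void)
	{
		WRITE_ONCE(x, 1);
	}

	void cpu2(void)
	{
		r1 = READ_ONCE(x);	/* assume this reads 1 */
		smp_mb();		/* the general barrier */
		WRITE_ONCE(y, r1);
	}

	void cpu3(void)
	{
		r2 = READ_ONCE(y);	/* assume this reads 1 */
		smp_rmb();		/* the read barrier */
		r3 = READ_ONCE(x);	/* ...then this must also read 1 */
	}
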
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1394) However, dependencies, read barriers, and write barriers are not always
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1395) able to compensate for non-multicopy atomicity.  For example, suppose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1396) that CPU 2's general barrier is removed from the above example, leaving
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1397) only the data dependency shown below:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1398) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1399) 	CPU 1			CPU 2			CPU 3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1400) 	=======================	=======================	=======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1401) 		{ X = 0, Y = 0 }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1402) 	STORE X=1		r1=LOAD X (reads 1)	LOAD Y (reads 1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1403) 				<data dependency>	<read barrier>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1404) 				STORE Y=r1		LOAD X (reads 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1405) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1406) This substitution allows non-multicopy atomicity to run rampant: in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1407) this example, it is perfectly legal for CPU 2's load from X to return 1,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1408) CPU 3's load from Y to return 1, and its load from X to return 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1409) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1410) The key point is that although CPU 2's data dependency orders its load
and store, it does not guarantee ordering against CPU 1's store.  Thus, if this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1412) example runs on a non-multicopy-atomic system where CPUs 1 and 2 share a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1413) store buffer or a level of cache, CPU 2 might have early access to CPU 1's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1414) writes.  General barriers are therefore required to ensure that all CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1415) agree on the combined order of multiple accesses.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1416) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1417) General barriers can compensate not only for non-multicopy atomicity,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1418) but can also generate additional ordering that can ensure that -all-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1419) CPUs will perceive the same order of -all- operations.  In contrast, a
chain of release-acquire pairs does not provide this additional ordering,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1421) which means that only those CPUs on the chain are guaranteed to agree
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1422) on the combined order of the accesses.  For example, switching to C code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1423) in deference to the ghost of Herman Hollerith:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1424) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1425) 	int u, v, x, y, z;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1426) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1427) 	void cpu0(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1428) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1429) 		r0 = smp_load_acquire(&x);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1430) 		WRITE_ONCE(u, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1431) 		smp_store_release(&y, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1432) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1433) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1434) 	void cpu1(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1435) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1436) 		r1 = smp_load_acquire(&y);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1437) 		r4 = READ_ONCE(v);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1438) 		r5 = READ_ONCE(u);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1439) 		smp_store_release(&z, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1440) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1441) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1442) 	void cpu2(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1443) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1444) 		r2 = smp_load_acquire(&z);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1445) 		smp_store_release(&x, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1446) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1447) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1448) 	void cpu3(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1449) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1450) 		WRITE_ONCE(v, 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1451) 		smp_mb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1452) 		r3 = READ_ONCE(u);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1453) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1454) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1455) Because cpu0(), cpu1(), and cpu2() participate in a chain of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1456) smp_store_release()/smp_load_acquire() pairs, the following outcome
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1457) is prohibited:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1458) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1459) 	r0 == 1 && r1 == 1 && r2 == 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1460) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1461) Furthermore, because of the release-acquire relationship between cpu0()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1462) and cpu1(), cpu1() must see cpu0()'s writes, so that the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1463) outcome is prohibited:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1464) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1465) 	r1 == 1 && r5 == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1466) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1467) However, the ordering provided by a release-acquire chain is local
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1468) to the CPUs participating in that chain and does not apply to cpu3(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1469) at least aside from stores.  Therefore, the following outcome is possible:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1470) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1471) 	r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1472) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1473) As an aside, the following outcome is also possible:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1474) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1475) 	r0 == 0 && r1 == 1 && r2 == 1 && r3 == 0 && r4 == 0 && r5 == 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1476) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1477) Although cpu0(), cpu1(), and cpu2() will see their respective reads and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1478) writes in order, CPUs not involved in the release-acquire chain might
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1479) well disagree on the order.  This disagreement stems from the fact that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1480) the weak memory-barrier instructions used to implement smp_load_acquire()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1481) and smp_store_release() are not required to order prior stores against
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1482) subsequent loads in all cases.  This means that cpu3() can see cpu0()'s
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1483) store to u as happening -after- cpu1()'s load from v, even though
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1484) both cpu0() and cpu1() agree that these two operations occurred in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1485) intended order.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1486) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1487) However, please keep in mind that smp_load_acquire() is not magic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1488) In particular, it simply reads from its argument with ordering.  It does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1489) -not- ensure that any particular value will be read.  Therefore, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1490) following outcome is possible:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1491) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1492) 	r0 == 0 && r1 == 0 && r2 == 0 && r5 == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1493) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1494) Note that this outcome can happen even on a mythical sequentially
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1495) consistent system where nothing is ever reordered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1496) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1497) To reiterate, if your code requires full ordering of all operations,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1498) use general barriers throughout.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1499) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1500) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1501) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1502) EXPLICIT KERNEL BARRIERS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1503) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1504) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1505) The Linux kernel has a variety of different barriers that act at different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1506) levels:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1507) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1508)   (*) Compiler barrier.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1509) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1510)   (*) CPU memory barriers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1511) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1512) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1513) COMPILER BARRIER
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1514) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1515) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1516) The Linux kernel has an explicit compiler barrier function that prevents the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1517) compiler from moving the memory accesses either side of it to the other side:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1518) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1519) 	barrier();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1520) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1521) This is a general barrier -- there are no read-read or write-write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1522) variants of barrier().  However, READ_ONCE() and WRITE_ONCE() can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1523) thought of as weak forms of barrier() that affect only the specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1524) accesses flagged by the READ_ONCE() or WRITE_ONCE().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1525) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1526) The barrier() function has the following effects:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1527) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1528)  (*) Prevents the compiler from reordering accesses following the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1529)      barrier() to precede any accesses preceding the barrier().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1530)      One example use for this property is to ease communication between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1531)      interrupt-handler code and the code that was interrupted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1532) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1533)  (*) Within a loop, forces the compiler to load the variables used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1534)      in that loop's conditional on each pass through that loop.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1535) 
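For example, the following minimal sketch relies on the second property;
'need_to_stop' is a hypothetical flag set from some other context, and in
new code READ_ONCE() would be the more idiomatic way to get this effect:

	while (!need_to_stop)
		barrier();	/* force need_to_stop to be re-loaded
				   from memory on each pass */
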
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1536) The READ_ONCE() and WRITE_ONCE() functions can prevent any number of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1537) optimizations that, while perfectly safe in single-threaded code, can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1538) be fatal in concurrent code.  Here are some examples of these sorts
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1539) of optimizations:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1540) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1541)  (*) The compiler is within its rights to reorder loads and stores
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1542)      to the same variable, and in some cases, the CPU is within its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1543)      rights to reorder loads to the same variable.  This means that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1544)      the following code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1545) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1546) 	a[0] = x;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1547) 	a[1] = x;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1548) 
     might result in an older value of x stored in a[1] than in a[0].
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1550)      Prevent both the compiler and the CPU from doing this as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1551) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1552) 	a[0] = READ_ONCE(x);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1553) 	a[1] = READ_ONCE(x);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1554) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1555)      In short, READ_ONCE() and WRITE_ONCE() provide cache coherence for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1556)      accesses from multiple CPUs to a single variable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1557) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1558)  (*) The compiler is within its rights to merge successive loads from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1559)      the same variable.  Such merging can cause the compiler to "optimize"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1560)      the following code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1561) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1562) 	while (tmp = a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1563) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1564) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1565)      into the following code, which, although in some sense legitimate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1566)      for single-threaded code, is almost certainly not what the developer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1567)      intended:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1568) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1569) 	if (tmp = a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1570) 		for (;;)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1571) 			do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1572) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1573)      Use READ_ONCE() to prevent the compiler from doing this to you:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1574) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1575) 	while (tmp = READ_ONCE(a))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1576) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1577) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1578)  (*) The compiler is within its rights to reload a variable, for example,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1579)      in cases where high register pressure prevents the compiler from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1580)      keeping all data of interest in registers.  The compiler might
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1581)      therefore optimize the variable 'tmp' out of our previous example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1582) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1583) 	while (tmp = a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1584) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1585) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1586)      This could result in the following code, which is perfectly safe in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1587)      single-threaded code, but can be fatal in concurrent code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1588) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1589) 	while (a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1590) 		do_something_with(a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1591) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1592)      For example, the optimized version of this code could result in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1593)      passing a zero to do_something_with() in the case where the variable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1594)      a was modified by some other CPU between the "while" statement and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1595)      the call to do_something_with().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1596) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1597)      Again, use READ_ONCE() to prevent the compiler from doing this:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1598) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1599) 	while (tmp = READ_ONCE(a))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1600) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1601) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1602)      Note that if the compiler runs short of registers, it might save
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1603)      tmp onto the stack.  The overhead of this saving and later restoring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1604)      is why compilers reload variables.  Doing so is perfectly safe for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1605)      single-threaded code, so you need to tell the compiler about cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1606)      where it is not safe.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1607) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1608)  (*) The compiler is within its rights to omit a load entirely if it knows
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1609)      what the value will be.  For example, if the compiler can prove that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1610)      the value of variable 'a' is always zero, it can optimize this code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1611) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1612) 	while (tmp = a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1613) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1614) 
     into this:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1616) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1617) 	do { } while (0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1618) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1619)      This transformation is a win for single-threaded code because it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1620)      gets rid of a load and a branch.  The problem is that the compiler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1621)      will carry out its proof assuming that the current CPU is the only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1622)      one updating variable 'a'.  If variable 'a' is shared, then the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1623)      compiler's proof will be erroneous.  Use READ_ONCE() to tell the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1624)      compiler that it doesn't know as much as it thinks it does:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1625) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1626) 	while (tmp = READ_ONCE(a))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1627) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1628) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1629)      But please note that the compiler is also closely watching what you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1630)      do with the value after the READ_ONCE().  For example, suppose you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1631)      do the following and MAX is a preprocessor macro with the value 1:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1632) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1633) 	while ((tmp = READ_ONCE(a)) % MAX)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1634) 		do_something_with(tmp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1635) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1636)      Then the compiler knows that the result of the "%" operator applied
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1637)      to MAX will always be zero, again allowing the compiler to optimize
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1638)      the code into near-nonexistence.  (It will still load from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1639)      variable 'a'.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1640) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1641)  (*) Similarly, the compiler is within its rights to omit a store entirely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1642)      if it knows that the variable already has the value being stored.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1643)      Again, the compiler assumes that the current CPU is the only one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1644)      storing into the variable, which can cause the compiler to do the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1645)      wrong thing for shared variables.  For example, suppose you have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1646)      the following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1647) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1648) 	a = 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1649) 	... Code that does not store to variable a ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1650) 	a = 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1651) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1652)      The compiler sees that the value of variable 'a' is already zero, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1653)      it might well omit the second store.  This would come as a fatal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1654)      surprise if some other CPU might have stored to variable 'a' in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1655)      meantime.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1656) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1657)      Use WRITE_ONCE() to prevent the compiler from making this sort of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1658)      wrong guess:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1659) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1660) 	WRITE_ONCE(a, 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1661) 	... Code that does not store to variable a ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1662) 	WRITE_ONCE(a, 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1663) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1664)  (*) The compiler is within its rights to reorder memory accesses unless
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1665)      you tell it not to.  For example, consider the following interaction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1666)      between process-level code and an interrupt handler:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1667) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1668) 	void process_level(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1669) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1670) 		msg = get_message();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1671) 		flag = true;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1672) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1673) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1674) 	void interrupt_handler(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1675) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1676) 		if (flag)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1677) 			process_message(msg);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1678) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1679) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1680)      There is nothing to prevent the compiler from transforming
     process_level() to the following; in fact, this might well be a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1682)      win for single-threaded code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1683) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1684) 	void process_level(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1685) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1686) 		flag = true;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1687) 		msg = get_message();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1688) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1689) 
     If the interrupt occurs between these two statements, then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1691)      interrupt_handler() might be passed a garbled msg.  Use WRITE_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1692)      to prevent this as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1693) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1694) 	void process_level(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1695) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1696) 		WRITE_ONCE(msg, get_message());
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1697) 		WRITE_ONCE(flag, true);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1698) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1699) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1700) 	void interrupt_handler(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1701) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1702) 		if (READ_ONCE(flag))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1703) 			process_message(READ_ONCE(msg));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1704) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1705) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1706)      Note that the READ_ONCE() and WRITE_ONCE() wrappers in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1707)      interrupt_handler() are needed if this interrupt handler can itself
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1708)      be interrupted by something that also accesses 'flag' and 'msg',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1709)      for example, a nested interrupt or an NMI.  Otherwise, READ_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1710)      and WRITE_ONCE() are not needed in interrupt_handler() other than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1711)      for documentation purposes.  (Note also that nested interrupts
     do not typically occur in modern Linux kernels; in fact, if an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1713)      interrupt handler returns with interrupts enabled, you will get a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1714)      WARN_ONCE() splat.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1715) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1716)      You should assume that the compiler can move READ_ONCE() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1717)      WRITE_ONCE() past code not containing READ_ONCE(), WRITE_ONCE(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1718)      barrier(), or similar primitives.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1719) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1720)      This effect could also be achieved using barrier(), but READ_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1721)      and WRITE_ONCE() are more selective:  With READ_ONCE() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1722)      WRITE_ONCE(), the compiler need only forget the contents of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1723)      indicated memory locations, while with barrier() the compiler must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1724)      discard the value of all memory locations that it has currently
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1725)      cached in any machine registers.  Of course, the compiler must also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1726)      respect the order in which the READ_ONCE()s and WRITE_ONCE()s occur,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1727)      though the CPU of course need not do so.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1728) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1729)  (*) The compiler is within its rights to invent stores to a variable,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1730)      as in the following example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1731) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1732) 	if (a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1733) 		b = a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1734) 	else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1735) 		b = 42;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1736) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1737)      The compiler might save a branch by optimizing this as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1738) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1739) 	b = 42;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1740) 	if (a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1741) 		b = a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1742) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1743)      In single-threaded code, this is not only safe, but also saves
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1744)      a branch.  Unfortunately, in concurrent code, this optimization
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1745)      could cause some other CPU to see a spurious value of 42 -- even
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1746)      if variable 'a' was never zero -- when loading variable 'b'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1747)      Use WRITE_ONCE() to prevent this as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1748) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1749) 	if (a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1750) 		WRITE_ONCE(b, a);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1751) 	else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1752) 		WRITE_ONCE(b, 42);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1753) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1754)      The compiler can also invent loads.  These are usually less
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1755)      damaging, but they can result in cache-line bouncing and thus in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1756)      poor performance and scalability.  Use READ_ONCE() to prevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1757)      invented loads.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1758) 
 (*) For aligned memory locations whose size allows them to be accessed
     with a single memory-reference instruction, READ_ONCE() and
     WRITE_ONCE() prevent "load tearing" and "store tearing," in which a
     single large access is replaced by multiple smaller accesses.  For
     example, given an architecture having
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1763)      16-bit store instructions with 7-bit immediate fields, the compiler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1764)      might be tempted to use two 16-bit store-immediate instructions to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1765)      implement the following 32-bit store:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1766) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1767) 	p = 0x00010002;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1768) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1769)      Please note that GCC really does use this sort of optimization,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1770)      which is not surprising given that it would likely take more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1771)      than two instructions to build the constant and then store it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1772)      This optimization can therefore be a win in single-threaded code.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1773)      In fact, a recent bug (since fixed) caused GCC to incorrectly use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1774)      this optimization in a volatile store.  In the absence of such bugs,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1775)      use of WRITE_ONCE() prevents store tearing in the following example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1776) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1777) 	WRITE_ONCE(p, 0x00010002);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1778) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1779)      Use of packed structures can also result in load and store tearing,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1780)      as in this example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1781) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1782) 	struct __attribute__((__packed__)) foo {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1783) 		short a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1784) 		int b;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1785) 		short c;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1786) 	};
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1787) 	struct foo foo1, foo2;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1788) 	...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1789) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1790) 	foo2.a = foo1.a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1791) 	foo2.b = foo1.b;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1792) 	foo2.c = foo1.c;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1793) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1794)      Because there are no READ_ONCE() or WRITE_ONCE() wrappers and no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1795)      volatile markings, the compiler would be well within its rights to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1796)      implement these three assignment statements as a pair of 32-bit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1797)      loads followed by a pair of 32-bit stores.  This would result in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1798)      load tearing on 'foo1.b' and store tearing on 'foo2.b'.  READ_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1799)      and WRITE_ONCE() again prevent tearing in this example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1800) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1801) 	foo2.a = foo1.a;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1802) 	WRITE_ONCE(foo2.b, READ_ONCE(foo1.b));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1803) 	foo2.c = foo1.c;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1804) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1805) All that aside, it is never necessary to use READ_ONCE() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1806) WRITE_ONCE() on a variable that has been marked volatile.  For example,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1807) because 'jiffies' is marked volatile, it is never necessary to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1808) say READ_ONCE(jiffies).  The reason for this is that READ_ONCE() and
WRITE_ONCE() are implemented as volatile casts, which have no effect when
their argument is already marked volatile.
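
For reference, a simplified sketch of the idea behind these macros -- not
the kernel's exact definitions, which handle additional cases -- is a cast
to a volatile-qualified lvalue:

	#define READ_ONCE_SKETCH(x)	(*(const volatile typeof(x) *)&(x))
	#define WRITE_ONCE_SKETCH(x, v)	(*(volatile typeof(x) *)&(x) = (v))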
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1811) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1812) Please note that these compiler barriers have no direct effect on the CPU,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1813) which may then reorder things however it wishes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1814) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1815) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1816) CPU MEMORY BARRIERS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1817) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1818) 
The Linux kernel has the following basic CPU memory barriers:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1820) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1821) 	TYPE		MANDATORY		SMP CONDITIONAL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1822) 	===============	=======================	===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1823) 	GENERAL		mb()			smp_mb()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1824) 	WRITE		wmb()			smp_wmb()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1825) 	READ		rmb()			smp_rmb()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1826) 	DATA DEPENDENCY				READ_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1827) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1828) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1829) All memory barriers except the data dependency barriers imply a compiler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1830) barrier.  Data dependencies do not impose any additional compiler ordering.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1831) 
Aside: In the case of data dependencies, the compiler would be expected
to issue the loads in the correct order (eg. a[b] would have to load
the value of b before loading a[b]); however, the C specification offers
no guarantee that the compiler will not speculate the value of b (eg.
guess that it is equal to 1) and load a[b] before b (eg. tmp = a[1];
if (b != 1) tmp = a[b]; ).  There is also the problem of the compiler
reloading b after having loaded a[b], thus ending up with a newer copy
of b than of a[b].  A consensus has not yet been reached about these
problems; however, the READ_ONCE() macro is a good place to start
looking.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1841) 
SMP memory barriers are reduced to compiler barriers on uniprocessor-compiled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1843) systems because it is assumed that a CPU will appear to be self-consistent,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1844) and will order overlapping accesses correctly with respect to itself.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1845) However, see the subsection on "Virtual Machine Guests" below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1846) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1847) [!] Note that SMP memory barriers _must_ be used to control the ordering of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1848) references to shared memory on SMP systems, though the use of locking instead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1849) is sufficient.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1850) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1851) Mandatory barriers should not be used to control SMP effects, since mandatory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1852) barriers impose unnecessary overhead on both SMP and UP systems. They may,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1853) however, be used to control MMIO effects on accesses through relaxed memory I/O
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1854) windows.  These barriers are required even on non-SMP systems as they affect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1855) the order in which memory operations appear to a device by prohibiting both the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1856) compiler and the CPU from reordering them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1857) 
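As a reminder of how the SMP conditional barriers in the table above pair
up, here is a minimal producer/consumer sketch; 'buf' and 'flag' are
hypothetical shared variables:

	/* CPU 1: producer */
	WRITE_ONCE(buf, new_data);
	smp_wmb();		/* order the data store before the flag store */
	WRITE_ONCE(flag, 1);

	/* CPU 2: consumer */
	if (READ_ONCE(flag)) {
		smp_rmb();	/* order the flag load before the data load */
		data = READ_ONCE(buf);
	}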
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1858) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1859) There are some more advanced barrier functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1860) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1861)  (*) smp_store_mb(var, value)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1862) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1863)      This assigns the value to the variable and then inserts a full memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1864)      barrier after it.  It isn't guaranteed to insert anything more than a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1865)      compiler barrier in a UP compilation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1866) 
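     Conceptually, smp_store_mb(var, value) behaves like the sketch below;
     the generic kernel definition takes this form, though an architecture
     may override it with something cheaper:

	WRITE_ONCE(var, value);
	smp_mb();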
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1867) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1868)  (*) smp_mb__before_atomic();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1869)  (*) smp_mb__after_atomic();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1870) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1871)      These are for use with atomic RMW functions that do not imply memory
     barriers, but where the code needs a memory barrier.  Examples of
     atomic RMW functions that do not imply a memory barrier are, e.g.,
     add, subtract, (failed) conditional operations, and the _relaxed
     functions, but not atomic_read or atomic_set.  A common example
     where a memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1876)      barrier may be required is when atomic ops are used for reference
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1877)      counting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1878) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1879)      These are also used for atomic RMW bitop functions that do not imply a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1880)      memory barrier (such as set_bit and clear_bit).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1881) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1882)      As an example, consider a piece of code that marks an object as being dead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1883)      and then decrements the object's reference count:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1884) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1885) 	obj->dead = 1;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1886) 	smp_mb__before_atomic();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1887) 	atomic_dec(&obj->ref_count);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1888) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1889)      This makes sure that the death mark on the object is perceived to be set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1890)      *before* the reference counter is decremented.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1891) 
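     smp_mb__after_atomic() is used in the same way on the other side of
     an atomic operation.  In the following hypothetical sketch, the
     bit-clear must be perceived before the check for waiters:

	clear_bit(OBJ_BUSY, &obj->flags);
	smp_mb__after_atomic();
	waiters = READ_ONCE(obj->waiters);
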
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1892)      See Documentation/atomic_{t,bitops}.txt for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1893) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1894) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1895)  (*) dma_wmb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1896)  (*) dma_rmb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1897) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1898)      These are for use with consistent memory to guarantee the ordering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1899)      of writes or reads of shared memory accessible to both the CPU and a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1900)      DMA capable device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1901) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1902)      For example, consider a device driver that shares memory with a device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1903)      and uses a descriptor status value to indicate if the descriptor belongs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1904)      to the device or the CPU, and a doorbell to notify it when new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1905)      descriptors are available:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1906) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1907) 	if (desc->status != DEVICE_OWN) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1908) 		/* do not read data until we own descriptor */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1909) 		dma_rmb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1910) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1911) 		/* read/modify data */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1912) 		read_data = desc->data;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1913) 		desc->data = write_data;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1914) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1915) 		/* flush modifications before status update */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1916) 		dma_wmb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1917) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1918) 		/* assign ownership */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1919) 		desc->status = DEVICE_OWN;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1920) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1921) 		/* notify device of new descriptors */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1922) 		writel(DESC_NOTIFY, doorbell);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1923) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1924) 
     The dma_rmb() allows us to guarantee that the device has released ownership
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1926)      before we read the data from the descriptor, and the dma_wmb() allows
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1927)      us to guarantee the data is written to the descriptor before the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1928)      can see it now has ownership.  Note that, when using writel(), a prior
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1929)      wmb() is not needed to guarantee that the cache coherent memory writes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1930)      have completed before writing to the MMIO region.  The cheaper
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1931)      writel_relaxed() does not provide this guarantee and must not be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1932)      here.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1933) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1934)      See the subsection "Kernel I/O barrier effects" for more information on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1935)      relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1936)      more information on consistent memory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1937) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1938)  (*) pmem_wmb();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1939) 
     This is for use with persistent memory to ensure that stores whose
     modifications are written to persistent storage have reached a
     platform durability domain.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1943) 
     For example, after a non-temporal write to a pmem region, we use pmem_wmb()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1945)      to ensure that stores have reached a platform durability domain. This ensures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1946)      that stores have updated persistent storage before any data access or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1947)      data transfer caused by subsequent instructions is initiated. This is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1948)      in addition to the ordering done by wmb().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1949) 
     For loads from persistent memory, existing read memory barriers are sufficient
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1951)      to ensure read ordering.
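
     As a hypothetical sketch, where memcpy_flushcache() performs the
     non-temporal copy into a pmem mapping and 'idx->committed' is an
     illustrative on-media flag:

	memcpy_flushcache(pmem_dst, src, len);
	pmem_wmb();		/* stores are now in the durability domain */
	WRITE_ONCE(idx->committed, 1);	/* safe to advertise the data */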
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1952) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1953) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1954) IMPLICIT KERNEL MEMORY BARRIERS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1955) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1956) 
Some of the other functions in the Linux kernel imply memory barriers, amongst
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1958) which are locking and scheduling functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1959) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1960) This specification is a _minimum_ guarantee; any particular architecture may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1961) provide more substantial guarantees, but these may not be relied upon outside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1962) of arch specific code.


LOCK ACQUISITION FUNCTIONS
--------------------------

The Linux kernel has a number of locking constructs:

 (*) spin locks
 (*) R/W spin locks
 (*) mutexes
 (*) semaphores
 (*) R/W semaphores

In all cases there are variants on "ACQUIRE" operations and "RELEASE" operations
for each construct.  These operations all imply certain barriers:

 (1) ACQUIRE operation implication:

     Memory operations issued after the ACQUIRE will be completed after the
     ACQUIRE operation has completed.

     Memory operations issued before the ACQUIRE may be completed after
     the ACQUIRE operation has completed.

 (2) RELEASE operation implication:

     Memory operations issued before the RELEASE will be completed before the
     RELEASE operation has completed.

     Memory operations issued after the RELEASE may be completed before the
     RELEASE operation has completed.

 (3) ACQUIRE vs ACQUIRE implication:

     All ACQUIRE operations issued before another ACQUIRE operation will be
     completed before that ACQUIRE operation.

 (4) ACQUIRE vs RELEASE implication:

     All ACQUIRE operations issued before a RELEASE operation will be
     completed before the RELEASE operation.

 (5) Failed conditional ACQUIRE implication:

     Certain locking variants of the ACQUIRE operation may fail, either due to
     being unable to get the lock immediately, or due to receiving an unblocked
     signal while asleep waiting for the lock to become available.  Failed
     locks do not imply any sort of barrier.

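As an illustration only (not taken from the locking code itself), these
implications map onto a spinlock critical section roughly as follows, where
my_lock and shared_data are hypothetical:

	spin_lock(&my_lock);		/* ACQUIRE: accesses below may not
					 * move above this point */
	WRITE_ONCE(shared_data, 1);	/* completes after the ACQUIRE has
					 * completed */
	spin_unlock(&my_lock);		/* RELEASE: accesses above may not
					 * move below this point */

	if (!spin_trylock(&my_lock))
		return;			/* failed ACQUIRE: no barrier implied */
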
[!] Note: one of the consequences of lock ACQUIREs and RELEASEs being only
one-way barriers is that the effects of instructions outside of a critical
section may seep into the inside of the critical section.

An ACQUIRE followed by a RELEASE may not be assumed to be a full memory barrier
because it is possible for an access preceding the ACQUIRE to happen after the
ACQUIRE, and an access following the RELEASE to happen before the RELEASE, and
the two accesses can themselves then cross:

	*A = a;
	ACQUIRE M
	RELEASE M
	*B = b;

may occur as:

	ACQUIRE M, STORE *B, STORE *A, RELEASE M

When the ACQUIRE and RELEASE are a lock acquisition and release,
respectively, this same reordering can occur if the lock's ACQUIRE and
RELEASE are to the same lock variable, but only from the perspective of
another CPU not holding that lock.  In short, an ACQUIRE followed by a
RELEASE may -not- be assumed to be a full memory barrier.

Similarly, the reverse case of a RELEASE followed by an ACQUIRE does
not imply a full memory barrier.  Therefore, the CPU's execution of the
critical sections corresponding to the RELEASE and the ACQUIRE can cross,
so that:

	*A = a;
	RELEASE M
	ACQUIRE N
	*B = b;

could occur as:

	ACQUIRE N, STORE *B, STORE *A, RELEASE M

It might appear that this reordering could introduce a deadlock.
However, this cannot happen because if such a deadlock threatened,
the RELEASE would simply complete, thereby avoiding the deadlock.

	Why does this work?

	One key point is that we are only talking about the CPU doing
	the reordering, not the compiler.  If the compiler (or, for
	that matter, the developer) switched the operations, deadlock
	-could- occur.

	But suppose the CPU reordered the operations.  In this case,
	the unlock precedes the lock in the assembly code.  The CPU
	simply elected to try executing the later lock operation first.
	If there is a deadlock, this lock operation will simply spin (or
	try to sleep, but more on that later).  The CPU will eventually
	execute the unlock operation (which preceded the lock operation
	in the assembly code), which will unravel the potential deadlock,
	allowing the lock operation to succeed.

	But what if the lock is a sleeplock?  In that case, the code will
	try to enter the scheduler, where it will eventually encounter
	a memory barrier, which will force the earlier unlock operation
	to complete, again unraveling the deadlock.  There might be
	a sleep-unlock race, but the locking primitive needs to resolve
	such races properly in any case.

Locks and semaphores may not provide any guarantee of ordering on UP compiled
systems, and so cannot be counted on in such a situation to actually achieve
anything at all - especially with respect to I/O accesses - unless combined
with interrupt disabling operations.

See also the section on "Inter-CPU acquiring barrier effects".


As an example, consider the following:

	*A = a;
	*B = b;
	ACQUIRE
	*C = c;
	*D = d;
	RELEASE
	*E = e;
	*F = f;

The following sequence of events is acceptable:

	ACQUIRE, {*F,*A}, *E, {*C,*D}, *B, RELEASE

	[+] Note that {*F,*A} indicates a combined access.

But none of the following are:

	{*F,*A}, *B,	ACQUIRE, *C, *D,	RELEASE, *E
	*A, *B, *C,	ACQUIRE, *D,		RELEASE, *E, *F
	*A, *B,		ACQUIRE, *C,		RELEASE, *D, *E, *F
	*B,		ACQUIRE, *C, *D,	RELEASE, {*F,*A}, *E


INTERRUPT DISABLING FUNCTIONS
-----------------------------

Functions that disable interrupts (ACQUIRE equivalent) and enable interrupts
(RELEASE equivalent) will act as compiler barriers only.  So if memory or I/O
barriers are required in such a situation, they must be provided by some
other means.

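For instance, a driver that needs its MMIO writes ordered cannot rely on the
IRQ functions for that ordering.  A minimal sketch, assuming hypothetical DATA
and ADDR registers mapped with relaxed attributes:

	local_irq_save(flags);		/* compiler barrier only */
	writel_relaxed(y, DATA);	/* relaxed MMIO write: may be reordered */
	wmb();				/* ordering must come from other means */
	writel_relaxed(3, ADDR);
	local_irq_restore(flags);	/* compiler barrier only */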

SLEEP AND WAKE-UP FUNCTIONS
---------------------------

Sleeping and waking on an event flagged in global data can be viewed as an
interaction between two pieces of data: the task state of the task waiting for
the event and the global data used to indicate the event.  To make sure that
these appear to happen in the right order, the primitives to begin the process
of going to sleep and the primitives to initiate a wake up imply certain
barriers.

Firstly, the sleeper normally follows something like this sequence of events:

	for (;;) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		if (event_indicated)
			break;
		schedule();
	}

A general memory barrier is interpolated automatically by set_current_state()
after it has altered the task state:

	CPU 1
	===============================
	set_current_state();
	  smp_store_mb();
	    STORE current->state
	    <general barrier>
	LOAD event_indicated

set_current_state() may be wrapped by:

	prepare_to_wait();
	prepare_to_wait_exclusive();

which therefore also imply a general memory barrier after setting the state.
The whole sequence above is available in various canned forms, all of which
interpolate the memory barrier in the right place:

	wait_event();
	wait_event_interruptible();
	wait_event_interruptible_exclusive();
	wait_event_interruptible_timeout();
	wait_event_killable();
	wait_event_timeout();
	wait_on_bit();
	wait_on_bit_lock();

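For example, the whole sleep loop shown at the start of this section can be
replaced by a single call (illustrative; event_wait_queue and event_indicated
are the variables from the surrounding snippets):

	/* sleeps until event_indicated becomes true, with the barrier
	 * placed correctly after each task-state change */
	wait_event(event_wait_queue, event_indicated);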

Secondly, code that performs a wake up normally follows something like this:

	event_indicated = 1;
	wake_up(&event_wait_queue);

or:

	event_indicated = 1;
	wake_up_process(event_daemon);

A general memory barrier is executed by wake_up() if it wakes something up.
If it doesn't wake anything up then a memory barrier may or may not be
executed; you must not rely on it.  The barrier occurs before the task state
is accessed; in particular, it sits between the STORE to indicate the event
and the STORE to set TASK_RUNNING:

	CPU 1 (Sleeper)			CPU 2 (Waker)
	===============================	===============================
	set_current_state();		STORE event_indicated
	  smp_store_mb();		wake_up();
	    STORE current->state	  ...
	    <general barrier>		  <general barrier>
	LOAD event_indicated		  if ((LOAD task->state) & TASK_NORMAL)
					    STORE task->state

where "task" is the thread being woken up and it equals CPU 1's "current".

To repeat, a general memory barrier is guaranteed to be executed by wake_up()
if something is actually awakened, but otherwise there is no such guarantee.
To see this, consider the following sequence of events, where X and Y are both
initially zero:

	CPU 1				CPU 2
	===============================	===============================
	X = 1;				Y = 1;
	smp_mb();			wake_up();
	LOAD Y				LOAD X

If a wakeup does occur, at least one of the two loads must see 1.  If, on
the other hand, a wakeup does not occur, both loads might see 0.

wake_up_process() always executes a general memory barrier.  The barrier again
occurs before the task state is accessed.  In particular, if the wake_up() in
the previous snippet were replaced by a call to wake_up_process() then one of
the two loads would be guaranteed to see 1.

The available waker functions include:

	complete();
	wake_up();
	wake_up_all();
	wake_up_bit();
	wake_up_interruptible();
	wake_up_interruptible_all();
	wake_up_interruptible_nr();
	wake_up_interruptible_poll();
	wake_up_interruptible_sync();
	wake_up_interruptible_sync_poll();
	wake_up_locked();
	wake_up_locked_poll();
	wake_up_nr();
	wake_up_poll();
	wake_up_process();

In terms of memory ordering, these functions all provide the same guarantees
as a wake_up() (or stronger).

[!] Note that the memory barriers implied by the sleeper and the waker do _not_
order multiple stores before the wake-up with respect to loads of those stored
values after the sleeper has called set_current_state().  For instance, if the
sleeper does:

	set_current_state(TASK_INTERRUPTIBLE);
	if (event_indicated)
		break;
	__set_current_state(TASK_RUNNING);
	do_something(my_data);

and the waker does:

	my_data = value;
	event_indicated = 1;
	wake_up(&event_wait_queue);

there's no guarantee that the change to event_indicated will be perceived by
the sleeper as coming after the change to my_data.  In such a circumstance, the
code on both sides must interpolate its own memory barriers between the
separate data accesses.  Thus the above sleeper ought to do:

	set_current_state(TASK_INTERRUPTIBLE);
	if (event_indicated) {
		smp_rmb();
		do_something(my_data);
	}

and the waker should do:

	my_data = value;
	smp_wmb();
	event_indicated = 1;
	wake_up(&event_wait_queue);


MISCELLANEOUS FUNCTIONS
-----------------------

Other functions that imply barriers:

 (*) schedule() and similar imply full memory barriers.


===================================
INTER-CPU ACQUIRING BARRIER EFFECTS
===================================

On SMP systems locking primitives give a more substantial form of barrier: one
that does affect memory access ordering on other CPUs, within the context of
conflict on any particular lock.


ACQUIRES VS MEMORY ACCESSES
---------------------------

Consider the following: the system has a pair of spinlocks (M) and (Q), and
three CPUs; then should the following sequence of events occur:

	CPU 1				CPU 2
	===============================	===============================
	WRITE_ONCE(*A, a);		WRITE_ONCE(*E, e);
	ACQUIRE M			ACQUIRE Q
	WRITE_ONCE(*B, b);		WRITE_ONCE(*F, f);
	WRITE_ONCE(*C, c);		WRITE_ONCE(*G, g);
	RELEASE M			RELEASE Q
	WRITE_ONCE(*D, d);		WRITE_ONCE(*H, h);

Then there is no guarantee as to what order CPU 3 will see the accesses to *A
through *H occur in, other than the constraints imposed by the separate locks
on the separate CPUs.  It might, for example, see:

	*E, ACQUIRE M, ACQUIRE Q, *G, *C, *F, *A, *B, RELEASE Q, *D, *H, RELEASE M

But it won't see any of:

	*B, *C or *D preceding ACQUIRE M
	*A, *B or *C following RELEASE M
	*F, *G or *H preceding ACQUIRE Q
	*E, *F or *G following RELEASE Q


=================================
WHERE ARE MEMORY BARRIERS NEEDED?
=================================

Under normal operation, memory operation reordering is generally not going to
be a problem as a single-threaded linear piece of code will still appear to
work correctly, even if it's in an SMP kernel.  There are, however, four
circumstances in which reordering definitely _could_ be a problem:

 (*) Interprocessor interaction.

 (*) Atomic operations.

 (*) Accessing devices.

 (*) Interrupts.


INTERPROCESSOR INTERACTION
--------------------------

When there's a system with more than one processor, more than one CPU in the
system may be working on the same data set at the same time.  This can cause
synchronisation problems, and the usual way of dealing with them is to use
locks.  Locks, however, are quite expensive, and so it may be preferable to
operate without the use of a lock if at all possible.  In such a case
operations that affect both CPUs may have to be carefully ordered to prevent
a malfunction.

Consider, for example, the R/W semaphore slow path.  Here a waiting process is
queued on the semaphore, by virtue of it having a piece of its stack linked to
the semaphore's list of waiting processes:

	struct rw_semaphore {
		...
		spinlock_t lock;
		struct list_head waiters;
	};

	struct rwsem_waiter {
		struct list_head list;
		struct task_struct *task;
	};

To wake up a particular waiter, the up_read() or up_write() functions have to:

 (1) read the next pointer from this waiter's record to know where the
     next waiter record is;

 (2) read the pointer to the waiter's task structure;

 (3) clear the task pointer to tell the waiter it has been given the semaphore;

 (4) call wake_up_process() on the task; and

 (5) release the reference held on the waiter's task struct.

In other words, it has to perform this sequence of events:

	LOAD waiter->list.next;
	LOAD waiter->task;
	STORE waiter->task;
	CALL wakeup
	RELEASE task

and if any of these steps occur out of order, then the whole thing may
malfunction.

Once it has queued itself and dropped the semaphore lock, the waiter does not
get the lock again; it instead just waits for its task pointer to be cleared
before proceeding.  Since the record is on the waiter's stack, this means that
if the task pointer is cleared _before_ the next pointer in the list is read,
another CPU might start processing the waiter and might clobber the waiter's
stack before the up*() function has a chance to read the next pointer.

Consider then what might happen to the above sequence of events:

	CPU 1				CPU 2
	===============================	===============================
					down_xxx()
					Queue waiter
					Sleep
	up_yyy()
	LOAD waiter->task;
	STORE waiter->task;
					Woken up by other event
	<preempt>
					Resume processing
					down_xxx() returns
					call foo()
					foo() clobbers *waiter
	</preempt>
	LOAD waiter->list.next;
	--- OOPS ---

This could be dealt with using the semaphore lock, but then the down_xxx()
function has to needlessly get the spinlock again after being woken up.

The way to deal with this is to insert a general SMP memory barrier:

	LOAD waiter->list.next;
	LOAD waiter->task;
	smp_mb();
	STORE waiter->task;
	CALL wakeup
	RELEASE task

In this case, the barrier makes a guarantee that all memory accesses before the
barrier will appear to happen before all the memory accesses after the barrier
with respect to the other CPUs on the system.  It does _not_ guarantee that all
the memory accesses before the barrier will be complete by the time the barrier
instruction itself is complete.

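Rendered as a hypothetical C sketch of the up*() path, using the structures
above (sem is assumed to be the rw_semaphore in question), the fixed sequence
looks something like this:

	struct rwsem_waiter *waiter =
		list_first_entry(&sem->waiters, struct rwsem_waiter, list);
	struct list_head *next = waiter->list.next;	/* LOAD waiter->list.next */
	struct task_struct *tsk = waiter->task;		/* LOAD waiter->task */

	smp_mb();		/* order the loads above before the store below */
	waiter->task = NULL;	/* STORE waiter->task: waiter may now proceed */
	wake_up_process(tsk);	/* CALL wakeup */
	put_task_struct(tsk);	/* RELEASE task */
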
On a UP system - where this wouldn't be a problem - the smp_mb() is just a
compiler barrier, thus making sure the compiler emits the instructions in the
right order without actually intervening in the CPU.  Since there's only one
CPU, that CPU's dependency ordering logic will take care of everything else.


ATOMIC OPERATIONS
-----------------

While they are technically interprocessor interaction considerations, atomic
operations are noted specially as some of them imply full memory barriers and
some of them don't; as a group, they are very heavily relied on throughout
the kernel.

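For example (an illustration only; v is an atomic_t, and the authoritative
rules live in the document referenced below):

	atomic_inc(&v);			/* no return value: implies no barrier */
	n = atomic_inc_return(&v);	/* returns a value: implies a full
					 * memory barrier */
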
See Documentation/atomic_t.txt for more information.


ACCESSING DEVICES
-----------------

Many devices can be memory mapped, and so appear to the CPU as if they're just
a set of memory locations.  To control such a device, the driver usually has to
make the right memory accesses in exactly the right order.

However, having a clever CPU or a clever compiler creates a potential problem
in that the carefully sequenced accesses in the driver code won't reach the
device in the requisite order if the CPU or the compiler thinks it is more
efficient to reorder, combine or merge accesses - something that would cause
the device to malfunction.

Inside the Linux kernel, I/O should be done through the appropriate accessor
routines - such as inb() or writel() - which know how to make such accesses
appropriately sequential.  While this, for the most part, renders the explicit
use of memory barriers unnecessary, if the accessor functions are used to refer
to an I/O memory window with relaxed memory access properties, then _mandatory_
memory barriers are required to enforce ordering.

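As a sketch of the relaxed-window case (the mapping call and the DATA and ADDR
register offsets here are illustrative, not from the original text):

	void __iomem *base = ioremap_wc(phys_addr, SZ_4K); /* relaxed mapping */

	writel_relaxed(y, base + DATA);
	mb();			/* mandatory barrier needed to order the two
				 * writes through this relaxed window */
	writel_relaxed(3, base + ADDR);
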
See Documentation/driver-api/device-io.rst for more information.


INTERRUPTS
----------

A driver may be interrupted by its own interrupt service routine, and thus the
two parts of the driver may interfere with each other's attempts to control or
access the device.

This may be alleviated - at least in part - by disabling local interrupts (a
form of locking), such that the critical operations are all contained within
the interrupt-disabled section in the driver.  While the driver's interrupt
routine is executing, the driver's core may not run on the same CPU, and its
interrupt is not permitted to happen again until the current interrupt has been
handled; thus the interrupt handler does not need to lock against that.

However, consider a driver that is talking to an ethernet card that sports an
address register and a data register.  If that driver's core talks to the card
under interrupt-disablement and then the driver's interrupt handler is invoked:

	LOCAL IRQ DISABLE
	writew(ADDR, 3);
	writew(DATA, y);
	LOCAL IRQ ENABLE
	<interrupt>
	writew(ADDR, 4);
	q = readw(DATA);
	</interrupt>

The store to the data register might happen after the second store to the
address register if ordering rules are sufficiently relaxed:

	STORE *ADDR = 3, STORE *ADDR = 4, STORE *DATA = y, q = LOAD *DATA


If ordering rules are relaxed, it must be assumed that accesses done inside an
interrupt-disabled section may leak outside of it and may interleave with
accesses performed in an interrupt - and vice versa - unless implicit or
explicit barriers are used.

Normally this won't be a problem because the I/O accesses done inside such
sections will include synchronous load operations on strictly ordered I/O
registers that form implicit I/O barriers.

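One conventional way to obtain such an implicit barrier is to read back from a
strictly ordered register before re-enabling interrupts, in the style of the
pseudo-code above (illustrative only):

	LOCAL IRQ DISABLE
	writew(ADDR, 3);
	writew(DATA, y);
	readw(ADDR);		/* synchronous load: the writes above must
				   complete first */
	LOCAL IRQ ENABLE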

A similar situation may occur between an interrupt routine and two routines
running on separate CPUs that communicate with each other.  If such a case is
likely, then interrupt-disabling locks should be used to guarantee ordering.


==========================
KERNEL I/O BARRIER EFFECTS
==========================

Interfacing with peripherals via I/O accesses is deeply architecture and device
specific. Therefore, drivers which are inherently non-portable may rely on
specific behaviours of their target systems in order to achieve synchronization
in the most lightweight manner possible. For drivers intending to be portable
between multiple architectures and bus implementations, the kernel offers a
series of accessor functions that provide various degrees of ordering
guarantees:

 (*) readX(), writeX():

	The readX() and writeX() MMIO accessors take a pointer to the
	peripheral being accessed as an __iomem * parameter. For pointers
	mapped with the default I/O attributes (e.g. those returned by
	ioremap()), the ordering guarantees are as follows:

	1. All readX() and writeX() accesses to the same peripheral are ordered
	   with respect to each other. This ensures that MMIO register accesses
	   by the same CPU thread to a particular device will arrive in program
	   order.

	2. A writeX() issued by a CPU thread holding a spinlock is ordered
	   before a writeX() to the same peripheral from another CPU thread
	   issued after a later acquisition of the same spinlock. This ensures
	   that MMIO register writes to a particular device issued while holding
	   a spinlock will arrive in an order consistent with acquisitions of
	   the lock.

	3. A writeX() by a CPU thread to the peripheral will first wait for the
	   completion of all prior writes to memory either issued by, or
	   propagated to, the same thread. This ensures that writes by the CPU
	   to an outbound DMA buffer allocated by dma_alloc_coherent() will be
	   visible to a DMA engine when the CPU writes to its MMIO control
	   register to trigger the transfer (see the sketch after this list).

	4. A readX() by a CPU thread from the peripheral will complete before
	   any subsequent reads from memory by the same thread can begin. This
	   ensures that reads by the CPU from an incoming DMA buffer allocated
	   by dma_alloc_coherent() will not see stale data after reading from
	   the DMA engine's MMIO status register to establish that the DMA
	   transfer has completed.

	5. A readX() by a CPU thread from the peripheral will complete before
	   any subsequent delay() loop can begin execution on the same thread.
	   This ensures that two MMIO register writes by the CPU to a peripheral
	   will arrive at least 1us apart if the first write is immediately read
	   back with readX() and udelay(1) is called prior to the second
	   writeX():

		writel(42, DEVICE_REGISTER_0); // Arrives at the device...
		readl(DEVICE_REGISTER_0);
		udelay(1);
		writel(42, DEVICE_REGISTER_1); // ...at least 1us before this.

	The ordering properties of __iomem pointers obtained with non-default
	attributes (e.g. those returned by ioremap_wc()) are specific to the
	underlying architecture and therefore the guarantees listed above cannot
	generally be relied upon for accesses to these types of mappings.

 (*) readX_relaxed(), writeX_relaxed():

	These are similar to readX() and writeX(), but provide weaker memory
	ordering guarantees. Specifically, they do not guarantee ordering with
	respect to locking, normal memory accesses or delay() loops (i.e.
	bullets 2-5 above) but they are still guaranteed to be ordered with
	respect to other accesses from the same CPU thread to the same
	peripheral when operating on __iomem pointers mapped with the default
	I/O attributes.

 (*) readsX(), writesX():

	The readsX() and writesX() MMIO accessors are designed for accessing
	register-based, memory-mapped FIFOs residing on peripherals that are not
	capable of performing DMA. Consequently, they provide only the ordering
	guarantees of readX_relaxed() and writeX_relaxed(), as documented above.

 (*) inX(), outX():

	The inX() and outX() accessors are intended to access legacy port-mapped
	I/O peripherals, which may require special instructions on some
	architectures (notably x86). The port number of the peripheral being
	accessed is passed as an argument.

	Since many CPU architectures ultimately access these peripherals via an
	internal virtual memory mapping, the portable ordering guarantees
	provided by inX() and outX() are the same as those provided by readX()
	and writeX() respectively when accessing a mapping with the default I/O
	attributes.

	Device drivers may expect outX() to emit a non-posted write transaction
	that waits for a completion response from the I/O peripheral before
	returning. This is not guaranteed by all architectures and is therefore
	not part of the portable ordering semantics.

 (*) insX(), outsX():

	As above, the insX() and outsX() accessors provide the same ordering
	guarantees as readsX() and writesX() respectively when accessing a
	mapping with the default I/O attributes.

 (*) ioreadX(), iowriteX():

	These will perform appropriately for the type of access they're actually
	doing, be it inX()/outX() or readX()/writeX().

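As a sketch of guarantee 3 in action (illustrative names only; the descriptor
layout and register offset are hypothetical):

	/* plain stores to a coherent buffer from dma_alloc_coherent() */
	desc->addr = buf_dma;
	desc->len  = len;

	/* guarantee 3 ensures the DMA engine sees the stores above
	 * once this MMIO doorbell write arrives */
	writel(DMA_GO, base + CTRL);
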
With the exception of the string accessors (insX(), outsX(), readsX() and
writesX()), all of the above assume that the underlying peripheral is
little-endian and will therefore perform byte-swapping operations on big-endian
architectures.


========================================
ASSUMED MINIMUM EXECUTION ORDERING MODEL
========================================

It has to be assumed that the conceptual CPU is weakly-ordered but that it will
maintain the appearance of program causality with respect to itself.  Some CPUs
(such as i386 or x86_64) are more constrained than others (such as powerpc or
frv), and so the most relaxed case (namely DEC Alpha) must be assumed outside
of arch-specific code.

This means that it must be considered that the CPU will execute its instruction
stream in any order it feels like - or even in parallel - provided that if an
instruction in the stream depends on an earlier instruction, then that
earlier instruction must be sufficiently complete[*] before the later
instruction may proceed; in other words: provided that the appearance of
causality is maintained.

 [*] Some instructions have more than one effect - such as changing the
     condition codes, changing registers or changing memory - and different
     instructions may depend on different effects.

A CPU may also discard any instruction sequence that winds up having no
ultimate effect.  For example, if two adjacent instructions both load an
immediate value into the same register, the first may be discarded.

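In C terms (purely illustrative):

	r = 1;		/* may be discarded entirely ... */
	r = 2;		/* ... because this assignment supersedes it before
			 * the value is ever used */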

Similarly, it has to be assumed that the compiler might reorder the instruction
stream in any way it sees fit, again provided the appearance of causality is
maintained.

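Within the kernel, barrier() is the usual way to forbid such compiler
reordering at a particular point (a minimal illustration; a and b are
arbitrary shared variables):

	a = 1;
	barrier();	/* the compiler may not move memory accesses
			 * across this point */
	b = 2;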
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2663) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2664) ============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2665) THE EFFECTS OF THE CPU CACHE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2666) ============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2667) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2668) The way cached memory operations are perceived across the system is affected to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2669) a certain extent by the caches that lie between CPUs and memory, and by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2670) memory coherence system that maintains the consistency of state in the system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2671) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2672) As far as the way a CPU interacts with another part of the system through the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2673) caches goes, the memory system has to include the CPU's caches, and memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2674) barriers for the most part act at the interface between the CPU and its cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2675) (memory barriers logically act on the dotted line in the following diagram):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2676) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2677) 	    <--- CPU --->         :       <----------- Memory ----------->
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2678) 	                          :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2679) 	+--------+    +--------+  :   +--------+    +-----------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2680) 	|        |    |        |  :   |        |    |           |    +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2681) 	|  CPU   |    | Memory |  :   | CPU    |    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2682) 	|  Core  |--->| Access |----->| Cache  |<-->|           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2683) 	|        |    | Queue  |  :   |        |    |           |--->| Memory |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2684) 	|        |    |        |  :   |        |    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2685) 	+--------+    +--------+  :   +--------+    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2686) 	                          :                 | Cache     |    +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2687) 	                          :                 | Coherency |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2688) 	                          :                 | Mechanism |    +--------+
	+--------+    +--------+  :   +--------+    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2690) 	|        |    |        |  :   |        |    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2691) 	|  CPU   |    | Memory |  :   | CPU    |    |           |--->| Device |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2692) 	|  Core  |--->| Access |----->| Cache  |<-->|           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2693) 	|        |    | Queue  |  :   |        |    |           |    |        |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2694) 	|        |    |        |  :   |        |    |           |    +--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2695) 	+--------+    +--------+  :   +--------+    +-----------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2696) 	                          :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2697) 	                          :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2698) 
Although any particular load or store may not actually appear outside of the
CPU that issued it, since it may have been satisfied within the CPU's own
cache, it will still appear as if the full memory access had taken place as
far as the other CPUs are concerned: the cache coherency mechanisms will
migrate the cacheline over to the accessing CPU and propagate the effects
upon conflict.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2704) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2705) The CPU core may execute instructions in any order it deems fit, provided the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2706) expected program causality appears to be maintained.  Some of the instructions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2707) generate load and store operations which then go into the queue of memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2708) accesses to be performed.  The core may place these in the queue in any order
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2709) it wishes, and continue execution until it is forced to wait for an instruction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2710) to complete.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2711) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2712) What memory barriers are concerned with is controlling the order in which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2713) accesses cross from the CPU side of things to the memory side of things, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2714) the order in which the effects are perceived to happen by the other observers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2715) in the system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2716) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2717) [!] Memory barriers are _not_ needed within a given CPU, as CPUs always see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2718) their own loads and stores as if they had happened in program order.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2719) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2720) [!] MMIO or other device accesses may bypass the cache system.  This depends on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2721) the properties of the memory window through which devices are accessed and/or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2722) the use of any special device communication instructions the CPU may have.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2723) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2724) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2725) CACHE COHERENCY VS DMA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2726) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2727) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2728) Not all systems maintain cache coherency with respect to devices doing DMA.  In
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2729) such cases, a device attempting DMA may obtain stale data from RAM because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2730) dirty cache lines may be resident in the caches of various CPUs, and may not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2731) have been written back to RAM yet.  To deal with this, the appropriate part of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2732) the kernel must flush the overlapping bits of cache on each CPU (and maybe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2733) invalidate them as well).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2734) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2735) In addition, the data DMA'd to RAM by a device may be overwritten by dirty
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2736) cache lines being written back to RAM from a CPU's cache after the device has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2737) installed its own data, or cache lines present in the CPU's cache may simply
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2738) obscure the fact that RAM has been updated, until at such time as the cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2739) is discarded from the CPU's cache and reloaded.  To deal with this, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2740) appropriate part of the kernel must invalidate the overlapping bits of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2741) cache on each CPU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2742) 
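In practice a driver rarely issues these flushes and invalidations by hand;
the DMA mapping API performs them as part of mapping and unmapping the
buffer.  A minimal sketch, assuming dev, buf and len have been set up
elsewhere and that the CPU is handing data to the device:

	dma_addr_t handle;

	/* Writes back any dirty cachelines covering buf so that the
	 * device will read current data from RAM. */
	handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;

	/* ... point the device at handle and start the transfer ... */

	dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);

For the opposite direction, a DMA_FROM_DEVICE mapping invalidates the
overlapping cachelines so that the CPU cannot read stale data once the
device has written to RAM.
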
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2743) See Documentation/core-api/cachetlb.rst for more information on cache management.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2744) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2745) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2746) CACHE COHERENCY VS MMIO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2747) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2748) 
Memory mapped I/O usually takes place through memory locations that are part of
a window in the CPU's memory space that has different properties assigned to it
than the usual RAM-directed window.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2752) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2753) Amongst these properties is usually the fact that such accesses bypass the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2754) caching entirely and go directly to the device buses.  This means MMIO accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2755) may, in effect, overtake accesses to cached memory that were emitted earlier.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2756) A memory barrier isn't sufficient in such a case, but rather the cache must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2757) flushed between the cached memory write and the MMIO access if the two are in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2758) any way dependent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2759) 
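Where the memory concerned is ordinary coherent RAM, ordering rather than
flushing is the issue, and Linux's plain writeX() accessors already order
the MMIO write after prior writes to normal memory.  Only the _relaxed()
variants leave that ordering to the caller, as in this doorbell sketch
(the descriptor field, flag and register offset are hypothetical):

	/* make the descriptor visible to the device */
	desc->status = DESC_READY;

	/* order the cacheable store before the relaxed MMIO write */
	wmb();

	writel_relaxed(DOORBELL_RING, ioaddr + DOORBELL_REG);

With the non-relaxed writel() the explicit wmb() would be unnecessary.
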
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2760) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2761) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2762) THE THINGS CPUS GET UP TO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2763) =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2764) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2765) A programmer might take it for granted that the CPU will perform memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2766) operations in exactly the order specified, so that if the CPU is, for example,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2767) given the following piece of code to execute:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2768) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2769) 	a = READ_ONCE(*A);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2770) 	WRITE_ONCE(*B, b);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2771) 	c = READ_ONCE(*C);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2772) 	d = READ_ONCE(*D);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2773) 	WRITE_ONCE(*E, e);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2774) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2775) they would then expect that the CPU will complete the memory operation for each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2776) instruction before moving on to the next one, leading to a definite sequence of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2777) operations as seen by external observers in the system:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2778) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2779) 	LOAD *A, STORE *B, LOAD *C, LOAD *D, STORE *E.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2780) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2781) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2782) Reality is, of course, much messier.  With many CPUs and compilers, the above
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2783) assumption doesn't hold because:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2784) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2785)  (*) loads are more likely to need to be completed immediately to permit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2786)      execution progress, whereas stores can often be deferred without a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2787)      problem;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2788) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2789)  (*) loads may be done speculatively, and the result discarded should it prove
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2790)      to have been unnecessary;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2791) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2792)  (*) loads may be done speculatively, leading to the result having been fetched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2793)      at the wrong time in the expected sequence of events;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2794) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2795)  (*) the order of the memory accesses may be rearranged to promote better use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2796)      of the CPU buses and caches;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2797) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2798)  (*) loads and stores may be combined to improve performance when talking to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2799)      memory or I/O hardware that can do batched accesses of adjacent locations,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2800)      thus cutting down on transaction setup costs (memory and PCI devices may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2801)      both be able to do this); and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2802) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2803)  (*) the CPU's data cache may affect the ordering, and while cache-coherency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2804)      mechanisms may alleviate this - once the store has actually hit the cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2805)      - there's no guarantee that the coherency management will be propagated in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2806)      order to other CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2807) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2808) So what another CPU, say, might actually observe from the above piece of code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2809) is:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2810) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2811) 	LOAD *A, ..., LOAD {*C,*D}, STORE *E, STORE *B
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2812) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2813) 	(Where "LOAD {*C,*D}" is a combined load)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2814) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2815) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2816) However, it is guaranteed that a CPU will be self-consistent: it will see its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2817) _own_ accesses appear to be correctly ordered, without the need for a memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2818) barrier.  For instance with the following code:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2819) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2820) 	U = READ_ONCE(*A);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2821) 	WRITE_ONCE(*A, V);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2822) 	WRITE_ONCE(*A, W);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2823) 	X = READ_ONCE(*A);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2824) 	WRITE_ONCE(*A, Y);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2825) 	Z = READ_ONCE(*A);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2826) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2827) and assuming no intervention by an external influence, it can be assumed that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2828) the final result will appear to be:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2829) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2830) 	U == the original value of *A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2831) 	X == W
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2832) 	Z == Y
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2833) 	*A == Y
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2834) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2835) The code above may cause the CPU to generate the full sequence of memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2836) accesses:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2837) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2838) 	U=LOAD *A, STORE *A=V, STORE *A=W, X=LOAD *A, STORE *A=Y, Z=LOAD *A
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2839) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2840) in that order, but, without intervention, the sequence may have almost any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2841) combination of elements combined or discarded, provided the program's view
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2842) of the world remains consistent.  Note that READ_ONCE() and WRITE_ONCE()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2843) are -not- optional in the above example, as there are architectures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2844) where a given CPU might reorder successive loads to the same location.
On such architectures, READ_ONCE() and WRITE_ONCE() do whatever is
necessary to prevent this; on Itanium, for example, the volatile casts
used by READ_ONCE() and WRITE_ONCE() cause GCC to emit the special ld.acq
and st.rel instructions (respectively) that prevent such reordering.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2849) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2850) The compiler may also combine, discard or defer elements of the sequence before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2851) the CPU even sees them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2852) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2853) For instance:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2854) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2855) 	*A = V;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2856) 	*A = W;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2857) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2858) may be reduced to:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2859) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2860) 	*A = W;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2861) 
since, without either a write barrier or a WRITE_ONCE(), it can be
assumed that the effect of the storage of V to *A is lost.  Similarly:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2864) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2865) 	*A = Y;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2866) 	Z = *A;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2867) 
may, without a memory barrier or a READ_ONCE() and WRITE_ONCE(), be
reduced to:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2870) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2871) 	*A = Y;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2872) 	Z = Y;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2873) 
and the LOAD operation never appears outside of the CPU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2875) 
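Marking the accesses prevents these reductions; the second example, for
instance, keeps both its store and its load when written as:

	WRITE_ONCE(*A, Y);
	Z = READ_ONCE(*A);

since the compiler must emit every volatile access as written.
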
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2876) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2877) AND THEN THERE'S THE ALPHA
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2878) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2879) 
The DEC Alpha CPU is one of the most relaxed CPUs there is.  Not only that,
some versions of the Alpha CPU have a split data cache, permitting them to have
two semantically-related cache lines updated at separate times.  This is where
the data dependency barrier really becomes necessary, as it synchronises both
halves of the cache with the memory coherence system, making it seem as if
pointer updates and the new data they point to occur in the right order.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2886) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2887) The Alpha defines the Linux kernel's memory model, although as of v4.15
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2888) the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2889) reduced its impact on the memory model.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2890) 
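The classic case is pointer publication, condensed here from the
address-dependency examples earlier in this document:

	CPU 1			CPU 2
	===============		===============
	WRITE_ONCE(b, 2);
	smp_wmb();
	WRITE_ONCE(p, &b);
				q = READ_ONCE(p);
				d = READ_ONCE(*q);

On a split-cache Alpha, the cache bank holding *q can still hold the old
data when the bank holding p has already seen the new pointer; the barrier
that READ_ONCE() now implies on Alpha is what forces the banks to
synchronise, so that d cannot be observed as the pre-publication value.
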
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2891) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2892) VIRTUAL MACHINE GUESTS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2893) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2894) 
Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support.  This is an artifact of
interfacing with an SMP host while running a UP kernel.  Using mandatory
barriers for this use-case would be possible but is often suboptimal.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2899) 
To handle this case optimally, low-level virt_mb() etc. macros are available.
These have the same effect as smp_mb() etc. when SMP is enabled, but generate
identical code for SMP and non-SMP systems.  For example, virtual machine
guests should use virt_mb() rather than smp_mb() when synchronizing against a
(possibly SMP) host.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2905) 
These are equivalent to their smp_mb() etc. counterparts in all other
respects; in particular, they do not control MMIO effects: to control
MMIO effects, use mandatory barriers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2909) 
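For instance, a guest-side producer publishing entries to a (possibly SMP)
host might look like the following sketch (the ring layout and index
variables are illustrative only):

	ring[idx & (RING_SIZE - 1)] = entry;

	/* Make the entry visible to the host before publishing the index.
	 * Unlike smp_wmb(), this is not compiled away on !SMP kernels. */
	virt_wmb();

	WRITE_ONCE(*avail_idx, idx + 1);
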
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2910) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2911) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2912) EXAMPLE USES
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2913) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2914) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2915) CIRCULAR BUFFERS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2916) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2917) 
Memory barriers can be used to implement circular buffering without the need
for a lock to serialise the producer with the consumer.  See:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2920) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2921) 	Documentation/core-api/circular-buffers.rst
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2922) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2923) for details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2924) 
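The heart of that pattern, condensed into a sketch (the buffer structure and
the CIRC_* macros from linux/circ_buf.h are as described in that document;
error paths are elided):

	/* producer */
	unsigned long head = buffer->head;
	unsigned long tail = READ_ONCE(buffer->tail);

	if (CIRC_SPACE(head, tail, buffer->size) >= 1) {
		buffer->data[head] = item;
		/* publish the item before moving the head index */
		smp_store_release(&buffer->head,
				  (head + 1) & (buffer->size - 1));
	}

	/* consumer */
	unsigned long head = smp_load_acquire(&buffer->head);
	unsigned long tail = buffer->tail;

	if (CIRC_CNT(head, tail, buffer->size) >= 1) {
		item = buffer->data[tail];
		/* finish reading the item before freeing the slot */
		smp_store_release(&buffer->tail,
				  (tail + 1) & (buffer->size - 1));
	}
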
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2925) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2926) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2927) REFERENCES
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2928) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2929) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2930) Alpha AXP Architecture Reference Manual, Second Edition (Sites & Witek,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2931) Digital Press)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2932) 	Chapter 5.2: Physical Address Space Characteristics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2933) 	Chapter 5.4: Caches and Write Buffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2934) 	Chapter 5.5: Data Sharing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2935) 	Chapter 5.6: Read/Write Ordering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2936) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2937) AMD64 Architecture Programmer's Manual Volume 2: System Programming
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2938) 	Chapter 7.1: Memory-Access Ordering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2939) 	Chapter 7.4: Buffering and Combining Memory Writes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2940) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2941) ARM Architecture Reference Manual (ARMv8, for ARMv8-A architecture profile)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2942) 	Chapter B2: The AArch64 Application Level Memory Model
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2943) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2944) IA-32 Intel Architecture Software Developer's Manual, Volume 3:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2945) System Programming Guide
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2946) 	Chapter 7.1: Locked Atomic Operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2947) 	Chapter 7.2: Memory Ordering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2948) 	Chapter 7.4: Serializing Instructions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2949) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2950) The SPARC Architecture Manual, Version 9
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2951) 	Chapter 8: Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2952) 	Appendix D: Formal Specification of the Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2953) 	Appendix J: Programming with the Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2954) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2955) Storage in the PowerPC (Stone and Fitzgerald)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2956) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2957) UltraSPARC Programmer Reference Manual
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2958) 	Chapter 5: Memory Accesses and Cacheability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2959) 	Chapter 15: Sparc-V9 Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2960) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2961) UltraSPARC III Cu User's Manual
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2962) 	Chapter 9: Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2963) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2964) UltraSPARC IIIi Processor User's Manual
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2965) 	Chapter 8: Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2966) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2967) UltraSPARC Architecture 2005
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2968) 	Chapter 9: Memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2969) 	Appendix D: Formal Specifications of the Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2970) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2971) UltraSPARC T1 Supplement to the UltraSPARC Architecture 2005
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2972) 	Chapter 8: Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2973) 	Appendix F: Caches and Cache Coherency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2974) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2975) Solaris Internals, Core Kernel Architecture, p63-68:
	Chapter 3.3: Hardware Considerations for Locks and Synchronization
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2978) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2979) Unix Systems for Modern Architectures, Symmetric Multiprocessing and Caching
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2980) for Kernel Programmers:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2981) 	Chapter 13: Other Memory Models
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2982) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2983) Intel Itanium Architecture Software Developer's Manual: Volume 1:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2984) 	Section 2.6: Speculation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2985) 	Section 4.4: Memory Access