Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

3 Commits   0 Branches   0 Tags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. _array_rcu_doc:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) Using RCU to Protect Read-Mostly Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) =======================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) Although RCU is more commonly used to protect linked lists, it can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) also be used to protect arrays.  Three situations are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 1.  :ref:`Hash Tables <hash_tables>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) 2.  :ref:`Static Arrays <static_arrays>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 3.  :ref:`Resizable Arrays <resizable_arrays>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) Each of these three situations involves an RCU-protected pointer to an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) array that is separately indexed.  It might be tempting to consider use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) of RCU to instead protect the index into an array, however, this use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) case is **not** supported.  The problem with RCU-protected indexes into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) arrays is that compilers can play way too many optimization games with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) integers, which means that the rules governing handling of these indexes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) are far more trouble than they are worth.  If RCU-protected indexes into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) arrays prove to be particularly valuable (which they have not thus far),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) explicit cooperation from the compiler will be required to permit them
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) to be safely used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) That aside, each of the three RCU-protected pointer situations are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) described in the following sections.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) .. _hash_tables:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) Situation 1: Hash Tables
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) Hash tables are often implemented as an array, where each array entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) has a linked-list hash chain.  Each hash chain can be protected by RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) as described in the listRCU.txt document.  This approach also applies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) to other array-of-list situations, such as radix trees.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) .. _static_arrays:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) Situation 2: Static Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) Static arrays, where the data (rather than a pointer to the data) is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) located in each array element, and where the array is never resized,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) have not been used with RCU.  Rik van Riel recommends using seqlock in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) this situation, which would also have minimal read-side overhead as long
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) as updates are rare.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) Quick Quiz:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 		Why is it so important that updates be rare when using seqlock?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) :ref:`Answer to Quick Quiz <answer_quick_quiz_seqlock>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) .. _resizable_arrays:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) Situation 3: Resizable Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) ------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) Use of RCU for resizable arrays is demonstrated by the grow_ary()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) function formerly used by the System V IPC code.  The array is used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) to map from semaphore, message-queue, and shared-memory IDs to the data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) structure that represents the corresponding IPC construct.  The grow_ary()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) function does not acquire any locks; instead its caller must hold the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) ids->sem semaphore.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) The grow_ary() function, shown below, does some limit checks, allocates a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) new ipc_id_ary, copies the old to the new portion of the new, initializes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) the remainder of the new, updates the ids->entries pointer to point to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) the new array, and invokes ipc_rcu_putref() to free up the old array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) Note that rcu_assign_pointer() is used to update the ids->entries pointer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) which includes any memory barriers required on whatever architecture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) you are running on::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) 	static int grow_ary(struct ipc_ids* ids, int newsize)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) 		struct ipc_id_ary* new;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 		struct ipc_id_ary* old;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) 		int i;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) 		int size = ids->entries->size;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 		if(newsize > IPCMNI)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) 			newsize = IPCMNI;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) 		if(newsize <= size)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) 			return newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 		new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize +
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) 				    sizeof(struct ipc_id_ary));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) 		if(new == NULL)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) 			return size;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 		new->size = newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) 		memcpy(new->p, ids->entries->p,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) 		       sizeof(struct kern_ipc_perm *)*size +
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) 		       sizeof(struct ipc_id_ary));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 		for(i=size;i<newsize;i++) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 			new->p[i] = NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) 		}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) 		old = ids->entries;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 		/*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) 		 * Use rcu_assign_pointer() to make sure the memcpyed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 		 * contents of the new array are visible before the new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 		 * array becomes visible.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) 		 */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) 		rcu_assign_pointer(ids->entries, new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 		ipc_rcu_putref(old);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) 		return newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The ipc_rcu_putref() function decrements the array's reference count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) and then, if the reference count has dropped to zero, uses call_rcu()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) to free the array after a grace period has elapsed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) The array is traversed by the ipc_lock() function.  This function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) indexes into the array under the protection of rcu_read_lock(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) using rcu_dereference() to pick up the pointer to the array so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) that it may later safely be dereferenced -- memory barriers are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) required on the Alpha CPU.  Since the size of the array is stored
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) with the array itself, there can be no array-size mismatches, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) a simple check suffices.  The pointer to the structure corresponding
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) to the desired IPC object is placed in "out", with NULL indicating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) a non-existent entry.  After acquiring "out->lock", the "out->deleted"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) flag indicates whether the IPC object is in the process of being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) deleted, and, if not, the pointer is returned::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) 	struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 		struct kern_ipc_perm* out;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 		int lid = id % SEQ_MULTIPLIER;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 		struct ipc_id_ary* entries;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) 		rcu_read_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) 		entries = rcu_dereference(ids->entries);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 		if(lid >= entries->size) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 			rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) 			return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) 		}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) 		out = entries->p[lid];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 		if(out == NULL) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) 			rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) 			return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) 		}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) 		spin_lock(&out->lock);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) 		/* ipc_rmid() may have already freed the ID while ipc_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) 		 * was spinning: here verify that the structure is still valid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) 		 */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 		if (out->deleted) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 			spin_unlock(&out->lock);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) 			rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 			return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 		}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) 		return out;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) .. _answer_quick_quiz_seqlock:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) Answer to Quick Quiz:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) 	Why is it so important that updates be rare when using seqlock?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) 	The reason that it is important that updates be rare when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) 	using seqlock is that frequent updates can livelock readers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) 	One way to avoid this problem is to assign a seqlock for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) 	each array entry rather than to the entire array.