^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. _array_rcu_doc:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) Using RCU to Protect Read-Mostly Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) =======================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) Although RCU is more commonly used to protect linked lists, it can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) also be used to protect arrays. Three situations are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) 1. :ref:`Hash Tables <hash_tables>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) 2. :ref:`Static Arrays <static_arrays>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) 3. :ref:`Resizable Arrays <resizable_arrays>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) Each of these three situations involves an RCU-protected pointer to an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) array that is separately indexed. It might be tempting to consider use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) of RCU to instead protect the index into an array, however, this use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) case is **not** supported. The problem with RCU-protected indexes into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) arrays is that compilers can play way too many optimization games with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) integers, which means that the rules governing handling of these indexes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) are far more trouble than they are worth. If RCU-protected indexes into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) arrays prove to be particularly valuable (which they have not thus far),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) explicit cooperation from the compiler will be required to permit them
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) to be safely used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) That aside, each of the three RCU-protected pointer situations are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) described in the following sections.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) .. _hash_tables:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) Situation 1: Hash Tables
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) Hash tables are often implemented as an array, where each array entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) has a linked-list hash chain. Each hash chain can be protected by RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) as described in the listRCU.txt document. This approach also applies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) to other array-of-list situations, such as radix trees.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) .. _static_arrays:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) Situation 2: Static Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) Static arrays, where the data (rather than a pointer to the data) is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) located in each array element, and where the array is never resized,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) have not been used with RCU. Rik van Riel recommends using seqlock in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) this situation, which would also have minimal read-side overhead as long
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) as updates are rare.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) Quick Quiz:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) Why is it so important that updates be rare when using seqlock?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) :ref:`Answer to Quick Quiz <answer_quick_quiz_seqlock>`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) .. _resizable_arrays:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) Situation 3: Resizable Arrays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) ------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) Use of RCU for resizable arrays is demonstrated by the grow_ary()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) function formerly used by the System V IPC code. The array is used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) to map from semaphore, message-queue, and shared-memory IDs to the data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) structure that represents the corresponding IPC construct. The grow_ary()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) function does not acquire any locks; instead its caller must hold the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) ids->sem semaphore.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) The grow_ary() function, shown below, does some limit checks, allocates a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) new ipc_id_ary, copies the old to the new portion of the new, initializes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) the remainder of the new, updates the ids->entries pointer to point to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) the new array, and invokes ipc_rcu_putref() to free up the old array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) Note that rcu_assign_pointer() is used to update the ids->entries pointer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) which includes any memory barriers required on whatever architecture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) you are running on::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) static int grow_ary(struct ipc_ids* ids, int newsize)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) struct ipc_id_ary* new;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) struct ipc_id_ary* old;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) int i;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) int size = ids->entries->size;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) if(newsize > IPCMNI)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) newsize = IPCMNI;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) if(newsize <= size)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) return newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize +
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) sizeof(struct ipc_id_ary));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) if(new == NULL)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) return size;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) new->size = newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) memcpy(new->p, ids->entries->p,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) sizeof(struct kern_ipc_perm *)*size +
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) sizeof(struct ipc_id_ary));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) for(i=size;i<newsize;i++) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) new->p[i] = NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) old = ids->entries;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) /*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) * Use rcu_assign_pointer() to make sure the memcpyed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) * contents of the new array are visible before the new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) * array becomes visible.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) rcu_assign_pointer(ids->entries, new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) ipc_rcu_putref(old);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) return newsize;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The ipc_rcu_putref() function decrements the array's reference count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) and then, if the reference count has dropped to zero, uses call_rcu()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) to free the array after a grace period has elapsed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) The array is traversed by the ipc_lock() function. This function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) indexes into the array under the protection of rcu_read_lock(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) using rcu_dereference() to pick up the pointer to the array so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) that it may later safely be dereferenced -- memory barriers are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) required on the Alpha CPU. Since the size of the array is stored
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) with the array itself, there can be no array-size mismatches, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) a simple check suffices. The pointer to the structure corresponding
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) to the desired IPC object is placed in "out", with NULL indicating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) a non-existent entry. After acquiring "out->lock", the "out->deleted"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) flag indicates whether the IPC object is in the process of being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) deleted, and, if not, the pointer is returned::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) struct kern_ipc_perm* out;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) int lid = id % SEQ_MULTIPLIER;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) struct ipc_id_ary* entries;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) rcu_read_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) entries = rcu_dereference(ids->entries);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) if(lid >= entries->size) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) out = entries->p[lid];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) if(out == NULL) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) spin_lock(&out->lock);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) /* ipc_rmid() may have already freed the ID while ipc_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) * was spinning: here verify that the structure is still valid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) if (out->deleted) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) spin_unlock(&out->lock);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) return out;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) .. _answer_quick_quiz_seqlock:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) Answer to Quick Quiz:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) Why is it so important that updates be rare when using seqlock?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) The reason that it is important that updates be rare when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) using seqlock is that frequent updates can livelock readers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) One way to avoid this problem is to assign a seqlock for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) each array entry rather than to the entire array.