==============================================
Ordering I/O writes to memory-mapped addresses
==============================================

On some platforms, so-called memory-mapped I/O is weakly ordered. On such
platforms, driver writers are responsible for ensuring that I/O writes to
memory-mapped addresses on their device arrive in the order intended. This is
typically done by reading a 'safe' device or bridge register, causing the I/O
chipset to flush pending writes to the device before any reads are posted. A
driver would usually use this technique immediately prior to the exit of a
critical section of code protected by spinlocks. This would ensure that
subsequent writes to I/O space arrived only after all prior writes (much like a
memory barrier op, mb(), only with respect to I/O).

A more concrete example from a hypothetical device driver::

                ...
        CPU A:  spin_lock_irqsave(&dev_lock, flags)
        CPU A:  val = readl(my_status);
        CPU A:  ...
        CPU A:  writel(newval, ring_ptr);
        CPU A:  spin_unlock_irqrestore(&dev_lock, flags)
                ...
        CPU B:  spin_lock_irqsave(&dev_lock, flags)
        CPU B:  val = readl(my_status);
        CPU B:  ...
        CPU B:  writel(newval2, ring_ptr);
        CPU B:  spin_unlock_irqrestore(&dev_lock, flags)
                ...

In the case above, the device may receive newval2 before it receives newval,
which could cause problems. Fixing it is easy enough though::

                ...
        CPU A:  spin_lock_irqsave(&dev_lock, flags)
        CPU A:  val = readl(my_status);
        CPU A:  ...
        CPU A:  writel(newval, ring_ptr);
        CPU A:  (void)readl(safe_register); /* maybe a config register? */
        CPU A:  spin_unlock_irqrestore(&dev_lock, flags)
                ...
        CPU B:  spin_lock_irqsave(&dev_lock, flags)
        CPU B:  val = readl(my_status);
        CPU B:  ...
        CPU B:  writel(newval2, ring_ptr);
        CPU B:  (void)readl(safe_register); /* maybe a config register? */
        CPU B:  spin_unlock_irqrestore(&dev_lock, flags)

Here, the reads from safe_register will cause the I/O chipset to flush any
pending writes before actually posting the read to the chipset, preventing
possible data corruption.
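
For readers who prefer ordinary driver C to the per-CPU traces above, a
minimal sketch of the same idiom follows. The structure, register offsets and
function name are hypothetical and exist only to illustrate the pattern; they
are not taken from any real driver::

        #include <linux/io.h>
        #include <linux/spinlock.h>
        #include <linux/types.h>

        /* Hypothetical device layout, invented for this illustration. */
        struct my_dev {
                void __iomem *regs;     /* memory-mapped register window */
                spinlock_t lock;        /* protects register accesses */
        };

        #define DEV_STATUS_REG    0x00  /* status register */
        #define DEV_RING_PTR_REG  0x04  /* ring pointer register */
        #define DEV_SAFE_REG      0x08  /* harmless register to read back */

        static void my_dev_post_ring_ptr(struct my_dev *dev, u32 newval)
        {
                unsigned long flags;

                spin_lock_irqsave(&dev->lock, flags);
                (void)readl(dev->regs + DEV_STATUS_REG);      /* sample status */
                writel(newval, dev->regs + DEV_RING_PTR_REG); /* post the update */
                /*
                 * Flushing read: force the write above out to the device
                 * before the lock is released, so a later writel() from
                 * another CPU cannot overtake it.
                 */
                (void)readl(dev->regs + DEV_SAFE_REG);
                spin_unlock_irqrestore(&dev->lock, flags);
        }

As in the traces above, the important point is that the flushing read is
issued while the lock is still held; dropping the lock first would reopen the
window in which another CPU's write could be posted ahead of this one.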