Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. _NMI_rcu_doc:

Using RCU to Protect Dynamic NMI Handlers
=========================================

Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers.  This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
work in "arch/x86/oprofile/nmi_timer_int.c" and in
"arch/x86/kernel/traps.c".

The relevant pieces of code are listed below, each followed by a
brief explanation::

	static int dummy_nmi_callback(struct pt_regs *regs, int cpu)
	{
		return 0;
	}

The dummy_nmi_callback() function is a "dummy" NMI handler that does
nothing but return zero, thus saying that it did nothing and allowing
the NMI handler to take the default machine-specific action::

	static nmi_callback_t nmi_callback = dummy_nmi_callback;

This nmi_callback variable is a global function pointer to the current
NMI handler::

	void do_nmi(struct pt_regs *regs, long error_code)
	{
		int cpu;

		nmi_enter();

		cpu = smp_processor_id();
		++nmi_count(cpu);

		if (!rcu_dereference_sched(nmi_callback)(regs, cpu))
			default_do_nmi(regs);

		nmi_exit();
	}

The do_nmi() function processes each NMI.  It first disables preemption
in the same way that a hardware irq would, then increments the per-CPU
count of NMIs.  It then invokes the NMI handler stored in the nmi_callback
function pointer.  If this handler returns zero, do_nmi() invokes the
default_do_nmi() function to handle a machine-specific NMI.  Finally,
preemption is restored.

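The dispatch pattern above can be sketched in portable userland C.
This is not kernel code: the names (deliver_nmi(), run_dispatch_demo(),
and the callbacks) are illustrative, and C11 acquire/release atomics
stand in for rcu_dereference_sched() and the eventual
rcu_assign_pointer() update::

	/* Userland sketch of the nmi_callback dispatch pattern. */
	#include <assert.h>
	#include <stdatomic.h>

	typedef int (*nmi_cb_t)(int cpu);

	static int default_actions;	/* counts fallback invocations */
	static int handled;		/* counts NMIs claimed by the new handler */

	static int dummy_cb(int cpu)
	{
		(void)cpu;
		return 0;		/* "did nothing": take the default action */
	}

	static int counting_cb(int cpu)
	{
		(void)cpu;
		handled++;
		return 1;		/* claimed the NMI */
	}

	static _Atomic(nmi_cb_t) cb = dummy_cb;

	/* Analogous to do_nmi(): dispatch through the current callback
	 * and fall back to a default action when it returns zero. */
	static void deliver_nmi(int cpu)
	{
		nmi_cb_t f = atomic_load_explicit(&cb, memory_order_acquire);

		if (!f(cpu))
			default_actions++;
	}

	static int run_dispatch_demo(void)
	{
		deliver_nmi(0);		/* dummy handler: default action fires */
		atomic_store_explicit(&cb, counting_cb, memory_order_release);
		deliver_nmi(0);		/* new handler claims the NMI */
		assert(default_actions == 1);
		assert(handled == 1);
		return 0;
	}

The zero-return convention is what lets handlers chain: a handler that
does not recognize the NMI steps aside so the machine-specific default
can run.
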
In theory, rcu_dereference_sched() is not needed, since this code runs
only on i386, which in theory does not need rcu_dereference_sched()
anyway.  However, in practice it is a good documentation aid, particularly
for anyone attempting to do something similar on Alpha or on systems
with aggressive optimizing compilers.

Quick Quiz:
	Why might the rcu_dereference_sched() be necessary on
	Alpha, given that the code referenced by the pointer is
	read-only?

:ref:`Answer to Quick Quiz <answer_quick_quiz_NMI>`

Back to the discussion of NMI and RCU::

	void set_nmi_callback(nmi_callback_t callback)
	{
		rcu_assign_pointer(nmi_callback, callback);
	}

The set_nmi_callback() function registers an NMI handler.  Note that any
data that is to be used by the callback must be initialized -before-
the call to set_nmi_callback().  On architectures that do not order
writes, the rcu_assign_pointer() ensures that the NMI handler sees the
initialized values::

	void unset_nmi_callback(void)
	{
		rcu_assign_pointer(nmi_callback, dummy_nmi_callback);
	}

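The initialize-then-publish discipline can be sketched in userland C,
again with illustrative names and with a C11 release store playing the
role of rcu_assign_pointer(): every write to the handler's data is
ordered before the pointer becomes visible::

	/* Sketch of the publish side of handler registration. */
	#include <assert.h>
	#include <stdatomic.h>
	#include <stddef.h>

	struct nmi_data {
		int threshold;
		int count;
	};

	static _Atomic(struct nmi_data *) cur_data;

	static void install(struct nmi_data *d, int threshold)
	{
		/* Initialize everything the handler will read... */
		d->threshold = threshold;
		d->count = 0;
		/* ...then publish.  The release store orders the writes
		 * above before the pointer becomes visible, much as
		 * rcu_assign_pointer() does. */
		atomic_store_explicit(&cur_data, d, memory_order_release);
	}

	static void uninstall(void)
	{
		atomic_store_explicit(&cur_data, NULL, memory_order_release);
	}

	static int run_publish_demo(void)
	{
		static struct nmi_data d;
		struct nmi_data *p;

		install(&d, 42);
		p = atomic_load_explicit(&cur_data, memory_order_acquire);
		assert(p != NULL && p->threshold == 42);
		uninstall();
		p = atomic_load_explicit(&cur_data, memory_order_acquire);
		assert(p == NULL);
		return 0;
	}

Publishing before initializing would invert this ordering, which is
exactly the bug the Quick Quiz below asks about.
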
This function unregisters an NMI handler, restoring the original
dummy_nmi_callback().  However, there may well be an NMI handler
currently executing on some other CPU.  We therefore cannot free
up any data structures used by the old NMI handler until its
execution completes on all other CPUs.

One way to accomplish this is via synchronize_rcu(), perhaps as
follows::

	unset_nmi_callback();
	synchronize_rcu();
	kfree(my_nmi_data);

This works because (as of v4.20) synchronize_rcu() blocks until all
CPUs complete any preemption-disabled segments of code that they were
executing.  Since NMI handlers disable preemption, synchronize_rcu()
is guaranteed not to return until all ongoing NMI handlers exit.  It
is therefore safe to free up the handler's data as soon as
synchronize_rcu() returns.

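A toy single-threaded model of this teardown sequence, with
illustrative names: a reader count stands in for "CPUs running
preemption-disabled code", and wait_for_readers() stands in for
synchronize_rcu().  This only models the ordering invariant; real RCU
blocks the updater rather than asserting, and its read side costs
nothing::

	/* Toy model of unset_nmi_callback() / synchronize_rcu() / kfree(). */
	#include <assert.h>
	#include <stdatomic.h>
	#include <stdlib.h>

	static atomic_int active_readers;

	static void reader_enter(void) { atomic_fetch_add(&active_readers, 1); }
	static void reader_exit(void)  { atomic_fetch_sub(&active_readers, 1); }

	/* In this single-threaded model, just check the invariant that
	 * must hold before freeing; a real grace period would wait. */
	static void wait_for_readers(void)
	{
		assert(atomic_load(&active_readers) == 0);
	}

	static int run_teardown_demo(void)
	{
		int *my_nmi_data = malloc(sizeof(*my_nmi_data));

		if (my_nmi_data == NULL)
			return 1;
		*my_nmi_data = 1;
		reader_enter();		/* an "NMI handler" is running */
		/* ... unset_nmi_callback() would happen here ... */
		reader_exit();		/* the handler finishes */
		wait_for_readers();	/* analogous to synchronize_rcu() */
		free(my_nmi_data);	/* now safe, as with kfree() above */
		return 0;
	}

Freeing before wait_for_readers() returns would hand a running
"handler" dangling data, which is precisely what the grace period
prevents.
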
Important note: for this to work, the architecture in question must
invoke nmi_enter() and nmi_exit() on NMI entry and exit, respectively.

.. _answer_quick_quiz_NMI:

Answer to Quick Quiz:
	Why might the rcu_dereference_sched() be necessary on
	Alpha, given that the code referenced by the pointer is
	read-only?

	The caller to set_nmi_callback() might well have
	initialized some data that is to be used by the new NMI
	handler.  In this case, the rcu_dereference_sched() would
	be needed, because otherwise a CPU that received an NMI
	just after the new handler was set might see the pointer
	to the new NMI handler, but the old pre-initialized
	version of the handler's data.

	This same sad story can happen on other CPUs when using
	a compiler with aggressive pointer-value speculation
	optimizations.

	More important, the rcu_dereference_sched() makes it
	clear to someone reading the code that the pointer is
	being protected by RCU-sched.