^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) NMI Trace Events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) These events normally show up here:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) /sys/kernel/debug/tracing/events/nmi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) nmi_handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) You might want to use this tracepoint if you suspect that your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) NMI handlers are hogging large amounts of CPU time. The kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) will warn if it sees long-running handlers::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) INFO: NMI handler took too long to run: 9.207 msecs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) and this tracepoint will allow you to drill down and get some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) Let's say you suspect that perf_event_nmi_handler() is causing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) problems and you want to trace only that handler. Find its address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) (as root; unprivileged reads of kallsyms may show all zeros)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) $ grep perf_event_nmi_handler /proc/kallsyms
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) ffffffff81625600 t perf_event_nmi_handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) Let's also say you only care about invocations of that function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) that take a long time, for example more than a millisecond each.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) Note that the kernel's warning reports milliseconds, but the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) filter takes nanoseconds! You can filter on 'delta_ns'::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) cd /sys/kernel/debug/tracing/events/nmi/nmi_handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) echo 'handler==0xffffffff81625600 && delta_ns>1000000' > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) echo 1 > enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
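Because the warning is in milliseconds and the filter is in nanoseconds, it
can help to let the shell do the conversion. The following is only a sketch:
nmi_filter is a hypothetical helper name, not something the kernel provides,
and it assumes a whole-millisecond threshold and an address copied from
/proc/kallsyms without the 0x prefix.

```shell
# Hypothetical helper: build an nmi_handler filter expression.
#   $1  handler address in hex, as printed by /proc/kallsyms (no 0x prefix)
#   $2  threshold in whole milliseconds
nmi_filter() {
    addr="$1"
    msecs="$2"
    # 1 millisecond = 1000000 nanoseconds
    echo "handler==0x${addr} && delta_ns>$((msecs * 1000000))"
}

# Print the expression for the example address and a 1 ms threshold.
nmi_filter ffffffff81625600 1
```

The printed expression can then be written to the filter file as shown above.
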
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) Your output would then look like::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) $ cat /sys/kernel/debug/tracing/trace_pipe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) <idle>-0 [000] d.h3 505.397558: nmi_handler: perf_event_nmi_handler() delta_ns: 3236765 handled: 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) <idle>-0 [000] d.h3 505.805893: nmi_handler: perf_event_nmi_handler() delta_ns: 3174234 handled: 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) <idle>-0 [000] d.h3 506.158206: nmi_handler: perf_event_nmi_handler() delta_ns: 3084642 handled: 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) <idle>-0 [000] d.h3 506.334346: nmi_handler: perf_event_nmi_handler() delta_ns: 3080351 handled: 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)