^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ===================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) In-kernel memory-mapped I/O tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ===================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) Home page and links to optional user space tools:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) https://nouveau.freedesktop.org/wiki/MmioTrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) MMIO tracing was originally developed by Intel around 2003 for their Fault
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) Injection Test Harness. In Dec 2006 - Jan 2007, using the code from Intel,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) Jeff Muizelaar created a tool for tracing MMIO accesses with the Nouveau
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) project in mind. Since then many people have contributed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) Mmiotrace was built for reverse engineering any memory-mapped IO device with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) the Nouveau project as the first real user. Only x86 and x86_64 architectures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) are supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) Out-of-tree mmiotrace was originally modified for mainline inclusion and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) ftrace framework by Pekka Paalanen <pq@iki.fi>.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) Preparation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) Mmiotrace feature is compiled in by the CONFIG_MMIOTRACE option. Tracing is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) disabled by default, so it is safe to have this set to yes. SMP systems are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) supported, but tracing is unreliable and may miss events if more than one CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) is on-line, therefore mmiotrace takes all but one CPU off-line during run-time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) activation. You can re-enable CPUs by hand, but you have been warned, there
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) is no way to automatically detect if you are losing events due to CPUs racing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) Usage Quick Reference
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) $ mount -t debugfs debugfs /sys/kernel/debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) Start X or whatever.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) $ echo nop > /sys/kernel/debug/tracing/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) Check for lost events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) Usage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) -----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) Make sure debugfs is mounted to /sys/kernel/debug.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) If not (requires root privileges)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) $ mount -t debugfs debugfs /sys/kernel/debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) Check that the driver you are about to trace is not loaded.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) Activate mmiotrace (requires root privileges)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) $ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) Start storing the trace::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) $ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) The 'cat' process should stay running (sleeping) in the background.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) accesses to areas that are ioremapped while mmiotrace is active.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) During tracing you can place comments (markers) into the trace by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) $ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) This makes it easier to see which part of the (huge) trace corresponds to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) which action. It is recommended to place descriptive markers about what you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) do.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) Shut down mmiotrace (requires root privileges)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) $ echo nop > /sys/kernel/debug/tracing/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) pressing ctrl+c.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) Check that mmiotrace did not lose events due to a buffer filling up. Either::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) $ grep -i lost mydump.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) which tells you exactly how many events were lost, or use::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) $ dmesg
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) to view your kernel log and look for "mmiotrace has lost events" warning. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) events were lost, the trace is incomplete. You should enlarge the buffers and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) try again. Buffers are enlarged by first seeing how large the current buffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) are::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) $ cat /sys/kernel/debug/tracing/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) gives you a number. Approximately double this number and write it back, for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) instance::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) $ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) Then start again from the top.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) If you are doing a trace for a driver project, e.g. Nouveau, you should also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) do the following before sending your results::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) $ lspci -vvv > lspci.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) $ dmesg > dmesg.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) $ tar zcf pciid-nick-mmiotrace.tar.gz mydump.txt lspci.txt dmesg.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) and then send the .tar.gz file. The trace compresses considerably. Replace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) "pciid" and "nick" with the PCI ID or model name of your piece of hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) under investigation and your nickname.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) How Mmiotrace Works
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) Access to hardware IO-memory is gained by mapping addresses from PCI bus by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) calling one of the ioremap_*() functions. Mmiotrace is hooked into the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) __ioremap() function and gets called whenever a mapping is created. Mapping is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) an event that is recorded into the trace log. Note that ISA range mappings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) are not caught, since the mapping always exists and is returned directly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) MMIO accesses are recorded via page faults. Just before __ioremap() returns,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) the mapped pages are marked as not present. Any access to the pages causes a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) fault. The page fault handler calls mmiotrace to handle the fault. Mmiotrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) marks the page present, sets TF flag to achieve single stepping and exits the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) fault handler. The instruction that faulted is executed and debug trap is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) entered. Here mmiotrace again marks the page as not present. The instruction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) is decoded to get the type of operation (read/write), data width and the value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) read or written. These are stored to the trace log.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) Setting the page present in the page fault handler has a race condition on SMP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) machines. During the single stepping other CPUs may run freely on that page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) and events can be missed without a notice. Re-enabling other CPUs during
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) tracing is discouraged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) Trace Log Format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) The raw log is text and easily filtered with e.g. grep and awk. One record is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) one line in the log. A record starts with a keyword, followed by keyword-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) dependent arguments. Arguments are separated by a space, or continue until the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) end of line. The format for version 20070824 is as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) Explanation Keyword Space-separated arguments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) ---------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) read event R width, timestamp, map id, physical, value, PC, PID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) write event W width, timestamp, map id, physical, value, PC, PID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) ioremap event MAP timestamp, map id, physical, virtual, length, PC, PID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) iounmap event UNMAP timestamp, map id, PC, PID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) marker MARK timestamp, text
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) version VERSION the string "20070824"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) info for reader LSPCI one line from lspci -v
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) PCI address map PCIDEV space-separated /proc/bus/pci/devices data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) unk. opcode UNKNOWN timestamp, map id, physical, data, PC, PID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) Timestamp is in seconds with decimals. Physical is a PCI bus address, virtual
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) is a kernel virtual address. Width is the data width in bytes and value is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) data value. Map id is an arbitrary id number identifying the mapping that was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) used in an operation. PC is the program counter and PID is process id. PC is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) zero if it is not recorded. PID is always zero as tracing MMIO accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) originating in user space memory is not yet supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) For instance, the following awk filter will pass all 32-bit writes that target
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) physical addresses in the range [0xfb73ce40, 0xfb800000]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) $ awk '/W 4 / { adr=strtonum($5); if (adr >= 0xfb73ce40 &&
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) adr < 0xfb800000) print; }'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) Tools for Developers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) The user space tools include utilities for:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) - replacing numeric addresses and values with hardware register names
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) - replaying MMIO logs, i.e., re-executing the recorded writes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184)