^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) ftrace - Function Tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) Copyright 2008 Red Hat Inc.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) :Author: Steven Rostedt <srostedt@redhat.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) :License: The GNU Free Documentation License, Version 1.2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) (dual licensed under the GPL v2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) :Original Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) John Kacur, and David Teigland.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) - Written for: 2.6.28-rc2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) - Updated for: 3.10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) - Updated for: 4.13 - Copyright 2017 VMware Inc. Steven Rostedt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) - Converted to rst format - Changbin Du <changbin.du@intel.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) Ftrace is an internal tracer designed to help developers and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) system designers find out what is going on inside the kernel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) It can be used for debugging or analyzing latencies and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) performance issues that take place outside of user-space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) Although ftrace is typically considered the function tracer, it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) is really a framework of several assorted tracing utilities.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) There's latency tracing to examine what occurs between interrupts
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) being disabled and enabled, as well as for preemption, and from the time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) a task is woken to the time the task is actually scheduled in.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) One of the most common uses of ftrace is event tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) Throughout the kernel are hundreds of static event points that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) can be enabled via the tracefs file system to see what is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) going on in certain parts of the kernel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) See events.rst for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) Implementation Details
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) See :doc:`ftrace-design` for details for arch porters and such.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) The File System
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) Ftrace uses the tracefs file system to hold the control files as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) well as the files to display output.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) When tracefs is configured into the kernel (which selecting any ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) option will do) the directory /sys/kernel/tracing will be created. To mount
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) this directory, you can add to your /etc/fstab file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) tracefs /sys/kernel/tracing tracefs defaults 0 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) Or you can mount it at run time with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) mount -t tracefs nodev /sys/kernel/tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) For quicker access to that directory you may want to make a soft link to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) ln -s /sys/kernel/tracing /tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
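Before mounting, it can help to check whether tracefs is already mounted somewhere. The following is a minimal sketch, not part of ftrace itself: the helper name ``find_tracefs`` is hypothetical, and it assumes the usual mounts-table layout in which the second field is the mount point and the third is the file system type.

```shell
# Print the mount point of the first tracefs entry in a mounts table.
# $1 is the mounts file to scan; it defaults to /proc/mounts.
# Returns non-zero if no tracefs entry is found.
# (find_tracefs is a hypothetical helper name, not an ftrace interface.)
find_tracefs() {
    awk '$3 == "tracefs" { print $2; found = 1; exit }
         END { exit !found }' "${1:-/proc/mounts}"
}
```

For example, ``find_tracefs || mount -t tracefs nodev /sys/kernel/tracing`` would mount tracefs only when no existing mount is reported.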
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) .. attention::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) Before 4.1, all ftrace tracing control files were within the debugfs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) file system, which is typically located at /sys/kernel/debug/tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) For backward compatibility, when mounting the debugfs file system,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) the tracefs file system will be automatically mounted at:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) /sys/kernel/debug/tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) All files located in the tracefs file system will be located in that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) debugfs file system directory as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) In order to not automount tracefs in the debugfs filesystem, enable the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) defconfig option CONFIG_TRACEFS_DISABLE_AUTOMOUNT.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) .. attention::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) Any selected ftrace option will also create the tracefs file system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) The rest of the document will assume that you are in the ftrace directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) (cd /sys/kernel/tracing) and will only concentrate on the files within that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) directory and not distract from the content with the extended
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) "/sys/kernel/tracing" path name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) That's it! (assuming that you have ftrace configured into your kernel)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) After mounting tracefs you will have access to the control and output files
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) of ftrace. Here is a list of some of the key files:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) Note: all time values are in microseconds.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) current_tracer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) This is used to set or display the current tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) that is configured. Changing the current tracer clears
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) the ring buffer content as well as the "snapshot" buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) available_tracers:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) This holds the different types of tracers that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) have been compiled into the kernel. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) tracers listed here can be configured by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) echoing their name into current_tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)
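As an illustration of how available_tracers and current_tracer work together, the sketch below only writes a tracer name once it has been found in available_tracers. The function name ``set_tracer`` and its directory argument are assumptions made for this example.

```shell
# Select tracer $2 in tracing directory $1, but only if
# available_tracers lists it. (set_tracer is a hypothetical helper.)
set_tracer() {
    dir=$1
    name=$2
    # available_tracers holds a whitespace-separated list of names.
    for t in $(cat "$dir/available_tracers"); do
        if [ "$t" = "$name" ]; then
            echo "$name" > "$dir/current_tracer"
            return 0
        fi
    done
    echo "tracer '$name' is not available" >&2
    return 1
}
```

From within the tracing directory this could be called as ``set_tracer . function``. Remember that changing the tracer clears the ring buffer.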
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) tracing_on:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) This sets or displays whether writing to the trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) ring buffer is enabled. Echo 0 into this file to disable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) the tracer or 1 to enable it. Note, this only disables
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) writing to the ring buffer, the tracing overhead may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) still be occurring.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) The kernel function tracing_off() can be used within the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) kernel to disable writing to the ring buffer, which will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) set this file to "0". User space can re-enable tracing by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) echoing "1" into the file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) Note, the function and event trigger "traceoff" will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) set this file to zero and stop tracing. Tracing can then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) be re-enabled by user space using this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)
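One way to use tracing_on from user space is to enable ring-buffer writes only while a command of interest runs. This is a sketch under the assumption that the first argument is the tracing directory; ``traced_run`` is a hypothetical name.

```shell
# Run a command with ring-buffer writes enabled only for its duration.
# $1 is the tracing directory; the remaining arguments are the command.
# (traced_run is a hypothetical helper, not an ftrace interface.)
traced_run() {
    dir=$1
    shift
    echo 1 > "$dir/tracing_on"    # enable writing to the ring buffer
    "$@"
    status=$?
    echo 0 > "$dir/tracing_on"    # stop writing; the buffer content is kept
    return "$status"
}
```

For example, ``traced_run /sys/kernel/tracing ls`` records only while ``ls`` runs. As noted above, writing 0 only stops writes to the ring buffer, so the tracing overhead may still be present.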
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) trace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) This file holds the output of the trace in a human
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) readable format (described below). Opening this file for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) writing with the O_TRUNC flag clears the ring buffer content.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) Note, this file is not a consumer. If tracing is off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) (no tracer running, or tracing_on is zero), it will produce
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) the same output each time it is read. When tracing is on,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) it may produce inconsistent results as it tries to read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) the entire buffer without consuming it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
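Because a shell ``>`` redirection opens its target for writing with O_TRUNC, an empty redirection is enough to clear the buffer. A minimal sketch, with the hypothetical helper name ``clear_trace`` and the tracing directory passed as an argument:

```shell
# Clear the ring buffer by opening "trace" for writing with O_TRUNC.
# $1 is the tracing directory. (clear_trace is a hypothetical helper.)
clear_trace() {
    : > "$1/trace"
}
```

From within the tracing directory, ``clear_trace .`` is equivalent to typing ``: > trace``.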
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) trace_pipe:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) The output is the same as the "trace" file but this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) file is meant to be streamed with live tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) Reads from this file will block until new data is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) retrieved. Unlike the "trace" file, this file is a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) consumer. This means reading from this file causes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) sequential reads to display more current data. Once
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) data is read from this file, it is consumed, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) will not be read again with a sequential read. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) "trace" file is static, and if the tracer is not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) adding more data, it will display the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) information every time it is read.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) trace_options:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) This file lets the user control the amount of data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) that is displayed in one of the above output
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) files. Options also exist to modify how a tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) or events work (stack traces, timestamps, etc).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) options:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) This is a directory that has a file for every available
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) trace option (also in trace_options). Options may also be set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) or cleared by writing a "1" or "0" respectively into the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) corresponding file with the option name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)
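The per-option files can be driven from a script. The sketch below checks that the option file exists before writing to it; the name ``set_trace_option`` and its argument order are assumptions made for illustration.

```shell
# Write 1 or 0 into options/<name> under tracing directory $1.
# $2 is the option name, $3 is the value (0 or 1).
# (set_trace_option is a hypothetical helper.)
set_trace_option() {
    dir=$1
    name=$2
    value=$3
    file="$dir/options/$name"
    if [ ! -e "$file" ]; then
        echo "no such option: $name" >&2
        return 1
    fi
    case $value in
        0|1) echo "$value" > "$file" ;;
        *)   echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}
```

For example, ``set_trace_option /sys/kernel/tracing record-cmd 0`` would clear the "record-cmd" option.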
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) tracing_max_latency:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) Some of the tracers record the max latency.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) For example, the maximum time that interrupts are disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) The maximum time is saved in this file. The max trace will also be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) stored, and displayed by "trace". A new max trace will only be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) recorded if the latency is greater than the value in this file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) (in microseconds).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) By echoing a time into this file, no latency will be recorded
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) unless it is greater than the time in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) tracing_thresh:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) Some latency tracers will record a trace whenever the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) latency is greater than the number in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) Only active when the file contains a number greater than 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) (in microseconds)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) buffer_size_kb:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) This sets or displays the number of kilobytes each CPU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) buffer holds. By default, the trace buffers are the same size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) for each CPU. The displayed number is the size of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) CPU buffer and not total size of all buffers. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) trace buffers are allocated in pages (blocks of memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) that the kernel uses for allocation, usually 4 KB in size).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) A few extra pages may be allocated to accommodate buffer management
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) meta-data. If the last page allocated has room for more bytes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) than requested, the rest of the page will be used,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) making the actual allocation bigger than requested or shown.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) (Note, the size may not be a multiple of the page size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) due to buffer management meta-data.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) Buffer sizes for individual CPUs may vary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) (see "per_cpu/cpu0/buffer_size_kb" below), and if they do
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) this file will show "X".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) buffer_total_size_kb:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) This displays the total combined size of all the trace buffers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) free_buffer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) If a process is performing tracing, and the ring buffer should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) shrunk ("freed") when the process is finished, even if it were to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) killed by a signal, this file can be used for that purpose. On close
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) of this file, the ring buffer will be resized to its minimum size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) If a process that is tracing also opens this file, then when the process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) exits, its file descriptor for this file will be closed, and in doing so,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) the ring buffer will be "freed".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) It may also stop tracing if disable_on_free option is set.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) tracing_cpumask:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) This is a mask that lets the user only trace on specified CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) The format is a hex string representing the CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225)
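The hex string can be built from a list of CPU numbers. The sketch below assumes the usual cpumask convention that bit N of the mask stands for CPU N, and it relies on shell arithmetic, so it is only suitable for CPU numbers below 63. ``cpus_to_mask`` is a hypothetical helper name.

```shell
# Print a hex CPU mask with one bit set per CPU number given.
# Assumes bit N represents CPU N; limited to CPU numbers below 63
# because it uses shell integer arithmetic.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}
```

For example, ``cpus_to_mask 0 2`` prints ``5``, which could then be echoed into tracing_cpumask to trace only on CPUs 0 and 2.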
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) set_ftrace_filter:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) When dynamic ftrace is configured in (see the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) section below "dynamic ftrace"), the code is dynamically
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) modified (code text rewrite) to disable calling of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) function profiler (mcount). This lets tracing be configured
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) in with practically no overhead in performance. This also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) has a side effect of enabling or disabling specific functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) to be traced. Echoing names of functions into this file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) will limit the trace to only those functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) This influences the tracers "function" and "function_graph"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) and thus also function profiling (see "function_profile_enabled").
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) The functions listed in "available_filter_functions" are what
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) can be written into this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) This interface also allows for commands to be used. See the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) "Filter commands" section for more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) As a speed up, since processing strings can be quite expensive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) and requires a check of all functions registered to tracing, instead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) an index can be written into this file. A number (starting with "1")
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) written will instead select the function at the corresponding line position
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) of the "available_filter_functions" file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250)
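The index described above is simply the 1-based line number of the function in "available_filter_functions". A sketch of looking it up follows; ``filter_index`` is a hypothetical helper, and it matches on the first field of each line so that any trailing annotation after the function name is ignored.

```shell
# Print the 1-based line number of function $2 in the file $1
# (normally available_filter_functions). Returns non-zero if the
# function is not listed. (filter_index is a hypothetical helper.)
filter_index() {
    awk -v fn="$2" '$1 == fn { print NR; found = 1; exit }
                    END { exit !found }' "$1"
}
```

The printed number can then be written into set_ftrace_filter in place of the function name.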
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) set_ftrace_notrace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) This has an effect opposite to that of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) set_ftrace_filter. Any function that is added here will not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) be traced. If a function exists in both set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) and set_ftrace_notrace, the function will _not_ be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) set_ftrace_pid:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) Have the function tracer only trace the threads whose PIDs are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) listed in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) If the "function-fork" option is set, then when a task whose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) PID is listed in this file forks, the child's PID will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) automatically be added to this file, and the child will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) traced by the function tracer as well. This option will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) cause PIDs of tasks that exit to be removed from the file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) set_ftrace_notrace_pid:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) Have the function tracer ignore threads whose PIDs are listed in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) If the "function-fork" option is set, then when a task whose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) PID is listed in this file forks, the child's PID will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) automatically be added to this file, and the child will not be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) traced by the function tracer either. This option will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) cause PIDs of tasks that exit to be removed from the file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) If a PID is in both this file and "set_ftrace_pid", then this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) file takes precedence, and the thread will not be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) set_event_pid:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) Have the events only trace a task with a PID listed in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) Note, sched_switch and sched_wakeup will also trace events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) listed in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) To have the PIDs of children of tasks with their PID in this file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) added on fork, enable the "event-fork" option. That option will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) cause the PIDs of tasks to be removed from this file when the task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) exits.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) set_event_notrace_pid:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) Have the events not trace a task with a PID listed in this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) Note, sched_switch and sched_wakeup events will still be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) recorded for a thread whose PID is in this file if the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) same sched_switch or sched_wakeup event also involves a thread
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) that should be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) To have the PIDs of children of tasks with their PID in this file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) added on fork, enable the "event-fork" option. That option will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) cause the PIDs of tasks to be removed from this file when the task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) exits.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) set_graph_function:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) Functions listed in this file will cause the function graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) tracer to only trace these functions and the functions that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) they call. (See the section "dynamic ftrace" for more details).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) Note, set_ftrace_filter and set_ftrace_notrace still affect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) what functions are being traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) set_graph_notrace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) Similar to set_graph_function, but will disable function graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) tracing when the function is hit until it exits the function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) This makes it possible to ignore tracing functions that are called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) by a specific function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) available_filter_functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) This lists the functions that ftrace has processed and can trace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) These are the function names that you can pass to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) "set_ftrace_filter", "set_ftrace_notrace",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) "set_graph_function", or "set_graph_notrace".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) (See the section "dynamic ftrace" below for more details.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) dyn_ftrace_total_info:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) This file is for debugging purposes. It shows the number of functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) that have been converted to nops and are available to be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) enabled_functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) This file is more for debugging ftrace, but can also be useful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) in seeing if any function has a callback attached to it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) Not only does the trace infrastructure use the ftrace function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) trace utility, but other subsystems might too. This file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) displays all functions that have a callback attached to them
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) as well as the number of callbacks that have been attached.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) Note, a callback may also call multiple functions which will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) not be listed in this count.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) If a callback registered to trace a function has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) the "save regs" attribute (thus even more overhead), an 'R'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) will be displayed on the same line as the function that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) is returning registers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) If a callback registered to trace a function has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) the "ip modify" attribute (thus the regs->ip can be changed),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) an 'I' will be displayed on the same line as the function that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) can be overridden.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) If the architecture supports it, it will also show what callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) is being directly called by the function. If the count is greater
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) than 1 it most likely will be ftrace_ops_list_func().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) If the callback of the function jumps to a trampoline that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) specific to the callback and not the standard trampoline,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) its address will be printed as well as the function that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) trampoline calls.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) function_profile_enabled:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) When set, it will enable profiling of all functions with either the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) tracer or, if configured, the function graph tracer. It will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) keep a histogram of the number of times each function was called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) and if the function graph tracer was configured, it will also keep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) track of the time spent in those functions. The histogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) content can be displayed in the files:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) trace_stat/function<cpu> ( function0, function1, etc).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) trace_stat:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) A directory that holds different tracing stats.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) kprobe_events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) Enable dynamic trace points. See kprobetrace.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) kprobe_profile:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386) Dynamic trace points stats. See kprobetrace.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) max_graph_depth:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390) Used with the function graph tracer. This is the max depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) it will trace into a function. Setting this to a value of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) one will show only the first kernel function that is called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) from user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) printk_formats:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) This is for tools that read the raw format files. If an event in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) the ring buffer references a string, only a pointer to the string
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) is recorded into the buffer and not the string itself. This prevents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) tools from knowing what that string was. This file displays the string
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) and address for the string allowing tools to map the pointers to what
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) the strings were.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) saved_cmdlines:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) Only the pid of the task is recorded in a trace event unless
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) the event specifically saves the task comm as well. Ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) makes a cache of pid mappings to comms to try to display
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) comms for events. If a pid for a comm is not listed, then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) "<...>" is displayed in the output.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) If the option "record-cmd" is set to "0", then comms of tasks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) will not be saved during recording. By default, it is enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) saved_cmdlines_size:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) By default, 128 comms are saved (see "saved_cmdlines" above). To
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) increase or decrease the number of comms that are cached, echo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) the number of comms to cache into this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) saved_tgids:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) If the option "record-tgid" is set, on each scheduling context switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) the Thread Group ID of a task is saved in a table mapping the PID of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) the thread to its TGID. By default, the "record-tgid" option is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) snapshot:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) This displays the "snapshot" buffer and also lets the user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) take a snapshot of the current running trace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) See the "Snapshot" section below for more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) stack_max_size:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) When the stack tracer is activated, this will display the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) maximum stack size it has encountered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) See the "Stack Trace" section below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) stack_trace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) This displays the stack back trace of the largest stack
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) that was encountered when the stack tracer is activated.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) See the "Stack Trace" section below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) stack_trace_filter:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) This is similar to "set_ftrace_filter" but it limits what
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) functions the stack tracer will check.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) trace_clock:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) Whenever an event is recorded into the ring buffer, a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454) "timestamp" is added. This stamp comes from a specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) clock. By default, ftrace uses the "local" clock. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) clock is very fast and strictly per cpu, but on some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) systems it may not be monotonic with respect to other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) CPUs. In other words, the local clocks may not be in sync
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459) with local clocks on other CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) Usual clocks for tracing::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463) # cat trace_clock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) [local] global counter x86-tsc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466) The clock with the square brackets around it is the one in effect.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468) local:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) Default clock, but may not be in sync across CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471) global:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) This clock is in sync with all CPUs but may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) be a bit slower than the local clock.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475) counter:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) This is not a clock at all, but literally an atomic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477) counter. It counts up one by one, but is in sync
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) with all CPUs. This is useful when you need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) know exactly the order events occurred with respect to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) each other on different CPUs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) uptime:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483) This uses the jiffies counter and the time stamp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) is relative to the time since boot up.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) perf:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) This makes ftrace use the same clock that perf uses.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) Eventually perf will be able to read ftrace buffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) and this will help out in interleaving the data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491) x86-tsc:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) Architectures may define their own clocks. For
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493) example, x86 uses its own TSC cycle clock here.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) ppc-tb:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) This uses the powerpc timebase register value.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) This is in sync across CPUs and can also be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498) to correlate events across hypervisor/guest if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499) tb_offset is known.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) mono:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502) This uses the fast monotonic clock (CLOCK_MONOTONIC)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) which is monotonic and is subject to NTP rate adjustments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) mono_raw:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) This is the raw monotonic clock (CLOCK_MONOTONIC_RAW)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507) which is monotonic but is not subject to any rate adjustments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) and ticks at the same rate as the hardware clocksource.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) boot:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) This is the boot clock (CLOCK_BOOTTIME) and is based on the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512) fast monotonic clock, but also accounts for time spent in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) suspend. Since the clock access is designed for use in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) tracing in the suspend path, some side effects are possible
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) if the clock is accessed after the suspend time is accounted but before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516) the fast mono clock is updated. In this case, the clock update
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) appears to happen slightly sooner than it normally would have.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518) Also on 32-bit systems, it's possible that the 64-bit boot offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) sees a partial update. These effects are rare and post
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520) processing should be able to handle them. See comments in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) ktime_get_boot_fast_ns() function for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523) To set a clock, simply echo the clock name into this file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525) # echo global > trace_clock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) Setting a clock clears the ring buffer content as well as the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) "snapshot" buffer.
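
The bracket convention described above is easy to script. The following is a minimal sketch, not an official tool: it assumes tracefs is mounted at /sys/kernel/tracing unless another directory is passed, and the helper names current_trace_clock and set_trace_clock are hypothetical.

```shell
# Hypothetical helpers; the default tracefs location is an assumption.

# Print the clock currently in effect (the name inside the square brackets).
current_trace_clock() {
	dir=${1:-/sys/kernel/tracing}
	sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$dir/trace_clock"
}

# Select a clock, but only if the file lists it as available.
# Remember that switching clocks clears the ring buffer and the
# "snapshot" buffer.
set_trace_clock() {
	clock=$1
	dir=${2:-/sys/kernel/tracing}
	# Strip the brackets, split the names one per line, look for an exact match.
	if sed 's/[][]//g' "$dir/trace_clock" | tr ' ' '\n' |
	   grep -x -- "$clock" >/dev/null; then
		echo "$clock" > "$dir/trace_clock"
	else
		echo "clock '$clock' is not available" >&2
		return 1
	fi
}
```

For example, "set_trace_clock global" followed by "current_trace_clock" should print "global" on a kernel that offers that clock.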
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) trace_marker:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532) This is a very useful file for synchronizing user space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533) with events happening in the kernel. Strings written into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) this file are recorded in the ftrace buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) It is useful in applications to open this file at the start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537) of the application and just reference the file descriptor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538) for the file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) void trace_write(const char *fmt, ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) va_list ap;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543) char buf[256];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544) int n;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546) if (trace_fd < 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547) return;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549) va_start(ap, fmt);
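^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550) n = vsnprintf(buf, sizeof(buf), fmt, ap);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551) va_end(ap);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552) /* vsnprintf() returns the untruncated length; clamp it to what is in buf */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553) write(trace_fd, buf, n < (int)sizeof(buf) ? n : (int)sizeof(buf) - 1);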
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556) start::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) trace_fd = open("trace_marker", O_WRONLY);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) Note: Writing into the trace_marker file can also initiate triggers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561) that are written into /sys/kernel/tracing/events/ftrace/print/trigger.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562) See "Event triggers" in Documentation/trace/events.rst and an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563) example in Documentation/trace/histogram.rst (Section 3).
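
The trace_marker file can also be written from a shell script. The sketch below brackets a command with a begin and an end marker so that its kernel activity can be located in the trace. The helper name trace_mark_run and the TRACEFS variable are hypothetical, and the default mount point of /sys/kernel/tracing is an assumption.

```shell
# Hypothetical helper: run a command between two trace_marker writes.
# TRACEFS may be set to the tracefs mount point; the default is an assumption.
trace_mark_run() {
	marker=${TRACEFS:-/sys/kernel/tracing}/trace_marker
	# Each echo performs one write, which shows up as one marker in the trace.
	echo "begin: $*" > "$marker"
	"$@"
	status=$?
	echo "end: $* (exit $status)" > "$marker"
	return $status
}
```

A call such as "trace_mark_run sync" would then leave a "begin: sync" marker and an "end: sync (exit 0)" marker around whatever events the command caused.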
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 565) trace_marker_raw:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 566)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 567) This is similar to trace_marker above, but is meant for binary data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 568) to be written to it, where a tool can be used to parse the data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 569) from trace_pipe_raw.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 570)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 571) uprobe_events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 572)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 573) Add dynamic tracepoints in programs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 574) See uprobetracer.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 575)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 576) uprobe_profile:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 577)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 578) Uprobe statistics. See uprobetracer.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 579)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 580) instances:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 581)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 582) This is a way to make multiple trace buffers where different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 583) events can be recorded in different buffers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 584) See "Instances" section below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 585)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 586) events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 587)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 588) This is the trace event directory. It holds event tracepoints
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 589) (also known as static tracepoints) that have been compiled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 590) into the kernel. It shows what event tracepoints exist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 591) and how they are grouped by system. There are "enable"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 592) files at various levels that can enable the tracepoints
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 593) when a "1" is written to them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 594)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 595) See events.rst for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 596)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 597) set_event:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 598)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 599) Echoing an event name into this file enables that event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 600)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 601) See events.rst for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 602)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 603) available_events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 604)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 605) A list of events that can be enabled in tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 606)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 607) See events.rst for more information.
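
The per-event "enable" files in the events directory described above can be driven from a script. This is a minimal sketch under stated assumptions: the helper name enable_event is hypothetical, tracefs is assumed to be mounted at /sys/kernel/tracing unless a directory is passed, and the layout events/<system>/<event>/enable follows the grouping by system mentioned above.

```shell
# Hypothetical helper: enable one event through its "enable" file.
enable_event() {
	system=$1
	event=$2
	dir=${3:-/sys/kernel/tracing}
	enable="$dir/events/$system/$event/enable"
	if [ -e "$enable" ]; then
		# Writing "1" turns the tracepoint on.
		echo 1 > "$enable"
	else
		echo "no such event: $system/$event" >&2
		return 1
	fi
}
```

For instance, "enable_event sched sched_wakeup" would enable the sched_wakeup event if the running kernel provides it.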
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 608)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 609) timestamp_mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 610)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 611) Certain tracers may change the timestamp mode used when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 612) logging trace events into the event buffer. Events with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 613) different modes can coexist within a buffer but the mode in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 614) effect when an event is logged determines which timestamp mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 615) is used for that event. The default timestamp mode is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 616) 'delta'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 617)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 618) Usual timestamp modes for tracing::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 619)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 620) # cat timestamp_mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 621) [delta] absolute
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 622)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 623) The timestamp mode with the square brackets around it is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 624) one in effect.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 625)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 626) delta: Default timestamp mode - timestamp is a delta against
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 627) a per-buffer timestamp.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 628)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 629) absolute: The timestamp is a full timestamp, not a delta
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 630) against some other value. As such it takes up more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 631) space and is less efficient.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 632)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 633) hwlat_detector:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 634)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 635) Directory for the Hardware Latency Detector.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 636) See "Hardware Latency Detector" section below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 637)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 638) per_cpu:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 639)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 640) This is a directory that contains the trace per_cpu information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 641)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 642) per_cpu/cpu0/buffer_size_kb:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 643)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 644) The ftrace buffer is defined per_cpu. That is, there's a separate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 645) buffer for each CPU to allow writes to be done atomically,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 646) and free from cache bouncing. These buffers may have different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 647) sizes. This file is similar to the buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 648) file, but it only displays or sets the buffer size for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 649) specific CPU (here cpu0).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 650)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 651) per_cpu/cpu0/trace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 652)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 653) This is similar to the "trace" file, but it will only display
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 654) the data specific for the CPU. If written to, it only clears
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 655) the specific CPU buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 656)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 657) per_cpu/cpu0/trace_pipe:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 658)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 659) This is similar to the "trace_pipe" file, and is a consuming
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 660) read, but it will only display (and consume) the data specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 661) for the CPU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 662)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 663) per_cpu/cpu0/trace_pipe_raw:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 664)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 665) For tools that can parse the ftrace ring buffer binary format,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 666) the trace_pipe_raw file can be used to extract the data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 667) from the ring buffer directly. With the use of the splice()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 668) system call, the buffer data can be quickly transferred to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 669) a file or to the network where a server is collecting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 670) data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 671)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 672) Like trace_pipe, this is a consuming reader, where multiple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 673) reads will always produce different data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 674)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 675) per_cpu/cpu0/snapshot:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 676)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 677) This is similar to the main "snapshot" file, but will only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 678) snapshot the current CPU (if supported). It only displays
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 679) the content of the snapshot for a given CPU, and if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 680) written to, only clears this CPU buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 681)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 682) per_cpu/cpu0/snapshot_raw:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 683)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 684) Similar to the trace_pipe_raw, but will read the binary format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 685) from the snapshot buffer for the given CPU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 686)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 687) per_cpu/cpu0/stats:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 688)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 689) This displays certain stats about the ring buffer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 690)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 691) entries:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 692) The number of events that are still in the buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 693)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 694) overrun:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 695) The number of lost events due to overwriting when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 696) the buffer was full.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 697)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 698) commit overrun:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 699) Should always be zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 700) This gets set if so many events happened within a nested
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 701) event (ring buffer is re-entrant), that it fills the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 702) buffer and starts dropping events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 703)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 704) bytes:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 705) Bytes actually read (not overwritten).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 706)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 707) oldest event ts:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 708) The oldest timestamp in the buffer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 709)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 710) now ts:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 711) The current timestamp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 712)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 713) dropped events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 714) Events lost due to overwrite option being off.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 715)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 716) read events:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 717) The number of events read.
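
The per-CPU stats files can be summarized with a short script, for example to see whether any CPU has lost events to overwriting. This sketch assumes each line of a stats file has the form "name: value", matching the field names listed above. The helper name overrun_report is hypothetical, and so is the default mount point.

```shell
# Hypothetical helper: print the "overrun" count per CPU and the total.
overrun_report() {
	dir=${1:-/sys/kernel/tracing}
	total=0
	for stats in "$dir"/per_cpu/cpu*/stats; do
		[ -r "$stats" ] || continue
		cpu=$(basename "$(dirname "$stats")")
		# Match the field name exactly so "commit overrun" is not counted.
		lost=$(awk -F': *' '$1 == "overrun" { print $2 }' "$stats")
		lost=${lost:-0}
		echo "$cpu overrun=$lost"
		total=$((total + lost))
	done
	echo "total overrun=$total"
}
```

A non-zero total suggests the buffers are too small for the traced workload, or that the trace is not being read quickly enough.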
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 718)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 719) The Tracers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 720) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 721)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 722) Here is the list of current tracers that may be configured.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 723)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 724) "function"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 725)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 726) Function call tracer to trace all kernel functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 727)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 728) "function_graph"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 729)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 730) Similar to the function tracer except that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 731) function tracer probes the functions on their entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 732) whereas the function graph tracer traces on both entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 733) and exit of the functions. It then provides the ability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 734) to draw a graph of function calls similar to C code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 735) source.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 736)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 737) "blk"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 738)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 739) The block tracer. The tracer used by the blktrace user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 740) application.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 741)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 742) "hwlat"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 743)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 744) The Hardware Latency tracer is used to detect if the hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 745) produces any latency. See "Hardware Latency Detector" section
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 746) below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 747)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 748) "irqsoff"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 749)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 750) Traces the areas that disable interrupts and saves
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 751) the trace with the longest max latency.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 752) See tracing_max_latency. When a new max is recorded,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 753) it replaces the old trace. It is best to view this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 754) trace with the latency-format option enabled, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 755) happens automatically when the tracer is selected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 756)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 757) "preemptoff"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 758)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 759) Similar to irqsoff but traces and records the amount of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 760) time for which preemption is disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 761)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 762) "preemptirqsoff"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 763)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 764) Similar to irqsoff and preemptoff, but traces and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 765) records the largest time for which irqs and/or preemption
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 766) is disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 767)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 768) "wakeup"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 769)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 770) Traces and records the max latency that it takes for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 771) the highest priority task to get scheduled after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 772) it has been woken up.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 773) Traces all tasks as an average developer would expect.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 774)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 775) "wakeup_rt"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 776)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 777) Traces and records the max latency that it takes for just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 778) RT tasks (as the current "wakeup" does). This is useful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 779) for those interested in wake up timings of RT tasks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 780)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 781) "wakeup_dl"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 782)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 783) Traces and records the max latency that it takes for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 784) a SCHED_DEADLINE task to be woken (as the "wakeup" and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 785) "wakeup_rt" tracers do).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 786)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 787) "mmiotrace"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 788)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 789) A special tracer that is used to trace binary modules.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 790) It will trace all the calls that a module makes to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 791) hardware, including everything it writes to and reads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 792) from I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 793)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 794) "branch"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 795)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 796) This tracer can be configured when tracing likely/unlikely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 797) calls within the kernel. It records each time a likely or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 798) unlikely branch is hit and whether the prediction was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 799) correct.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 800)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 801) "nop"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 802)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 803) This is the "trace nothing" tracer. To remove all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 804) tracers from tracing simply echo "nop" into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 805) current_tracer.
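
Because the tracers listed above are only present when configured into the kernel, a script may want to check before selecting one. A minimal sketch follows. It assumes the available_tracers file in the same directory lists the configured tracers separated by spaces, and that tracefs is mounted at /sys/kernel/tracing unless a directory is passed. The helper name select_tracer is hypothetical.

```shell
# Hypothetical helper: select a tracer only if the kernel offers it.
select_tracer() {
	tracer=$1
	dir=${2:-/sys/kernel/tracing}
	# Split the space-separated list and look for an exact match, so that
	# "function" does not match "function_graph".
	if tr ' ' '\n' < "$dir/available_tracers" |
	   grep -x -- "$tracer" >/dev/null; then
		echo "$tracer" > "$dir/current_tracer"
	else
		echo "tracer '$tracer' is not configured in this kernel" >&2
		return 1
	fi
}
```

Calling "select_tracer nop" afterwards removes the tracer again, as described above.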
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 806)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 807) Error conditions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 808) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 809)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 810) For most ftrace commands, failure modes are obvious and communicated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 811) using standard return codes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 812)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 813) For other more involved commands, extended error information may be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 814) available via the tracing/error_log file. For the commands that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 815) support it, reading the tracing/error_log file after an error will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 816) display more detailed information about what went wrong, if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 817) information is available. The tracing/error_log file is a circular
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 818) error log that keeps a small number of ftrace errors, currently
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 819) those for the last 8 failed commands.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 820)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 821) The extended error information and usage takes the form shown in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 822) this example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 823)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 824) # echo xxx > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 825) echo: write error: Invalid argument
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 826)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 827) # cat /sys/kernel/debug/tracing/error_log
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 828) [ 5348.887237] location: error: Couldn't yyy: zzz
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 829) Command: xxx
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 830) ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 831) [ 7517.023364] location: error: Bad rrr: sss
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 832) Command: ppp qqq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 833) ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 834)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 835) To clear the error log, echo the empty string into it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 836)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 837) # echo > /sys/kernel/debug/tracing/error_log
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 838)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 839) Examples of using the tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 840) ----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 841)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 842) Here are typical examples of using the tracers when controlling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 843) them only with the tracefs interface (without using any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 844) user-land utilities).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 845)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 846) Output format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 847) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 848)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 849) Here is an example of the output format of the file "trace"::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 850)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 851) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 852) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 853) # entries-in-buffer/entries-written: 140080/250280 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 854) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 855) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 856) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 857) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 858) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 859) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 860) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 861) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 862) bash-1977 [000] .... 17284.993652: sys_close <-system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 863) bash-1977 [000] .... 17284.993653: __close_fd <-sys_close
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 864) bash-1977 [000] .... 17284.993653: _raw_spin_lock <-__close_fd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 865) sshd-1974 [003] .... 17284.993653: __srcu_read_unlock <-fsnotify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 866) bash-1977 [000] .... 17284.993654: add_preempt_count <-_raw_spin_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 867) bash-1977 [000] ...1 17284.993655: _raw_spin_unlock <-__close_fd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 868) bash-1977 [000] ...1 17284.993656: sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 869) bash-1977 [000] .... 17284.993657: filp_close <-__close_fd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 870) bash-1977 [000] .... 17284.993657: dnotify_flush <-filp_close
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 871) sshd-1974 [003] .... 17284.993658: sys_select <-system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 872) ....
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 873)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 874) A header is printed with the name of the tracer that produced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 875) the trace. In this case the tracer is "function". Then it shows the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 876) number of events in the buffer as well as the total number of entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 877) that were written. The difference is the number of entries that were
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 878) lost due to the buffer filling up (250280 - 140080 = 110200 events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 879) lost).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 880)
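The subtraction described above can be sketched as a small shell helper.
This is only an illustration: the function name ``lost_events`` is
hypothetical and not part of ftrace, and it assumes the header line has
the layout shown in the example output.

```shell
# lost_events: read trace output on stdin and print
# entries-written minus entries-in-buffer.
# Hypothetical helper name; not provided by ftrace itself.
lost_events() {
    awk '/entries-in-buffer\/entries-written:/ {
        # The field that follows the label is "<in-buffer>/<written>".
        for (i = 1; i < NF; i++) {
            if ($i == "entries-in-buffer/entries-written:") {
                split($(i + 1), n, "/")
                print n[2] - n[1]
                exit
            }
        }
    }'
}

# Feed it the header line from the example above.
printf '%s\n' '# entries-in-buffer/entries-written: 140080/250280   #P:4' | lost_events
```

On a live system the same function could be fed from the "trace" file
instead of a fixed string, for example ``lost_events < trace`` when run
from the tracing directory.
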
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 881) The header explains the content of the events: the task name "bash", the task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 882) PID "1977", the CPU that it was running on "000", the latency format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 883) (explained below), the timestamp in <secs>.<usecs> format, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 884) function name that was traced "sys_close" and the parent function that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 885) called this function "system_call_fastpath". The timestamp is the time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 886) at which the function was entered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 887)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 888) Latency trace format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 889) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 890)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 891) When the latency-format option is enabled or when one of the latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 892) tracers is set, the trace file gives somewhat more information to see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 893) why a latency happened. Here is a typical trace::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 894)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 895) # tracer: irqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 896) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 897) # irqsoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 898) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 899) # latency: 259 us, #4/4, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 900) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 901) # | task: ps-6143 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 902) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 903) # => started at: __lock_task_sighand
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 904) # => ended at: _raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 905) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 906) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 907) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 908) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 909) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 910) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 911) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 912) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 913) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 914) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 915) ps-6143 2d... 0us!: trace_hardirqs_off <-__lock_task_sighand
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 916) ps-6143 2d..1 259us+: trace_hardirqs_on <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 917) ps-6143 2d..1 263us+: time_hardirqs_on <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 918) ps-6143 2d..1 306us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 919) => trace_hardirqs_on_caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 920) => trace_hardirqs_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 921) => _raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 922) => do_task_stat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 923) => proc_tgid_stat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 924) => proc_single_show
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 925) => seq_read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 926) => vfs_read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 927) => sys_read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 928) => system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 929)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 930)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 931) This shows that the current tracer is "irqsoff" tracing the time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 932) for which interrupts were disabled. It gives the trace version (which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 933) never changes) and the version of the kernel on which this was executed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 934) (3.8). Then it displays the max latency in microseconds (259 us), followed by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 935) the number of trace entries displayed and the total number (both are four: #4/4).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 936) VP, KP, SP, and HP are always zero and are reserved for later use.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 937) #P is the number of online CPUs (#P:4).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 938)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 939) The task is the process that was running when the latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 940) occurred. (ps pid: 6143).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 941)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 942) The start and stop (the functions in which the interrupts were
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 943) disabled and enabled respectively) that caused the latencies:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 944)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 945) - __lock_task_sighand is where the interrupts were disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 946) - _raw_spin_unlock_irqrestore is where they were enabled again.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 947)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 948) The next lines after the header are the trace itself. The header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 949) explains which is which.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 950)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 951) cmd: The name of the process in the trace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 952)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 953) pid: The PID of that process.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 954)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 955) CPU#: The CPU which the process was running on.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 956)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 957) irqs-off: 'd' interrupts are disabled. '.' otherwise.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 958) .. caution:: If the architecture does not support a way to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 959) read the irq flags variable, an 'X' will always
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 960) be printed here.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 961)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 962) need-resched:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 963) - 'N' both TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED are set,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 964) - 'n' only TIF_NEED_RESCHED is set,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 965) - 'p' only PREEMPT_NEED_RESCHED is set,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 966) - '.' otherwise.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 967)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 968) hardirq/softirq:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 969) - 'Z' - NMI occurred inside a hardirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 970) - 'z' - NMI is running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 971) - 'H' - hard irq occurred inside a softirq.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 972) - 'h' - hard irq is running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 973) - 's' - soft irq is running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 974) - '.' - normal context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 975)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 976) preempt-depth: The level of preempt_disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 977)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 978) The above is mostly meaningful for kernel developers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 979)
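As a reading aid, the flag characters listed above can be decoded with a
short shell function. This is a minimal sketch, not something ftrace
provides: the name ``decode_flags`` is hypothetical, it expects exactly
the four characters irqs-off, need-resched, hardirq/softirq and
preempt-depth in that order, and it assumes that a '.' in the
preempt-depth position means a depth of zero (as the "...." entries in
the examples suggest).

```shell
# decode_flags FLAGS: describe a four-character field such as "d..1".
# Hypothetical helper; not part of ftrace.
decode_flags() {
    # Pick out the characters by position.
    irq=$(printf '%s' "$1" | cut -c1)
    resched=$(printf '%s' "$1" | cut -c2)
    ctx=$(printf '%s' "$1" | cut -c3)
    depth=$(printf '%s' "$1" | cut -c4)

    case $irq in
        d) echo "irqs-off: interrupts disabled" ;;
        X) echo "irqs-off: unknown (arch cannot read the irq flags)" ;;
        *) echo "irqs-off: interrupts enabled" ;;
    esac
    case $resched in
        N) echo "need-resched: TIF_NEED_RESCHED and PREEMPT_NEED_RESCHED set" ;;
        n) echo "need-resched: only TIF_NEED_RESCHED set" ;;
        p) echo "need-resched: only PREEMPT_NEED_RESCHED set" ;;
        *) echo "need-resched: not set" ;;
    esac
    case $ctx in
        Z) echo "context: NMI occurred inside a hardirq" ;;
        z) echo "context: NMI is running" ;;
        H) echo "context: hard irq occurred inside a softirq" ;;
        h) echo "context: hard irq is running" ;;
        s) echo "context: soft irq is running" ;;
        *) echo "context: normal" ;;
    esac
    # Assumption: '.' in the last position means a depth of zero.
    case $depth in
        .) echo "preempt-depth: 0" ;;
        *) echo "preempt-depth: $depth" ;;
    esac
}

# Flags taken from the second entry of the irqsoff example, without
# the leading CPU number.
decode_flags 'd..1'
```

In the latency format the CPU number is printed directly in front of
these characters (for example "2d..1"), so it has to be stripped before
the field is passed to the function.
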
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 980) time:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 981) When the latency-format option is enabled, the trace file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 982) output includes a timestamp relative to the start of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 983) trace. This differs from the output when latency-format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 984) is disabled, which includes an absolute timestamp.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 985)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 986) delay:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 987) This is just to help catch your eye a bit better. It still
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 988) needs to be fixed to be relative only to the same CPU.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 989) The marks are determined by the difference between this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 990) current trace and the next trace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 991)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 992) - '$' - greater than 1 second
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 993) - '@' - greater than 100 milliseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 994) - '*' - greater than 10 milliseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 995) - '#' - greater than 1000 microseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 996) - '!' - greater than 100 microseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 997) - '+' - greater than 10 microseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 998) - ' ' - less than or equal to 10 microseconds.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 999)
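The thresholds in the list above can be written out as a small shell
function. This is only a sketch of the table as documented here: the
name ``delay_mark`` is hypothetical, the argument is assumed to be the
difference to the next entry in whole microseconds, and the kernel's
own implementation may differ in detail.

```shell
# delay_mark DELTA_US: print the delay mark for a gap of DELTA_US
# microseconds between one trace entry and the next.
# Hypothetical helper; not part of ftrace.
delay_mark() {
    us=$1
    if   [ "$us" -gt 1000000 ]; then echo '$'   # more than 1 second
    elif [ "$us" -gt 100000 ];  then echo '@'   # more than 100 milliseconds
    elif [ "$us" -gt 10000 ];   then echo '*'   # more than 10 milliseconds
    elif [ "$us" -gt 1000 ];    then echo '#'   # more than 1000 microseconds
    elif [ "$us" -gt 100 ];     then echo '!'   # more than 100 microseconds
    elif [ "$us" -gt 10 ];      then echo '+'   # more than 10 microseconds
    else                             echo ' '   # 10 microseconds or less
    fi
}

# The first entry of the irqsoff example is at 0us and the next one at
# 259us, a gap of 259 microseconds.
delay_mark 259
```

For the gap of 259 microseconds this prints '!', which matches the mark
on the first entry of the example trace.
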
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1000) The rest is the same as the 'trace' file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1001)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1002) Note, the latency tracers will usually end with a back trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1003) to easily find where the latency occurred.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1004)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1005) trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1006) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1007)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1008) The trace_options file (or the options directory) is used to control
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1009) what gets printed in the trace output, or manipulate the tracers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1010) To see what is available, simply cat the file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1011)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1012) cat trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1013) print-parent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1014) nosym-offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1015) nosym-addr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1016) noverbose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1017) noraw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1018) nohex
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1019) nobin
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1020) noblock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1021) trace_printk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1022) annotate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1023) nouserstacktrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1024) nosym-userobj
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1025) noprintk-msg-only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1026) context-info
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1027) nolatency-format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1028) record-cmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1029) norecord-tgid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1030) overwrite
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1031) nodisable_on_free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1032) irq-info
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1033) markers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1034) noevent-fork
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1035) function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1036) nofunction-fork
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1037) nodisplay-graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1038) nostacktrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1039) nobranch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1040)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1041) To disable one of the options, echo in the option prepended with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1042) "no"::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1043)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1044) echo noprint-parent > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1045)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1046) To enable an option, leave off the "no"::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1047)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1048) echo sym-offset > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1049)
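Because a disabled option is listed with a "no" prefix, the state of a
single option can be read back from the listing. The following is a
minimal sketch under that assumption; the function name
``option_state`` is hypothetical and not part of ftrace, and it reads
the listing from standard input so that it can be tried without a
tracing directory.

```shell
# option_state NAME: read a trace_options listing on stdin and report
# whether NAME is enabled or disabled.
# Hypothetical helper; not part of ftrace.
option_state() {
    opt=$1
    while IFS= read -r line; do
        case $line in
            "$opt")   echo enabled;  return 0 ;;   # listed as-is
            "no$opt") echo disabled; return 0 ;;   # listed with "no"
        esac
    done
    echo unknown
    return 1
}

# Try it on the first lines of the listing shown above.
printf '%s\n' print-parent nosym-offset nosym-addr | option_state sym-offset
```

On a live system the listing would come from the file itself, for
example ``option_state print-parent < trace_options`` when run from the
tracing directory.
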
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1050) Here are the available options:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1051)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1052) print-parent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1053) On function traces, display the calling (parent)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1054) function as well as the function being traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1055) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1056)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1057) print-parent:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1058) bash-4000 [01] 1477.606694: simple_strtoul <-kstrtoul
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1059)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1060) noprint-parent:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1061) bash-4000 [01] 1477.606694: simple_strtoul
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1062)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1063)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1064) sym-offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1065) Display not only the function name, but also the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1066) offset in the function. For example, instead of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1067) seeing just "ktime_get", you will see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1068) "ktime_get+0xb/0x20".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1069) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1070)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1071) sym-offset:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1072) bash-4000 [01] 1477.606694: simple_strtoul+0x6/0xa0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1073)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1074) sym-addr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1075) This will also display the function address as well
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1076) as the function name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1077) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1078)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1079) sym-addr:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1080) bash-4000 [01] 1477.606694: simple_strtoul <c0339346>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1081)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1082) verbose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1083) This deals with the trace file when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1084) latency-format option is enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1085) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1086)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1087) bash 4000 1 0 00000000 00010a95 [58127d26] 1720.415ms \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1088) (+0.000ms): simple_strtoul (kstrtoul)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1089)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1090) raw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1091) This will display raw numbers. This option is best for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1092) use with user applications that can translate the raw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1093) numbers better than having it done in the kernel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1094)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1095) hex
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1096) Similar to raw, but the numbers will be in a hexadecimal format.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1097)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1098) bin
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1099) This will print out the formats in raw binary.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1101) block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1102) When set, reading trace_pipe will not block when polled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1104) trace_printk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1105) When disabled, trace_printk() does not write into the buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1107) annotate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1108) It is sometimes confusing when the CPU buffers are full
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1109) and one CPU buffer had a lot of events recently, thus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1110) a shorter time frame, whereas another CPU may have only had
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1111) a few events, which lets it have older events. When
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1112) the trace is reported, it shows the oldest events first,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1113) and it may look like only one CPU ran (the one with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1114) oldest events). When the annotate option is set, it will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1115) display when a new CPU buffer started::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1117) <idle>-0 [001] dNs4 21169.031481: wake_up_idle_cpu <-add_timer_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1118) <idle>-0 [001] dNs4 21169.031482: _raw_spin_unlock_irqrestore <-add_timer_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1119) <idle>-0 [001] .Ns4 21169.031484: sub_preempt_count <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1120) ##### CPU 2 buffer started ####
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1121) <idle>-0 [002] .N.1 21169.031484: rcu_idle_exit <-cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1122) <idle>-0 [001] .Ns3 21169.031484: _raw_spin_unlock <-clocksource_watchdog
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1123) <idle>-0 [001] .Ns3 21169.031485: sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1125) userstacktrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1126) This option changes the trace. It records a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1127) stacktrace of the current user space thread after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1128) each trace event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1130) sym-userobj
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1131) When user stack traces are enabled, look up which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1132) object the address belongs to, and print a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1133) relative address. This is especially useful when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1134) ASLR is on, otherwise you don't get a chance to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1135) resolve the address to object/file/line after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1136) the app is no longer running.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1138) The lookup is performed when you read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1139) trace or trace_pipe. Example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1141) a.out-1623 [000] 40874.465068: /root/a.out[+0x480] <-/root/a.out[+0x494]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1142) <- /root/a.out[+0x4a8] <- /lib/libc-2.7.so[+0x1e1a6]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1144)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1145) printk-msg-only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1146) When set, trace_printk()s will only show the format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1147) and not their parameters (if trace_bprintk() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1148) trace_bputs() was used to save the trace_printk()).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1149)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1150) context-info
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1151) When disabled, show only the event data. This hides the comm, PID,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1152) timestamp, CPU, and other useful data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1154) latency-format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1155) This option changes the trace output. When it is enabled,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1156) the trace displays additional information about the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1157) latency, as described in "Latency trace format".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1158)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1159) pause-on-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1160) When set, opening the trace file for read will pause
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1161) writing to the ring buffer (as if tracing_on was set to zero).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1162) This simulates the original behavior of the trace file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1163) When the file is closed, tracing will be enabled again.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1164)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1165) record-cmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1166) When any event or tracer is enabled, a hook is enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1167) in the sched_switch trace point to fill the comm cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1168) with mapped pids and comms. But this may cause some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1169) overhead, and if you only care about pids, and not the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1170) name of the task, disabling this option can lower the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1171) impact of tracing. See "saved_cmdlines".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1173) record-tgid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1174) When any event or tracer is enabled, a hook is enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1175) in the sched_switch trace point to fill the cache of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1176) mapped Thread Group IDs (TGID) mapping to pids. See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1177) "saved_tgids".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1178)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1179) overwrite
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1180) This controls what happens when the trace buffer is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1181) full. If "1" (default), the oldest events are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1182) discarded and overwritten. If "0", then the newest
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1183) events are discarded.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1184) (see per_cpu/cpu0/stats for overrun and dropped)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1185)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1186) disable_on_free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1187) When the free_buffer is closed, tracing will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1188) stop (tracing_on set to 0).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1189)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1190) irq-info
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1191) Shows the interrupt, preempt count, and need-resched data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1192) When disabled, the trace looks like::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1193)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1194) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1195) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1196) # entries-in-buffer/entries-written: 144405/9452052 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1197) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1198) # TASK-PID CPU# TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1199) # | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1200) <idle>-0 [002] 23636.756054: ttwu_do_activate.constprop.89 <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1201) <idle>-0 [002] 23636.756054: activate_task <-ttwu_do_activate.constprop.89
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1202) <idle>-0 [002] 23636.756055: enqueue_task <-activate_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1203)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1205) markers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1206) When set, the trace_marker is writable (only by root).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1207) When disabled, the trace_marker will error with EINVAL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1208) on write.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1209)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1210) event-fork
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1211) When set, tasks with PIDs listed in set_event_pid will have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1212) the PIDs of their children added to set_event_pid when those
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1213) tasks fork. Also, when tasks with PIDs in set_event_pid exit,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1214) their PIDs will be removed from the file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1215)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1216) This affects PIDs listed in set_event_notrace_pid as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1217)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1218) function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1219) The latency tracers will enable function tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1220) if this option is enabled (it is by default). When
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1221) it is disabled, the latency tracers do not trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1222) functions. This keeps the overhead of the tracer down
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1223) when performing latency tests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1225) function-fork
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1226) When set, tasks with PIDs listed in set_ftrace_pid will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1227) have the PIDs of their children added to set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1228) when those tasks fork. Also, when tasks with PIDs in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1229) set_ftrace_pid exit, their PIDs will be removed from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1230) file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1231)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1232) This affects PIDs in set_ftrace_notrace_pid as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1233)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1234) display-graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1235) When set, the latency tracers (irqsoff, wakeup, etc) will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1236) use function graph tracing instead of function tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1237)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1238) stacktrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1239) When set, a stack trace is recorded after any trace event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1240) is recorded.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1241)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1242) branch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1243) Enable branch tracing with the tracer. This enables the branch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1244) tracer along with the currently set tracer. Enabling this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1245) with the "nop" tracer is the same as just enabling the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1246) "branch" tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1247)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1248) .. tip:: Some tracers have their own options. They only appear in this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1249) file when the tracer is active. They always appear in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1250) options directory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1251)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1252)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1253) Here are the per-tracer options:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1254)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1255) Options for function tracer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1256)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1257) func_stack_trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1258) When set, a stack trace is recorded after every
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1259) function that is recorded. NOTE! Before enabling this,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1260) limit the functions that are recorded with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1261) "set_ftrace_filter"; otherwise system performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1262) will be critically degraded. Remember to disable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1263) this option before clearing the function filter.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1264)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1265) Options for function_graph tracer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1266)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1267) Since the function_graph tracer has a slightly different output,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1268) it has its own options to control what is displayed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1269)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1270) funcgraph-overrun
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1271) When set, the "overrun" of the graph stack is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1272) displayed after each function traced. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1273) overrun occurs when the stack depth of the calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1274) is greater than what is reserved for each task.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1275) Each task has a fixed array of functions to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1276) trace in the call graph. If the depth of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1277) calls exceeds that, the function is not traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1278) The overrun is the number of functions missed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1279) due to exceeding this array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1280)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1281) funcgraph-cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1282) When set, the CPU number of the CPU where the trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1283) occurred is displayed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1285) funcgraph-overhead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1286) When set, if the function takes longer than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1287) a certain amount, then a delay marker is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1288) displayed. See "delay" above, under the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1289) header description.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1290)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1291) funcgraph-proc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1292) Unlike other tracers, the process' command line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1293) is not displayed by default, but instead only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1294) when a task is traced in and out during a context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1295) switch. Enabling this option has the command
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1296) of each process displayed at every line.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1298) funcgraph-duration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1299) At the end of each function (the return),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1300) the time spent in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1301) function is displayed in microseconds.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1302)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1303) funcgraph-abstime
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1304) When set, the timestamp is displayed at each line.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1305)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1306) funcgraph-irqs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1307) When disabled, functions that happen inside an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1308) interrupt will not be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1309)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1310) funcgraph-tail
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1311) When set, the return event will include the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1312) that it represents. By default this is off, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1313) only a closing curly bracket "}" is displayed for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1314) the return of a function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1315)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1316) sleep-time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1317) When running the function graph tracer, include
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1318) the time a task is scheduled out in its function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1319) When enabled, the time the task has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1320) scheduled out is counted as part of the function call.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1321)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1322) graph-time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1323) When running the function profiler with the function graph tracer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1324) include the time spent in nested functions. When this is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1325) not set, the time reported for the function will only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1326) include the time the function itself executed for, not the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1327) time for functions that it called.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1328)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1329) Options for blk tracer:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1330)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1331) blk_classic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1332) Shows a more minimalistic output.
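
The ordering advice given for func_stack_trace above can be sketched as a
pair of shell helpers. This is only an illustration: the ``TRACEFS``
variable and the helper names ``stack_trace_on`` and ``stack_trace_off``
are assumptions made for this example, not part of ftrace, and the
default path assumes tracefs is mounted at /sys/kernel/tracing.

```shell
# Sketch only. TRACEFS is an assumed variable naming the tracefs mount
# point; adjust it if tracefs is mounted elsewhere on your system.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Hypothetical helper: limit the traced functions first, and only then
# turn on per-function stack traces.
stack_trace_on() {
    # $1: a function name or glob accepted by set_ftrace_filter
    echo "$1" > "$TRACEFS/set_ftrace_filter" &&
    echo function > "$TRACEFS/current_tracer" &&
    echo 1 > "$TRACEFS/options/func_stack_trace"
}

# Hypothetical helper: undo the above in reverse order, so the option is
# already off by the time the function filter is cleared.
stack_trace_off() {
    echo 0 > "$TRACEFS/options/func_stack_trace" &&
    echo > "$TRACEFS/set_ftrace_filter"
}
```

For instance, ``stack_trace_on schedule`` followed later by
``stack_trace_off`` would, under these assumptions, never leave the
option enabled while the filter is empty.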
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1333)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1334)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1335) irqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1336) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1337)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1338) When interrupts are disabled, the CPU cannot react to any other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1339) external event (besides NMIs and SMIs). This prevents the timer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1340) interrupt from triggering or the mouse interrupt from letting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1341) the kernel know of a new mouse event. The result is added
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1342) latency in the reaction time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1343)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1344) The irqsoff tracer tracks the time for which interrupts are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1345) disabled. When a new maximum latency is hit, the tracer saves
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1346) the trace leading up to that latency point. Every time a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1347) new maximum is reached, the old saved trace is discarded and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1348) new trace is saved.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1349)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1350) To reset the maximum, echo 0 into tracing_max_latency. Here is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1351) an example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1352)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1353) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1354) # echo irqsoff > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1355) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1356) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1357) # ls -ltr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1358) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1359) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1360) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1361) # tracer: irqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1362) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1363) # irqsoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1364) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1365) # latency: 16 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1366) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1367) # | task: swapper/0-0 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1368) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1369) # => started at: run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1370) # => ended at: run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1371) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1372) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1373) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1374) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1375) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1376) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1377) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1378) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1379) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1380) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1381) <idle>-0 0d.s2 0us+: _raw_spin_lock_irq <-run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1382) <idle>-0 0dNs3 17us : _raw_spin_unlock_irq <-run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1383) <idle>-0 0dNs3 17us+: trace_hardirqs_on <-run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1384) <idle>-0 0dNs3 25us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1385) => _raw_spin_unlock_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1386) => run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1387) => __do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1388) => call_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1389) => do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1390) => irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1391) => smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1392) => apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1393) => rcu_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1394) => cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1395) => rest_init
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1396) => start_kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1397) => x86_64_start_reservations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1398) => x86_64_start_kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1399)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1400) Here we see that we had a latency of 16 microseconds (which is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1401) very good). The _raw_spin_lock_irq in run_timer_softirq disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1402) interrupts. The difference between the 16 and the displayed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1403) timestamp 25us occurred because the clock was incremented
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1404) between the time of recording the max latency and the time of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1405) recording the function that had that latency.
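
The command sequence from the example above can be wrapped in a small
shell function, so that the same measurement can be repeated around any
workload. This is a sketch under stated assumptions: the ``TRACEFS``
variable and the name ``irqsoff_max_latency`` are hypothetical, and the
default path assumes tracefs is mounted at /sys/kernel/tracing.

```shell
# Sketch only. TRACEFS is an assumed variable naming the tracefs mount
# point.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"

# Hypothetical helper: run the given command under the irqsoff tracer
# and print the maximum latency recorded while it ran.
irqsoff_max_latency() {
    # Same steps, in the same order, as the example above.
    echo 0 > "$TRACEFS/options/function-trace" &&
    echo irqsoff > "$TRACEFS/current_tracer" &&
    echo 1 > "$TRACEFS/tracing_on" &&
    echo 0 > "$TRACEFS/tracing_max_latency" || return 1

    # The workload to measure; its own output is discarded.
    "$@" > /dev/null 2>&1

    echo 0 > "$TRACEFS/tracing_on" || return 1
    cat "$TRACEFS/tracing_max_latency"
}
```

Called as ``irqsoff_max_latency ls -ltr``, it would mirror the session
shown above; the saved trace itself can still be read from the trace
file afterwards.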
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1406)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1407) Note the above example had function-trace not set. If we set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1408) function-trace, we get a much larger output::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1409)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1410) with echo 1 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1411)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1412) # tracer: irqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1413) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1414) # irqsoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1415) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1416) # latency: 71 us, #168/168, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1417) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1418) # | task: bash-2042 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1419) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1420) # => started at: ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1421) # => ended at: ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1422) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1423) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1424) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1425) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1426) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1427) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1428) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1429) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1430) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1431) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1432) bash-2042 3d... 0us : _raw_spin_lock_irqsave <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1433) bash-2042 3d... 0us : add_preempt_count <-_raw_spin_lock_irqsave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1434) bash-2042 3d..1 1us : ata_scsi_find_dev <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1435) bash-2042 3d..1 1us : __ata_scsi_find_dev <-ata_scsi_find_dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1436) bash-2042 3d..1 2us : ata_find_dev.part.14 <-__ata_scsi_find_dev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1437) bash-2042 3d..1 2us : ata_qc_new_init <-__ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1438) bash-2042 3d..1 3us : ata_sg_init <-__ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1439) bash-2042 3d..1 4us : ata_scsi_rw_xlat <-__ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1440) bash-2042 3d..1 4us : ata_build_rw_tf <-ata_scsi_rw_xlat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1441) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1442) bash-2042 3d..1 67us : delay_tsc <-__delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1443) bash-2042 3d..1 67us : add_preempt_count <-delay_tsc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1444) bash-2042 3d..2 67us : sub_preempt_count <-delay_tsc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1445) bash-2042 3d..1 67us : add_preempt_count <-delay_tsc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1446) bash-2042 3d..2 68us : sub_preempt_count <-delay_tsc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1447) bash-2042 3d..1 68us+: ata_bmdma_start <-ata_bmdma_qc_issue
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1448) bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1449) bash-2042 3d..1 71us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1450) bash-2042 3d..1 72us+: trace_hardirqs_on <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1451) bash-2042 3d..1 120us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1452) => _raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1453) => ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1454) => scsi_dispatch_cmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1455) => scsi_request_fn
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1456) => __blk_run_queue_uncond
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1457) => __blk_run_queue
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1458) => blk_queue_bio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1459) => submit_bio_noacct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1460) => submit_bio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1461) => submit_bh
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1462) => __ext3_get_inode_loc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1463) => ext3_iget
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1464) => ext3_lookup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1465) => lookup_real
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1466) => __lookup_hash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1467) => walk_component
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1468) => lookup_last
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1469) => path_lookupat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1470) => filename_lookup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1471) => user_path_at_empty
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1472) => user_path_at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1473) => vfs_fstatat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1474) => vfs_stat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1475) => sys_newstat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1476) => system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1477)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1478)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1479) Here we traced a 71 microsecond latency. But we also see all the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1480) functions that were called during that time. Note that by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1481) enabling function tracing, we incur an added overhead. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1482) overhead may extend the latency times. But nevertheless, this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1483) trace has provided some very helpful debugging information.
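
To compare runs with and without function-trace, it can help to pull
just the latency figure out of the trace header. The following is a
minimal sketch; the function name ``trace_latency_us`` is hypothetical,
and it assumes the header keeps the "# latency: N us, ..." layout shown
in the examples above.

```shell
# Hypothetical helper: read a latency trace on stdin and print the
# number from its "# latency: N us, ..." header line.
trace_latency_us() {
    # The header line splits into "#", "latency:", "N", "us,", ...
    # so the third field is the latency value.
    awk '$1 == "#" && $2 == "latency:" { print $3; exit }'
}
```

It would typically be used as ``trace_latency_us < trace`` from inside
the tracing directory.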
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1484)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1485) If we prefer function graph output instead of function output, we can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1486) set the display-graph option::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1487)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1488) with echo 1 > options/display-graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1489)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1490) # tracer: irqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1491) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1492) # irqsoff latency trace v1.1.5 on 4.20.0-rc6+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1493) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1494) # latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1495) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1496) # | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1497) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1498) # => started at: free_debug_processing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1499) # => ended at: return_to_handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1500) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1501) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1502) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1503) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1504) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1505) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1506) # ||| /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1507) # REL TIME CPU TASK/PID |||| DURATION FUNCTION CALLS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1508) # | | | | |||| | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1509) 0 us | 0) bash-1507 | d... | 0.000 us | _raw_spin_lock_irqsave();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1510) 0 us | 0) bash-1507 | d..1 | 0.378 us | do_raw_spin_trylock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1511) 1 us | 0) bash-1507 | d..2 | | set_track() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1512) 2 us | 0) bash-1507 | d..2 | | save_stack_trace() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1513) 2 us | 0) bash-1507 | d..2 | | __save_stack_trace() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1514) 3 us | 0) bash-1507 | d..2 | | __unwind_start() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1515) 3 us | 0) bash-1507 | d..2 | | get_stack_info() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1516) 3 us | 0) bash-1507 | d..2 | 0.351 us | in_task_stack();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1517) 4 us | 0) bash-1507 | d..2 | 1.107 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1518) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1519) 3750 us | 0) bash-1507 | d..1 | 0.516 us | do_raw_spin_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1520) 3750 us | 0) bash-1507 | d..1 | 0.000 us | _raw_spin_unlock_irqrestore();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1521) 3764 us | 0) bash-1507 | d..1 | 0.000 us | tracer_hardirqs_on();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1522) bash-1507 0d..1 3792us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1523) => free_debug_processing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1524) => __slab_free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1525) => kmem_cache_free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1526) => vm_area_free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1527) => remove_vma
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1528) => exit_mmap
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1529) => mmput
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1530) => begin_new_exec
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1531) => load_elf_binary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1532) => search_binary_handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1533) => __do_execve_file.isra.32
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1534) => __x64_sys_execve
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1535) => do_syscall_64
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1536) => entry_SYSCALL_64_after_hwframe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1537)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1538) preemptoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1539) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1540)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1541) When preemption is disabled, we may be able to receive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1542) interrupts but the task cannot be preempted and a higher
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1543) priority task must wait for preemption to be enabled again
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1544) before it can preempt a lower priority task.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1545)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1546) The preemptoff tracer traces the places that disable preemption.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1547) Like the irqsoff tracer, it records the maximum latency for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1548) which preemption was disabled. The preemptoff tracer is controlled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1549) much like the irqsoff tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1550) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1551)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1552) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1553) # echo preemptoff > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1554) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1555) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1556) # ls -ltr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1557) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1558) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1559) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1560) # tracer: preemptoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1561) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1562) # preemptoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1563) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1564) # latency: 46 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1565) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1566) # | task: sshd-1991 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1567) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1568) # => started at: do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1569) # => ended at: do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1570) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1571) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1572) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1573) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1574) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1575) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1576) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1577) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1578) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1579) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1580) sshd-1991 1d.h. 0us+: irq_enter <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1581) sshd-1991 1d..1 46us : irq_exit <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1582) sshd-1991 1d..1 47us+: trace_preempt_on <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1583) sshd-1991 1d..1 52us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1584) => sub_preempt_count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1585) => irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1586) => do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1587) => ret_from_intr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1588)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1589)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1590) This has some more changes. Preemption was disabled when an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1591) interrupt came in (notice the 'h'), and was enabled on exit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1592) But we also see that interrupts have been disabled when entering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1593) the preempt off section and leaving it (the 'd'). We do not know if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1594) interrupts were enabled in the meantime or shortly after this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1595) was over.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1596) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1597)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1598) # tracer: preemptoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1599) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1600) # preemptoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1601) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1602) # latency: 83 us, #241/241, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1603) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1604) # | task: bash-1994 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1605) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1606) # => started at: wake_up_new_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1607) # => ended at: task_rq_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1608) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1609) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1610) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1611) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1612) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1613) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1614) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1615) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1616) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1617) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1618) bash-1994 1d..1 0us : _raw_spin_lock_irqsave <-wake_up_new_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1619) bash-1994 1d..1 0us : select_task_rq_fair <-select_task_rq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1620) bash-1994 1d..1 1us : __rcu_read_lock <-select_task_rq_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1621) bash-1994 1d..1 1us : source_load <-select_task_rq_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1622) bash-1994 1d..1 1us : source_load <-select_task_rq_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1623) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1624) bash-1994 1d..1 12us : irq_enter <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1625) bash-1994 1d..1 12us : rcu_irq_enter <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1626) bash-1994 1d..1 13us : add_preempt_count <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1627) bash-1994 1d.h1 13us : exit_idle <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1628) bash-1994 1d.h1 13us : hrtimer_interrupt <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1629) bash-1994 1d.h1 13us : _raw_spin_lock <-hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1630) bash-1994 1d.h1 14us : add_preempt_count <-_raw_spin_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1631) bash-1994 1d.h2 14us : ktime_get_update_offsets <-hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1632) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1633) bash-1994 1d.h1 35us : lapic_next_event <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1634) bash-1994 1d.h1 35us : irq_exit <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1635) bash-1994 1d.h1 36us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1636) bash-1994 1d..2 36us : do_softirq <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1637) bash-1994 1d..2 36us : __do_softirq <-call_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1638) bash-1994 1d..2 36us : __local_bh_disable <-__do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1639) bash-1994 1d.s2 37us : add_preempt_count <-_raw_spin_lock_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1640) bash-1994 1d.s3 38us : _raw_spin_unlock <-run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1641) bash-1994 1d.s3 39us : sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1642) bash-1994 1d.s2 39us : call_timer_fn <-run_timer_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1643) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1644) bash-1994 1dNs2 81us : cpu_needs_another_gp <-rcu_process_callbacks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1645) bash-1994 1dNs2 82us : __local_bh_enable <-__do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1646) bash-1994 1dNs2 82us : sub_preempt_count <-__local_bh_enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1647) bash-1994 1dN.2 82us : idle_cpu <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1648) bash-1994 1dN.2 83us : rcu_irq_exit <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1649) bash-1994 1dN.2 83us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1650) bash-1994 1.N.1 84us : _raw_spin_unlock_irqrestore <-task_rq_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1651) bash-1994 1.N.1 84us+: trace_preempt_on <-task_rq_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1652) bash-1994 1.N.1 104us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1653) => sub_preempt_count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1654) => _raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1655) => task_rq_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1656) => wake_up_new_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1657) => do_fork
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1658) => sys_clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1659) => stub_clone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1660)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1661)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1662) The above is an example of the preemptoff trace with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1663) function-trace set. Here we see that interrupts were not disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1664) the entire time. The irq_enter code lets us know that we entered
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1665) an interrupt 'h'. Before that, the functions being traced still
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1666) show that it is not in an interrupt, but we can see from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1667) functions themselves that this is not the case.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1668)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1669) preemptirqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1670) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1671)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1672) Knowing the locations that have interrupts disabled or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1673) preemption disabled for the longest times is helpful. But
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1674) sometimes we would like to know how long preemption or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1675) interrupts, or both, are disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1676)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1677) Consider the following code::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1678)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1679) local_irq_disable();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1680) call_function_with_irqs_off();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1681) preempt_disable();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1682) call_function_with_irqs_and_preemption_off();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1683) local_irq_enable();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1684) call_function_with_preemption_off();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1685) preempt_enable();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1686)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1687) The irqsoff tracer will record the total length of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1688) call_function_with_irqs_off() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1689) call_function_with_irqs_and_preemption_off().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1690)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1691) The preemptoff tracer will record the total length of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1692) call_function_with_irqs_and_preemption_off() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1693) call_function_with_preemption_off().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1694)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1695) But neither will trace the total time that interrupts and/or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1696) preemption are disabled. This total time is the time during
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1697) which we cannot schedule. To record this time, use the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1698) preemptirqsoff tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1699)
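As a rough illustration of how the three windows relate, the following
shell sketch plugs made-up durations into the code above. The variable
names and the microsecond values are assumptions chosen only to show the
arithmetic; they are not measurements.

```shell
# Hypothetical durations, in microseconds, of the three calls in the
# example above. These numbers are invented for illustration only.
irqs_off_only=30         # call_function_with_irqs_off()
both_off=50              # call_function_with_irqs_and_preemption_off()
preempt_off_only=20      # call_function_with_preemption_off()

# irqsoff covers the span where interrupts are disabled.
irqsoff_window=$((irqs_off_only + both_off))
# preemptoff covers the span where preemption is disabled.
preemptoff_window=$((both_off + preempt_off_only))
# preemptirqsoff covers the span where either one is disabled.
preemptirqsoff_window=$((irqs_off_only + both_off + preempt_off_only))

echo "irqsoff=${irqsoff_window}us preemptoff=${preemptoff_window}us preemptirqsoff=${preemptirqsoff_window}us"
```
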
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1700) Again, using this trace is much like the irqsoff and preemptoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1701) tracers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1702) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1703)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1704) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1705) # echo preemptirqsoff > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1706) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1707) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1708) # ls -ltr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1709) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1710) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1711) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1712) # tracer: preemptirqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1713) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1714) # preemptirqsoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1715) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1716) # latency: 100 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1717) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1718) # | task: ls-2230 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1719) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1720) # => started at: ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1721) # => ended at: ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1722) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1723) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1724) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1725) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1726) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1727) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1728) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1729) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1730) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1731) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1732) ls-2230 3d... 0us+: _raw_spin_lock_irqsave <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1733) ls-2230 3...1 100us : _raw_spin_unlock_irqrestore <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1734) ls-2230 3...1 101us+: trace_preempt_on <-ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1735) ls-2230 3...1 111us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1736) => sub_preempt_count
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1737) => _raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1738) => ata_scsi_queuecmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1739) => scsi_dispatch_cmd
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1740) => scsi_request_fn
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1741) => __blk_run_queue_uncond
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1742) => __blk_run_queue
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1743) => blk_queue_bio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1744) => submit_bio_noacct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1745) => submit_bio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1746) => submit_bh
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1747) => ext3_bread
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1748) => ext3_dir_bread
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1749) => htree_dirblock_to_tree
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1750) => ext3_htree_fill_tree
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1751) => ext3_readdir
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1752) => vfs_readdir
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1753) => sys_getdents
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1754) => system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1755)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1756)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1757) The trace_hardirqs_off_thunk is called from assembly on x86 when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1758) interrupts are disabled in the assembly code. Without the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1759) function tracing, we do not know if interrupts were enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1760) within the preemption points. We do see that it started with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1761) preemption enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1762)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1763) Here is a trace with function-trace set::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1764)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1765) # tracer: preemptirqsoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1766) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1767) # preemptirqsoff latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1768) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1769) # latency: 161 us, #339/339, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1770) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1771) # | task: ls-2269 (uid:0 nice:0 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1772) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1773) # => started at: schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1774) # => ended at: mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1775) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1776) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1777) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1778) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1779) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1780) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1781) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1782) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1783) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1784) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1785) kworker/-59 3...1 0us : __schedule <-schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1786) kworker/-59 3d..1 0us : rcu_preempt_qs <-rcu_note_context_switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1787) kworker/-59 3d..1 1us : add_preempt_count <-_raw_spin_lock_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1788) kworker/-59 3d..2 1us : deactivate_task <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1789) kworker/-59 3d..2 1us : dequeue_task <-deactivate_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1790) kworker/-59 3d..2 2us : update_rq_clock <-dequeue_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1791) kworker/-59 3d..2 2us : dequeue_task_fair <-dequeue_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1792) kworker/-59 3d..2 2us : update_curr <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1793) kworker/-59 3d..2 2us : update_min_vruntime <-update_curr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1794) kworker/-59 3d..2 3us : cpuacct_charge <-update_curr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1795) kworker/-59 3d..2 3us : __rcu_read_lock <-cpuacct_charge
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1796) kworker/-59 3d..2 3us : __rcu_read_unlock <-cpuacct_charge
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1797) kworker/-59 3d..2 3us : update_cfs_rq_blocked_load <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1798) kworker/-59 3d..2 4us : clear_buddies <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1799) kworker/-59 3d..2 4us : account_entity_dequeue <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1800) kworker/-59 3d..2 4us : update_min_vruntime <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1801) kworker/-59 3d..2 4us : update_cfs_shares <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1802) kworker/-59 3d..2 5us : hrtick_update <-dequeue_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1803) kworker/-59 3d..2 5us : wq_worker_sleeping <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1804) kworker/-59 3d..2 5us : kthread_data <-wq_worker_sleeping
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1805) kworker/-59 3d..2 5us : put_prev_task_fair <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1806) kworker/-59 3d..2 6us : pick_next_task_fair <-pick_next_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1807) kworker/-59 3d..2 6us : clear_buddies <-pick_next_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1808) kworker/-59 3d..2 6us : set_next_entity <-pick_next_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1809) kworker/-59 3d..2 6us : update_stats_wait_end <-set_next_entity
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1810) ls-2269 3d..2 7us : finish_task_switch <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1811) ls-2269 3d..2 7us : _raw_spin_unlock_irq <-finish_task_switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1812) ls-2269 3d..2 8us : do_IRQ <-ret_from_intr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1813) ls-2269 3d..2 8us : irq_enter <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1814) ls-2269 3d..2 8us : rcu_irq_enter <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1815) ls-2269 3d..2 9us : add_preempt_count <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1816) ls-2269 3d.h2 9us : exit_idle <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1817) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1818) ls-2269 3d.h3 20us : sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1819) ls-2269 3d.h2 20us : irq_exit <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1820) ls-2269 3d.h2 21us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1821) ls-2269 3d..3 21us : do_softirq <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1822) ls-2269 3d..3 21us : __do_softirq <-call_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1823) ls-2269 3d..3 21us+: __local_bh_disable <-__do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1824) ls-2269 3d.s4 29us : sub_preempt_count <-_local_bh_enable_ip
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1825) ls-2269 3d.s5 29us : sub_preempt_count <-_local_bh_enable_ip
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1826) ls-2269 3d.s5 31us : do_IRQ <-ret_from_intr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1827) ls-2269 3d.s5 31us : irq_enter <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1828) ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1829) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1830) ls-2269 3d.s5 31us : rcu_irq_enter <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1831) ls-2269 3d.s5 32us : add_preempt_count <-irq_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1832) ls-2269 3d.H5 32us : exit_idle <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1833) ls-2269 3d.H5 32us : handle_irq <-do_IRQ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1834) ls-2269 3d.H5 32us : irq_to_desc <-handle_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1835) ls-2269 3d.H5 33us : handle_fasteoi_irq <-handle_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1836) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1837) ls-2269 3d.s5 158us : _raw_spin_unlock_irqrestore <-rtl8139_poll
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1838) ls-2269 3d.s3 158us : net_rps_action_and_irq_enable.isra.65 <-net_rx_action
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1839) ls-2269 3d.s3 159us : __local_bh_enable <-__do_softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1840) ls-2269 3d.s3 159us : sub_preempt_count <-__local_bh_enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1841) ls-2269 3d..3 159us : idle_cpu <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1842) ls-2269 3d..3 159us : rcu_irq_exit <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1843) ls-2269 3d..3 160us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1844) ls-2269 3d... 161us : __mutex_unlock_slowpath <-mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1845) ls-2269 3d... 162us+: trace_hardirqs_on <-mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1846) ls-2269 3d... 186us : <stack trace>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1847) => __mutex_unlock_slowpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1848) => mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1849) => process_output
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1850) => n_tty_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1851) => tty_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1852) => vfs_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1853) => sys_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1854) => system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1855)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1856) This is an interesting trace. It started with kworker running and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1857) scheduling out and ls taking over. But as soon as ls released the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1858) rq lock and enabled interrupts (but not preemption) an interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1859) triggered. When the interrupt finished, it started running softirqs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1860) But while the softirq was running, another interrupt triggered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1861) When an interrupt is running inside a softirq, the annotation is 'H'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1862)
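The column header of the trace (CPU#, irqs-off, need-resched,
hardirq/softirq, preempt-depth) can be applied mechanically to a field
such as ``3d.H5``. The helper below is a minimal sketch of that decoding.
The function name ``decode_flags`` is hypothetical, and it only
translates the characters that appear in the traces in this section; any
other character is passed through unchanged.

```shell
# decode_flags FIELD
# FIELD is the CPU number immediately followed by the four flag
# characters, as in "3d.H5" from the trace above.
decode_flags() {
    field=$1
    cpu=${field%????}        # everything except the last four characters
    flags=${field#"$cpu"}    # the four flag characters

    irq=${flags%???}         # 1st flag: irqs-off
    rest=${flags#?}
    resched=${rest%??}       # 2nd flag: need-resched
    rest=${rest#?}
    ctx=${rest%?}            # 3rd flag: hardirq/softirq
    depth=${rest#?}          # 4th flag: preempt-depth

    case $irq in
        d) irq=yes ;;
        .) irq=no ;;
    esac
    case $resched in
        N) resched=yes ;;
        .) resched=no ;;
    esac
    case $ctx in
        h) ctx=hardirq ;;
        s) ctx=softirq ;;
        H) ctx=hardirq-in-softirq ;;
        .) ctx=none ;;
    esac
    case $depth in
        .) depth=0 ;;
    esac

    echo "cpu=$cpu irqs-off=$irq need-resched=$resched context=$ctx preempt-depth=$depth"
}

decode_flags 3d.H5    # hard interrupt that arrived while a softirq was running
decode_flags 3d.s5    # softirq context
```
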
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1863)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1864) wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1865) ------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1866)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1867) One common case that people are interested in tracing is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1868) time it takes for a task that is woken to actually wake up.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1869) Now for non-Real-Time tasks, this can be arbitrary. But tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1870) it nonetheless can be interesting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1871)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1872) Without function tracing::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1873)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1874) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1875) # echo wakeup > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1876) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1877) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1878) # chrt -f 5 sleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1879) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1880) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1881) # tracer: wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1882) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1883) # wakeup latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1884) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1885) # latency: 15 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1886) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1887) # | task: kworker/3:1H-312 (uid:0 nice:-20 policy:0 rt_prio:0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1888) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1889) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1890) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1891) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1892) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1893) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1894) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1895) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1896) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1897) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1898) <idle>-0 3dNs7 0us : 0:120:R + [003] 312:100:R kworker/3:1H
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1899) <idle>-0 3dNs7 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1900) <idle>-0 3d..3 15us : __schedule <-schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1901) <idle>-0 3d..3 15us : 0:120:R ==> [003] 312:100:R kworker/3:1H
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1902)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1903) The tracer only traces the highest priority task in the system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1904) to avoid tracing the normal circumstances. Here we see that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1905) the kworker with a nice priority of -20 (not very nice) took
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1906) just 15 microseconds from the time it was woken to the time it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1907) ran.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1908)
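The 15 microseconds quoted above is the value on the ``# latency:`` line
of the trace header. When comparing many runs it can be handy to pull
that number out of the output. The following is a small sketch, assuming
the header keeps the layout shown in this document. The function name
``trace_latency_us`` is hypothetical.

```shell
# trace_latency_us: read trace output on stdin and print the value of
# the first "# latency: N us" header line.
trace_latency_us() {
    sed -n 's/^#[[:space:]]*latency:[[:space:]]*\([0-9][0-9]*\)[[:space:]]*us.*/\1/p' | head -n 1
}

# Usage with the header line from the trace above; on a live system the
# input would instead be the "trace" file itself.
printf '%s\n' '# latency: 15 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)' | trace_latency_us
```
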
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1909) Non-Real-Time tasks are not that interesting. A more interesting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1910) trace is to concentrate only on Real-Time tasks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1911)
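The command sequences used with the latency tracers above all follow the
same pattern: clear function-trace, select the tracer, enable tracing,
reset the recorded maximum, run a workload, disable tracing and read the
trace. A minimal sketch of that pattern as a reusable shell function
follows. The function ``run_latency_trace`` and its arguments are
hypothetical, and the tracing directory is passed in explicitly because
its location depends on where tracefs is mounted on the system at hand.

```shell
# run_latency_trace DIR TRACER COMMAND [ARGS...]
# DIR is the tracing directory holding the control files used above;
# TRACER is a tracer name such as wakeup or preemptirqsoff.
# Hypothetical helper for illustration; needs root on a real system.
run_latency_trace() {
    dir=$1
    tracer=$2
    shift 2

    echo 0 > "$dir/options/function-trace" || return 1
    echo "$tracer" > "$dir/current_tracer" || return 1
    echo 1 > "$dir/tracing_on" || return 1
    echo 0 > "$dir/tracing_max_latency" || return 1   # reset the recorded maximum

    "$@"                                              # workload to measure
    status=$?

    echo 0 > "$dir/tracing_on"
    cat "$dir/trace"
    return "$status"
}

# Example invocation (not executed here):
#   run_latency_trace /sys/kernel/tracing wakeup chrt -f 5 sleep 1
```
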
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1912) wakeup_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1913) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1914)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1915) In a Real-Time environment it is very important to know the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1916) time from when the highest priority task is woken up to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1917) time that it executes. This is also known as "schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1918) latency". I stress the point that this is about RT tasks. It is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1919) also important to know the scheduling latency of non-RT tasks,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1920) but the average schedule latency is a better measure for them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1921) Tools like LatencyTop are more appropriate for such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1922) measurements.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1923)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1924) Real-Time environments are interested in the worst case latency.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1925) That is the longest latency it takes for something to happen,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1926) and not the average. We can have a very fast scheduler that may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1927) only have a large latency once in a while, but that would not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1928) work well with Real-Time tasks. The wakeup_rt tracer was designed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1929) to record the worst case wakeups of RT tasks. Non-RT tasks are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1930) not recorded because the tracer only records one worst case and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1931) tracing non-RT tasks that are unpredictable will overwrite the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1932) worst case latency of RT tasks (just run the normal wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1933) tracer for a while to see that effect).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1934)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1935) Since this tracer only deals with RT tasks, we will run this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1936) slightly differently than we did with the previous tracers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1937) Instead of performing an 'ls', we will run 'sleep 1' under
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1938) 'chrt' which changes the priority of the task.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1939) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1940)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1941) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1942) # echo wakeup_rt > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1943) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1944) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1945) # chrt -f 5 sleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1946) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1947) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1950) # tracer: wakeup_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1951) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1952) # wakeup_rt latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1953) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1954) # latency: 5 us, #4/4, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1955) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1956) # | task: sleep-2389 (uid:0 nice:0 policy:1 rt_prio:5)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1957) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1958) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1959) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1960) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1961) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1962) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1963) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1964) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1965) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1966) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1967) <idle>-0 3d.h4 0us : 0:120:R + [003] 2389: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1968) <idle>-0 3d.h4 1us+: ttwu_do_activate.constprop.87 <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1969) <idle>-0 3d..3 5us : __schedule <-schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1970) <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1971)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1972)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1973) Running this on an idle system, we see that it only took 5 microseconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1974) to perform the task switch. Note, since the trace point in the schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1975) is before the actual "switch", we stop the tracing when the recorded task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1976) is about to schedule in. This may change if we add a new marker at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1977) end of the scheduler.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1978)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1979) Notice that the recorded task is 'sleep' with the PID of 2389
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1980) and it has an rt_prio of 5. This priority is user-space priority
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1981) and not the internal kernel priority. The policy is 1 for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1982) SCHED_FIFO and 2 for SCHED_RR.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1983)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1984) Note that the trace data shows the internal priority (99 - rtprio).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1985) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1986)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1987) <idle>-0 3d..3 5us : 0:120:R ==> [003] 2389: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1988)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1989) The 0:120:R means idle was running with a nice priority of 0 (120 - 120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1990) and in the running state 'R'. The sleep task was scheduled in with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1991) 2389: 94:R. That is, the priority shown is the kernel rtprio (99 - 5 = 94),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1992) and it too is in the running state.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1993)
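The two conversions used above can be written down as a short shell
sketch. The function names are hypothetical. The real-time formula
(99 - rtprio) is the one stated in the text, while the formula for
normal tasks (120 + nice) is inferred from the traces above, where nice 0
is shown as 120 and nice -20 as 100.

```shell
# Priority as shown in the trace for a real-time task: 99 - rt_prio.
rt_prio_to_trace_prio() {
    echo $((99 - $1))
}

# Priority as shown in the trace for a normal task: 120 + nice.
nice_to_trace_prio() {
    echo $((120 + $1))
}

rt_prio_to_trace_prio 5     # sleep run under "chrt -f 5", shown as 2389: 94:R
nice_to_trace_prio 0        # idle, shown as 0:120:R
nice_to_trace_prio -20      # kworker/3:1H, shown as 312:100:R
```
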
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1994) Doing the same with chrt -r 5 and function-trace set.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1995) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1996)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1997) # echo 1 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1998)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1999) # tracer: wakeup_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2000) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2001) # wakeup_rt latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2002) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2003) # latency: 29 us, #85/85, CPU#3 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2004) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2005) # | task: sleep-2448 (uid:0 nice:0 policy:1 rt_prio:5)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2006) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2007) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2008) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2009) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2010) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2011) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2012) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2013) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2014) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2015) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2016) <idle>-0 3d.h4 1us+: 0:120:R + [003] 2448: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2017) <idle>-0 3d.h4 2us : ttwu_do_activate.constprop.87 <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2018) <idle>-0 3d.h3 3us : check_preempt_curr <-ttwu_do_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2019) <idle>-0 3d.h3 3us : resched_curr <-check_preempt_curr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2020) <idle>-0 3dNh3 4us : task_woken_rt <-ttwu_do_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2021) <idle>-0 3dNh3 4us : _raw_spin_unlock <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2022) <idle>-0 3dNh3 4us : sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2023) <idle>-0 3dNh2 5us : ttwu_stat <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2024) <idle>-0 3dNh2 5us : _raw_spin_unlock_irqrestore <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2025) <idle>-0 3dNh2 6us : sub_preempt_count <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2026) <idle>-0 3dNh1 6us : _raw_spin_lock <-__run_hrtimer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2027) <idle>-0 3dNh1 6us : add_preempt_count <-_raw_spin_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2028) <idle>-0 3dNh2 7us : _raw_spin_unlock <-hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2029) <idle>-0 3dNh2 7us : sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2030) <idle>-0 3dNh1 7us : tick_program_event <-hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2031) <idle>-0 3dNh1 7us : clockevents_program_event <-tick_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2032) <idle>-0 3dNh1 8us : ktime_get <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2033) <idle>-0 3dNh1 8us : lapic_next_event <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2034) <idle>-0 3dNh1 8us : irq_exit <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2035) <idle>-0 3dNh1 9us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2036) <idle>-0 3dN.2 9us : idle_cpu <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2037) <idle>-0 3dN.2 9us : rcu_irq_exit <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2038) <idle>-0 3dN.2 10us : rcu_eqs_enter_common.isra.45 <-rcu_irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2039) <idle>-0 3dN.2 10us : sub_preempt_count <-irq_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2040) <idle>-0 3.N.1 11us : rcu_idle_exit <-cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2041) <idle>-0 3dN.1 11us : rcu_eqs_exit_common.isra.43 <-rcu_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2042) <idle>-0 3.N.1 11us : tick_nohz_idle_exit <-cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2043) <idle>-0 3dN.1 12us : menu_hrtimer_cancel <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2044) <idle>-0 3dN.1 12us : ktime_get <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2045) <idle>-0 3dN.1 12us : tick_do_update_jiffies64 <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2046) <idle>-0 3dN.1 13us : cpu_load_update_nohz <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2047) <idle>-0 3dN.1 13us : _raw_spin_lock <-cpu_load_update_nohz
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2048) <idle>-0 3dN.1 13us : add_preempt_count <-_raw_spin_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2049) <idle>-0 3dN.2 13us : __cpu_load_update <-cpu_load_update_nohz
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2050) <idle>-0 3dN.2 14us : sched_avg_update <-__cpu_load_update
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2051) <idle>-0 3dN.2 14us : _raw_spin_unlock <-cpu_load_update_nohz
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2052) <idle>-0 3dN.2 14us : sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2053) <idle>-0 3dN.1 15us : calc_load_nohz_stop <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2054) <idle>-0 3dN.1 15us : touch_softlockup_watchdog <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2055) <idle>-0 3dN.1 15us : hrtimer_cancel <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2056) <idle>-0 3dN.1 15us : hrtimer_try_to_cancel <-hrtimer_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2057) <idle>-0 3dN.1 16us : lock_hrtimer_base.isra.18 <-hrtimer_try_to_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2058) <idle>-0 3dN.1 16us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2059) <idle>-0 3dN.1 16us : add_preempt_count <-_raw_spin_lock_irqsave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2060) <idle>-0 3dN.2 17us : __remove_hrtimer <-remove_hrtimer.part.16
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2061) <idle>-0 3dN.2 17us : hrtimer_force_reprogram <-__remove_hrtimer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2062) <idle>-0 3dN.2 17us : tick_program_event <-hrtimer_force_reprogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2063) <idle>-0 3dN.2 18us : clockevents_program_event <-tick_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2064) <idle>-0 3dN.2 18us : ktime_get <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2065) <idle>-0 3dN.2 18us : lapic_next_event <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2066) <idle>-0 3dN.2 19us : _raw_spin_unlock_irqrestore <-hrtimer_try_to_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2067) <idle>-0 3dN.2 19us : sub_preempt_count <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2068) <idle>-0 3dN.1 19us : hrtimer_forward <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2069) <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2070) <idle>-0 3dN.1 20us : ktime_add_safe <-hrtimer_forward
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2071) <idle>-0 3dN.1 20us : hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2072) <idle>-0 3dN.1 20us : __hrtimer_start_range_ns <-hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2073) <idle>-0 3dN.1 21us : lock_hrtimer_base.isra.18 <-__hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2074) <idle>-0 3dN.1 21us : _raw_spin_lock_irqsave <-lock_hrtimer_base.isra.18
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2075) <idle>-0 3dN.1 21us : add_preempt_count <-_raw_spin_lock_irqsave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2076) <idle>-0 3dN.2 22us : ktime_add_safe <-__hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2077) <idle>-0 3dN.2 22us : enqueue_hrtimer <-__hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2078) <idle>-0 3dN.2 22us : tick_program_event <-__hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2079) <idle>-0 3dN.2 23us : clockevents_program_event <-tick_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2080) <idle>-0 3dN.2 23us : ktime_get <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2081) <idle>-0 3dN.2 23us : lapic_next_event <-clockevents_program_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2082) <idle>-0 3dN.2 24us : _raw_spin_unlock_irqrestore <-__hrtimer_start_range_ns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2083) <idle>-0 3dN.2 24us : sub_preempt_count <-_raw_spin_unlock_irqrestore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2084) <idle>-0 3dN.1 24us : account_idle_ticks <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2085) <idle>-0 3dN.1 24us : account_idle_time <-account_idle_ticks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2086) <idle>-0 3.N.1 25us : sub_preempt_count <-cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2087) <idle>-0 3.N.. 25us : schedule <-cpu_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2088) <idle>-0 3.N.. 25us : __schedule <-preempt_schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2089) <idle>-0 3.N.. 26us : add_preempt_count <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2090) <idle>-0 3.N.1 26us : rcu_note_context_switch <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2091) <idle>-0 3.N.1 26us : rcu_sched_qs <-rcu_note_context_switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2092) <idle>-0 3dN.1 27us : rcu_preempt_qs <-rcu_note_context_switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2093) <idle>-0 3.N.1 27us : _raw_spin_lock_irq <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2094) <idle>-0 3dN.1 27us : add_preempt_count <-_raw_spin_lock_irq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2095) <idle>-0 3dN.2 28us : put_prev_task_idle <-__schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2096) <idle>-0 3dN.2 28us : pick_next_task_stop <-pick_next_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2097) <idle>-0 3dN.2 28us : pick_next_task_rt <-pick_next_task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2098) <idle>-0 3dN.2 29us : dequeue_pushable_task <-pick_next_task_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2099) <idle>-0 3d..3 29us : __schedule <-preempt_schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2100) <idle>-0 3d..3 30us : 0:120:R ==> [003] 2448: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2101)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2102) This isn't that big of a trace, even with function tracing enabled,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2103) so I included the entire trace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2104)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2105) The interrupt went off while the system was idle. Somewhere
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2106) before task_woken_rt() was called, the NEED_RESCHED flag was set;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2107) this is indicated by the first occurrence of the 'N' flag.
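
The flag columns described in the trace header (CPU#, irqs-off,
need-resched, hardirq/softirq, preempt-depth) can also be picked apart
by a program. The following is only a minimal sketch: the structure and
the function name decode_lat_flags() are hypothetical, and it assumes the
field looks like "3dNh3", that is, a CPU number followed by exactly four
flag characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the decoded columns (not a kernel structure). */
struct lat_flags {
	int cpu;            /* leading digits: CPU# */
	int irqs_off;       /* 'd' means interrupts were disabled */
	int need_resched;   /* anything other than '.' (for example 'N') */
	char irq_context;   /* 'h', 's', 'H', ... or '.', copied verbatim */
	int preempt_depth;  /* digit, or 0 when the column shows '.' */
};

/* Decode a field such as "3dNh3"; returns 0 on success, -1 if malformed. */
static int decode_lat_flags(const char *field, struct lat_flags *out)
{
	size_t len;
	const char *f;

	assert(field != NULL && out != NULL);
	len = strlen(field);
	if (len < 5 || !isdigit((unsigned char)field[0]))
		return -1;

	f = field + len - 4;          /* the last four characters are the flags */
	out->cpu = atoi(field);       /* atoi() stops at the first non-digit */
	out->irqs_off = (f[0] == 'd');
	out->need_resched = (f[1] != '.');
	out->irq_context = f[2];
	out->preempt_depth = isdigit((unsigned char)f[3]) ? f[3] - '0' : 0;
	return 0;
}
```

Scanning the trace above with such a helper, the first line whose
need_resched member is set is the task_woken_rt() line.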
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2108)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2109) Latency tracing and events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2110) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2111) Function tracing can induce a much larger latency, but without
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2112) seeing what happens within the latency it is hard to know what
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2113) caused it. There is a middle ground, and that is to enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2114) events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2115) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2117) # echo 0 > options/function-trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2118) # echo wakeup_rt > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2119) # echo 1 > events/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2120) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2121) # echo 0 > tracing_max_latency
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2122) # chrt -f 5 sleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2123) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2124) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2125) # tracer: wakeup_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2126) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2127) # wakeup_rt latency trace v1.1.5 on 3.8.0-test+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2128) # --------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2129) # latency: 6 us, #12/12, CPU#2 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2130) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2131) # | task: sleep-5882 (uid:0 nice:0 policy:1 rt_prio:5)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2132) # -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2133) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2134) # _------=> CPU#
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2135) # / _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2136) # | / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2137) # || / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2138) # ||| / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2139) # |||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2140) # cmd pid ||||| time | caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2141) # \ / ||||| \ | /
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2142) <idle>-0 2d.h4 0us : 0:120:R + [002] 5882: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2143) <idle>-0 2d.h4 0us : ttwu_do_activate.constprop.87 <-try_to_wake_up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2144) <idle>-0 2d.h4 1us : sched_wakeup: comm=sleep pid=5882 prio=94 success=1 target_cpu=002
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2145) <idle>-0 2dNh2 1us : hrtimer_expire_exit: hrtimer=ffff88007796feb8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2146) <idle>-0 2.N.2 2us : power_end: cpu_id=2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2147) <idle>-0 2.N.2 3us : cpu_idle: state=4294967295 cpu_id=2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2148) <idle>-0 2dN.3 4us : hrtimer_cancel: hrtimer=ffff88007d50d5e0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2149) <idle>-0 2dN.3 4us : hrtimer_start: hrtimer=ffff88007d50d5e0 function=tick_sched_timer expires=34311211000000 softexpires=34311211000000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2150) <idle>-0 2.N.2 5us : rcu_utilization: Start context switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2151) <idle>-0 2.N.2 5us : rcu_utilization: End context switch
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2152) <idle>-0 2d..3 6us : __schedule <-schedule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2153) <idle>-0 2d..3 6us : 0:120:R ==> [002] 5882: 94:R sleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2154)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2156) Hardware Latency Detector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2157) -------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2158)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2159) The hardware latency detector is executed by enabling the "hwlat" tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2160)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2161) NOTE, this tracer will affect the performance of the system as it will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2162) periodically make a CPU constantly busy with interrupts disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2163) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2164)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2165) # echo hwlat > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2166) # sleep 100
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2167) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2168) # tracer: hwlat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2169) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2170) # entries-in-buffer/entries-written: 13/13 #P:8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2171) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2172) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2173) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2174) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2175) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2176) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2177) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2178) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2179) <...>-1729 [001] d... 678.473449: #1 inner/outer(us): 11/12 ts:1581527483.343962693 count:6
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2180) <...>-1729 [004] d... 689.556542: #2 inner/outer(us): 16/9 ts:1581527494.889008092 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2181) <...>-1729 [005] d... 714.756290: #3 inner/outer(us): 16/16 ts:1581527519.678961629 count:5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2182) <...>-1729 [001] d... 718.788247: #4 inner/outer(us): 9/17 ts:1581527523.889012713 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2183) <...>-1729 [002] d... 719.796341: #5 inner/outer(us): 13/9 ts:1581527524.912872606 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2184) <...>-1729 [006] d... 844.787091: #6 inner/outer(us): 9/12 ts:1581527649.889048502 count:2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2185) <...>-1729 [003] d... 849.827033: #7 inner/outer(us): 18/9 ts:1581527654.889013793 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2186) <...>-1729 [007] d... 853.859002: #8 inner/outer(us): 9/12 ts:1581527658.889065736 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2187) <...>-1729 [001] d... 855.874978: #9 inner/outer(us): 9/11 ts:1581527660.861991877 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2188) <...>-1729 [001] d... 863.938932: #10 inner/outer(us): 9/11 ts:1581527668.970010500 count:1 nmi-total:7 nmi-count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2189) <...>-1729 [007] d... 878.050780: #11 inner/outer(us): 9/12 ts:1581527683.385002600 count:1 nmi-total:5 nmi-count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2190) <...>-1729 [007] d... 886.114702: #12 inner/outer(us): 9/12 ts:1581527691.385001600 count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2192)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2193) The header of the above output is much the same as before. All events will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2194) have interrupts disabled 'd'. Under the FUNCTION title there is:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2195)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2196) #1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2197) This is the count of events recorded that were greater than the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2198) tracing_threshold (See below).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2199)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2200) inner/outer(us): 11/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2201)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2202) This shows two numbers as "inner latency" and "outer latency". The test
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2203) runs in a loop checking a timestamp twice. The latency detected between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2204) the two timestamps is the "inner latency", and the latency detected
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2205) between the second timestamp of one iteration and the first timestamp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2206) of the next iteration is the "outer latency".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2207)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2208) ts:1581527483.343962693
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2209)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2210) The absolute timestamp at which the first latency was recorded in the window.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2211)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2212) count:6
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2213)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2214) The number of times a latency was detected during the window.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2215)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2216) nmi-total:7 nmi-count:1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2217)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2218) On architectures that support it, if an NMI comes in during the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2219) test, the time spent in NMI is reported in "nmi-total" (in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2220) microseconds).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2221)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2222) All architectures that have NMIs will show the "nmi-count" if an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2223) NMI comes in during the test.
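
The difference between the inner and the outer latency may be easier to
see in code. The following is only a simplified sketch and not the
kernel's implementation: the names hw_sample, hw_result and
scan_samples() are hypothetical, the timestamps are plain microsecond
counts supplied by the caller, and each maximum is tracked independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One loop iteration: two back-to-back clock reads, in microseconds. */
struct hw_sample {
	uint64_t t1;
	uint64_t t2;
};

/* Hypothetical result holder: the largest gaps seen in one run. */
struct hw_result {
	uint64_t inner;   /* max of (t2 - t1) within an iteration */
	uint64_t outer;   /* max of (t1 - previous t2) across iterations */
};

static struct hw_result scan_samples(const struct hw_sample *s, size_t n)
{
	struct hw_result r = { 0, 0 };
	size_t i;

	assert(s != NULL || n == 0);
	for (i = 0; i < n; i++) {
		/* Gap between the two reads of this iteration. */
		uint64_t inner = s[i].t2 - s[i].t1;

		if (inner > r.inner)
			r.inner = inner;
		if (i > 0) {
			/* Gap between the previous iteration and this one. */
			uint64_t outer = s[i].t1 - s[i - 1].t2;

			if (outer > r.outer)
				r.outer = outer;
		}
	}
	return r;
}
```

With the made-up samples (100, 101), (102, 113) and (125, 126) the sketch
yields an inner latency of 11 and an outer latency of 12, the same shape
as the "11/12" pair in the first trace entry.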
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2225) hwlat files:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2226)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2227) tracing_threshold
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2228) This gets automatically set to "10" to represent 10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2229) microseconds. This is the threshold of latency that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2230) needs to be detected before the trace will be recorded.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2231)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2232) Note, when hwlat tracer is finished (another tracer is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2233) written into "current_tracer"), the original value for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2234) tracing_threshold is placed back into this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2235)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2236) hwlat_detector/width
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2237) The length of time the test runs with interrupts disabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2238)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2239) hwlat_detector/window
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2240) The length of time of the window in which the test
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2241) runs. That is, the test will run for "width"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2242) microseconds per "window" microseconds.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2243)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2244) tracing_cpumask
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2245) When the test is started, a kernel thread is created that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2246) runs the test. This thread will alternate between CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2247) listed in the tracing_cpumask between each period
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2248) (one "window"). To limit the test to specific CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2249) set the mask in this file to only the CPUs that the test
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2250) should run on.
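
The value written to tracing_cpumask is a hexadecimal bit mask with one
bit per CPU. As a minimal sketch, the mask string for a list of CPUs
could be built as follows. The function name cpus_to_mask() is
hypothetical, and the sketch assumes CPU numbers below 32 so that a
single 32-bit word is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a hex bit mask with one bit per CPU (bit N set means CPU N).
 * Assumption: CPU numbers below 32, so one 32-bit word is enough.
 * Returns the number of characters written, or -1 on bad input.
 */
static int cpus_to_mask(const int *cpus, size_t n, char *buf, size_t len)
{
	unsigned int mask = 0;
	size_t i;
	int ret;

	assert(cpus != NULL && buf != NULL);
	for (i = 0; i < n; i++) {
		if (cpus[i] < 0 || cpus[i] >= 32)
			return -1;
		mask |= 1u << cpus[i];
	}
	ret = snprintf(buf, len, "%x", mask);
	return (ret < 0 || (size_t)ret >= len) ? -1 : ret;
}
```

For CPUs 1 and 3 this produces "a", which is the string that would then
be written into tracing_cpumask to limit the test to those two CPUs.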
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2251)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2252) function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2253) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2254)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2255) This tracer is the function tracer. Enabling the function tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2256) can be done from the debug file system. Make sure the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2257) ftrace_enabled is set; otherwise this tracer is a nop.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2258) See the "ftrace_enabled" section below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2259) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2260)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2261) # sysctl kernel.ftrace_enabled=1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2262) # echo function > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2263) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2264) # usleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2265) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2266) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2267) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2268) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2269) # entries-in-buffer/entries-written: 24799/24799 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2270) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2271) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2272) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2273) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2274) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2275) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2276) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2277) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2278) bash-1994 [002] .... 3082.063030: mutex_unlock <-rb_simple_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2279) bash-1994 [002] .... 3082.063031: __mutex_unlock_slowpath <-mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2280) bash-1994 [002] .... 3082.063031: __fsnotify_parent <-fsnotify_modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2281) bash-1994 [002] .... 3082.063032: fsnotify <-fsnotify_modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2282) bash-1994 [002] .... 3082.063032: __srcu_read_lock <-fsnotify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2283) bash-1994 [002] .... 3082.063032: add_preempt_count <-__srcu_read_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2284) bash-1994 [002] ...1 3082.063032: sub_preempt_count <-__srcu_read_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2285) bash-1994 [002] .... 3082.063033: __srcu_read_unlock <-fsnotify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2286) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2287)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2289) Note: function tracer uses ring buffers to store the above
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2290) entries. The newest data may overwrite the oldest data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2291) Sometimes using echo to stop the trace is not sufficient because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2292) the tracing could have overwritten the data that you wanted to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2293) record. For this reason, it is sometimes better to disable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2294) tracing directly from a program. This allows you to stop the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2295) tracing at the point that you hit the part that you are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2296) interested in. To disable the tracing directly from a C program,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2297) something like the following code snippet can be used::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2298)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2299) int trace_fd;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2300) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2301) int main(int argc, char *argv[]) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2302) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2303) trace_fd = open(tracing_file("tracing_on"), O_WRONLY);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2304) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2305) if (condition_hit()) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2306) write(trace_fd, "0", 1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2307) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2308) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2309) }
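
The snippet above ignores the return values of open() and write(). A
slightly more defensive variant might look like the sketch below. The
helper names tracing_open() and tracing_write() are hypothetical and are
not part of any library; the path to "tracing_on" is supplied by the
caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Open a control file such as "tracing_on" once, ahead of time. */
static int tracing_open(const char *path)
{
	assert(path != NULL);
	return open(path, O_WRONLY);
}

/*
 * Write a single flag character ('0' to stop, '1' to start).
 * Retries if interrupted by a signal; returns 0 on success, -1 on error.
 */
static int tracing_write(int trace_fd, char flag)
{
	ssize_t n;

	do {
		n = write(trace_fd, &flag, 1);
	} while (n < 0 && errno == EINTR);

	return n == 1 ? 0 : -1;
}
```

Keeping the file descriptor open, as the original snippet does, means
that only a single write() is needed at the moment the condition is hit,
so less unrelated activity ends up in the ring buffer.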
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2311)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2312) Single thread tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2313) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2314)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2315) By writing into set_ftrace_pid you can trace a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2316) single thread. For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2317)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2318) # cat set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2319) no pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2320) # echo 3111 > set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2321) # cat set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2322) 3111
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2323) # echo function > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2324) # cat trace | head
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2325) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2326) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2327) # TASK-PID CPU# TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2328) # | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2329) yum-updatesd-3111 [003] 1637.254676: finish_task_switch <-thread_return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2330) yum-updatesd-3111 [003] 1637.254681: hrtimer_cancel <-schedule_hrtimeout_range
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2331) yum-updatesd-3111 [003] 1637.254682: hrtimer_try_to_cancel <-hrtimer_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2332) yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2333) yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2334) yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2335) # echo > set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2336) # cat trace |head
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2337) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2338) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2339) # TASK-PID CPU# TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2340) # | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2341) ##### CPU 3 buffer started ####
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2342) yum-updatesd-3111 [003] 1701.957688: free_poll_entry <-poll_freewait
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2343) yum-updatesd-3111 [003] 1701.957689: remove_wait_queue <-free_poll_entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2344) yum-updatesd-3111 [003] 1701.957691: fput <-free_poll_entry
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2345) yum-updatesd-3111 [003] 1701.957692: audit_syscall_exit <-sysret_audit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2346) yum-updatesd-3111 [003] 1701.957693: path_put <-audit_syscall_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2347)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2348) If you want to trace the functions called while a command executes, you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2349) could use something like this simple program.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2350) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2351)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2352) #include <stdio.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2353) #include <stdlib.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2354) #include <sys/types.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2355) #include <sys/stat.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2356) #include <fcntl.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2357) #include <unistd.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2358) #include <string.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2360) #define _STR(x) #x
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2361) #define STR(x) _STR(x)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2362) #define MAX_PATH 256
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2363)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2364) const char *find_tracefs(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2365) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2366) static char tracefs[MAX_PATH+1];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2367) static int tracefs_found;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2368) char type[100];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2369) FILE *fp;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2370)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2371) if (tracefs_found)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2372) return tracefs;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2373)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2374) if ((fp = fopen("/proc/mounts","r")) == NULL) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2375) perror("/proc/mounts");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2376) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2377) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2378)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2379) while (fscanf(fp, "%*s %"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2380) STR(MAX_PATH)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2381) "s %99s %*s %*d %*d\n",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2382) tracefs, type) == 2) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2383) if (strcmp(type, "tracefs") == 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2384) break;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2385) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2386) fclose(fp);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2387)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2388) if (strcmp(type, "tracefs") != 0) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2389) fprintf(stderr, "tracefs not mounted\n");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2390) return NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2391) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2392)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2393) /* the tracing files live directly in the tracefs mount point */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2394) tracefs_found = 1;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2396) return tracefs;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2397) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2399) const char *tracing_file(const char *file_name)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2400) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2401) static char trace_file[MAX_PATH+1];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2402) snprintf(trace_file, MAX_PATH, "%s/%s", find_tracefs(), file_name);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2403) return trace_file;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2404) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2406) int main (int argc, char **argv)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2407) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2408) if (argc < 2) /* a command to run is required */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2409) exit(-1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2410)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2411) if (fork() > 0) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2412) int fd, ffd;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2413) char line[64];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2414) int s;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2415)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2416) ffd = open(tracing_file("current_tracer"), O_WRONLY);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2417) if (ffd < 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2418) exit(-1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2419) write(ffd, "nop", 3);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2420)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2421) fd = open(tracing_file("set_ftrace_pid"), O_WRONLY);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2422) s = sprintf(line, "%d\n", getpid());
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2423) write(fd, line, s);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2424)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2425) write(ffd, "function", 8);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2426)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2427) close(fd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2428) close(ffd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2429)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2430) execvp(argv[1], argv+1);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2431) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2432)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2433) return 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2434) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2436) Or this simple script!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2437) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2438)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2439) #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2440)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2441) tracefs=`sed -ne 's/^tracefs \(.*\) tracefs.*/\1/p' /proc/mounts`
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2442) echo nop > $tracefs/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2443) echo 0 > $tracefs/tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2444) echo $$ > $tracefs/set_ftrace_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2445) echo function > $tracefs/current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2446) echo 1 > $tracefs/tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2447) exec "$@"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2448)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2449)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2450) function graph tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2451) ---------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2452)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2453) This tracer is similar to the function tracer except that it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2454) probes a function on its entry and its exit. This is done by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2455) using a dynamically allocated stack of return addresses in each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2456) task_struct. On function entry the tracer overwrites the return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2457) address of each function traced to set a custom probe. Thus the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2458) original return address is stored on the stack of return addresses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2459) in the task_struct.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2460)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2461) Probing on both ends of a function leads to special features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2462) such as:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2463)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2464) - measuring a function's execution time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2465) - having a reliable call stack to draw a function call graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2466)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2467) This tracer is useful in several situations:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2468)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2469) - you want to find the reason for strange kernel behavior and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2470) need to see what happens in detail in any area (or in specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2471) ones).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2472)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2473) - you are experiencing weird latencies but it's difficult to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2474) find their origin.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2475)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2476) - you want to quickly find which path is taken by a specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2477) function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2478)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2479) - you just want to peek inside a working kernel and want to see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2480) what happens there.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2481)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2482) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2483)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2484) # tracer: function_graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2485) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2486) # CPU DURATION FUNCTION CALLS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2487) # | | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2488)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2489) 0) | sys_open() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2490) 0) | do_sys_open() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2491) 0) | getname() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2492) 0) | kmem_cache_alloc() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2493) 0) 1.382 us | __might_sleep();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2494) 0) 2.478 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2495) 0) | strncpy_from_user() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2496) 0) | might_fault() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2497) 0) 1.389 us | __might_sleep();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2498) 0) 2.553 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2499) 0) 3.807 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2500) 0) 7.876 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2501) 0) | alloc_fd() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2502) 0) 0.668 us | _spin_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2503) 0) 0.570 us | expand_files();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2504) 0) 0.586 us | _spin_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2505)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2506)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2507) There are several columns that can be dynamically
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2508) enabled/disabled. You can use every combination of options you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2509) want, depending on your needs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2510)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2511) - The cpu number on which the function executed is enabled by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2512) default. It is sometimes better to only trace one cpu (see the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2513) tracing_cpumask file), or you might sometimes see out-of-order
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2514) function calls when tracing switches between cpus.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2515)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2516) - hide: echo nofuncgraph-cpu > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2517) - show: echo funcgraph-cpu > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2518)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2519) - The duration (function's time of execution) is displayed on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2520) the closing bracket line of a function or, for a leaf function,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2521) on the same line as the function itself. It is enabled by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2522) default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2523)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2524) - hide: echo nofuncgraph-duration > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2525) - show: echo funcgraph-duration > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2526)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2527) - The overhead field precedes the duration field when a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2528) duration threshold has been exceeded.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2529)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2530) - hide: echo nofuncgraph-overhead > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2531) - show: echo funcgraph-overhead > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2532) - depends on: funcgraph-duration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2533)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2534) e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2536) 3) # 1837.709 us | } /* __switch_to */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2537) 3) | finish_task_switch() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2538) 3) 0.313 us | _raw_spin_unlock_irq();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2539) 3) 3.177 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2540) 3) # 1889.063 us | } /* __schedule */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2541) 3) ! 140.417 us | } /* __schedule */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2542) 3) # 2034.948 us | } /* schedule */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2543) 3) * 33998.59 us | } /* schedule_preempt_disabled */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2544)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2545) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2546)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2547) 1) 0.260 us | msecs_to_jiffies();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2548) 1) 0.313 us | __rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2549) 1) + 61.770 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2550) 1) + 64.479 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2551) 1) 0.313 us | rcu_bh_qs();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2552) 1) 0.313 us | __local_bh_enable();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2553) 1) ! 217.240 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2554) 1) 0.365 us | idle_cpu();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2555) 1) | rcu_irq_exit() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2556) 1) 0.417 us | rcu_eqs_enter_common.isra.47();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2557) 1) 3.125 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2558) 1) ! 227.812 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2559) 1) ! 457.395 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2560) 1) @ 119760.2 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2561)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2562) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2563)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2564) 2) | handle_IPI() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2565) 1) 6.979 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2566) 2) 0.417 us | scheduler_ipi();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2567) 1) 9.791 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2568) 1) + 12.917 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2569) 2) 3.490 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2570) 1) + 15.729 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2571) 1) + 18.542 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2572) 2) $ 3594274 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2573)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2574) Flags::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2575)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2576) + means that the function exceeded 10 usecs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2577) ! means that the function exceeded 100 usecs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2578) # means that the function exceeded 1000 usecs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2579) * means that the function exceeded 10 msecs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2580) @ means that the function exceeded 100 msecs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2581) $ means that the function exceeded 1 sec.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2582)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2583)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2584) - The task/pid field displays the thread cmdline and pid which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2585) executed the function. It is disabled by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2586)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2587) - hide: echo nofuncgraph-proc > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2588) - show: echo funcgraph-proc > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2589)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2590) e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2591)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2592) # tracer: function_graph
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2593) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2594) # CPU TASK/PID DURATION FUNCTION CALLS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2595) # | | | | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2596) 0) sh-4802 | | d_free() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2597) 0) sh-4802 | | call_rcu() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2598) 0) sh-4802 | | __call_rcu() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2599) 0) sh-4802 | 0.616 us | rcu_process_gp_end();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2600) 0) sh-4802 | 0.586 us | check_for_new_grace_period();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2601) 0) sh-4802 | 2.899 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2602) 0) sh-4802 | 4.040 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2603) 0) sh-4802 | 5.151 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2604) 0) sh-4802 | + 49.370 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2605)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2606)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2607) - The absolute time field is an absolute timestamp given by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2608) system clock since it started. A snapshot of this time is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2609) given on each entry/exit of functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2610)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2611) - hide: echo nofuncgraph-abstime > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2612) - show: echo funcgraph-abstime > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2613)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2614) e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2615)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2616) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2617) # TIME CPU DURATION FUNCTION CALLS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2618) # | | | | | | | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2619) 360.774522 | 1) 0.541 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2620) 360.774522 | 1) 4.663 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2621) 360.774523 | 1) 0.541 us | __wake_up_bit();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2622) 360.774524 | 1) 6.796 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2623) 360.774524 | 1) 7.952 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2624) 360.774525 | 1) 9.063 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2625) 360.774525 | 1) 0.615 us | journal_mark_dirty();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2626) 360.774527 | 1) 0.578 us | __brelse();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2627) 360.774528 | 1) | reiserfs_prepare_for_journal() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2628) 360.774528 | 1) | unlock_buffer() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2629) 360.774529 | 1) | wake_up_bit() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2630) 360.774529 | 1) | bit_waitqueue() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2631) 360.774530 | 1) 0.594 us | __phys_addr();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2632)
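The overhead markers listed under Flags above follow a simple threshold ladder. As a rough illustration only (this is not kernel code, and the helper name overhead_flag is made up for this sketch), a shell function can map a duration in microseconds to the marker the trace would show:

```shell
# Hypothetical helper: print the overhead marker for a duration given
# in microseconds, following the thresholds in the Flags list.
overhead_flag() {
    awk -v us="$1" 'BEGIN {
        if      (us > 1000000) f = "$"   # exceeded 1 sec
        else if (us > 100000)  f = "@"   # exceeded 100 msecs
        else if (us > 10000)   f = "*"   # exceeded 10 msecs
        else if (us > 1000)    f = "#"   # exceeded 1000 usecs
        else if (us > 100)     f = "!"   # exceeded 100 usecs
        else if (us > 10)      f = "+"   # exceeded 10 usecs
        else                   f = " "   # no marker
        print f
    }'
}
```

For instance, ``overhead_flag 1837.709`` prints ``#`` and ``overhead_flag 61.770`` prints ``+``, which agrees with the sample trace shown for the overhead field.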
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2633)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2634) The function name is always displayed after the closing bracket
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2635) for a function if the start of that function is not in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2636) trace buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2637)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2638) Display of the function name after the closing bracket may be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2639) enabled for functions whose start is in the trace buffer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2640) allowing easier searching with grep for function durations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2641) It is disabled by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2642)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2643) - hide: echo nofuncgraph-tail > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2644) - show: echo funcgraph-tail > trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2645)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2646) Example with nofuncgraph-tail (default)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2647)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2648) 0) | putname() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2649) 0) | kmem_cache_free() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2650) 0) 0.518 us | __phys_addr();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2651) 0) 1.757 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2652) 0) 2.861 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2653)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2654) Example with funcgraph-tail::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2655)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2656) 0) | putname() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2657) 0) | kmem_cache_free() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2658) 0) 0.518 us | __phys_addr();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2659) 0) 1.757 us | } /* kmem_cache_free() */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2660) 0) 2.861 us | } /* putname() */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2661)
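With funcgraph-tail enabled, durations can be pulled out of a saved trace with standard text tools. The following is a minimal sketch, assuming the trace was saved to a file and that closing lines look like the funcgraph-tail example above; func_durations is a hypothetical helper name, not part of ftrace:

```shell
# Hypothetical helper: print the duration (in us) of every completed
# call to function $1. Reads function_graph output on stdin.
func_durations() {
    awk -v fn="$1" '
        # Accept both "} /* name */" and "} /* name() */" closing comments.
        index($0, "} /* " fn " */") || index($0, "} /* " fn "() */") {
            # The duration is the field just before the "us" unit.
            for (i = 2; i <= NF; i++)
                if ($i == "us") { print $(i - 1); break }
        }'
}
```

Run against the funcgraph-tail example above, ``func_durations kmem_cache_free < saved_trace`` would print ``1.757`` (saved_trace being whatever file the trace was copied to).
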
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2662) You can put some comments on specific functions by using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2663) trace_printk(). For example, if you want to put a comment inside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2664) the __might_sleep() function, you just have to include
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2665) <linux/ftrace.h> and call trace_printk() inside __might_sleep()::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2666)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2667) trace_printk("I'm a comment!\n");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2668)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2669) will produce::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2670)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2671) 1) | __might_sleep() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2672) 1) | /* I'm a comment! */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2673) 1) 1.449 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2674)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2675)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2676) You might find other useful features for this tracer in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2677) following "dynamic ftrace" section such as tracing only specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2678) functions or tasks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2679)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2680) dynamic ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2681) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2682)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2683) If CONFIG_DYNAMIC_FTRACE is set, the system will run with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2684) virtually no overhead when function tracing is disabled. The way
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2685) this works is that the mcount function call (placed at the start of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2686) every kernel function, produced by the -pg switch in gcc)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2687) starts off pointing to a simple return. (Enabling FTRACE will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2688) include the -pg switch in the compiling of the kernel.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2689)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2690) At compile time every C file object is run through the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2691) recordmcount program (located in the scripts directory). This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2692) program will parse the ELF headers in the C object to find all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2693) the locations in the .text section that call mcount. Starting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2694) with gcc version 4.6, the -mfentry option has been added for x86, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2695) calls "__fentry__" instead of "mcount". The call is made before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2696) the creation of the stack frame.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2697)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2698) Note that not all sections are traced. Tracing may be prevented by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2699) a notrace annotation or blocked in another way, and inline functions are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2700) never traced. Check the "available_filter_functions" file to see what functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2701) can be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2702)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2703) A section called "__mcount_loc" is created that holds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2704) references to all the mcount/fentry call sites in the .text section.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2705) The recordmcount program re-links this section back into the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2706) original object. The final linking stage of the kernel will add all these
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2707) references into a single table.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2708)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2709) On boot up, before SMP is initialized, the dynamic ftrace code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2710) scans this table and updates all the locations into nops. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2711) also records the locations, which are added to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2712) available_filter_functions list. Modules are processed as they
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2713) are loaded and before they are executed. When a module is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2714) unloaded, it also removes its functions from the ftrace function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2715) list. This is automatic in the module unload code, and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2716) module author does not need to worry about it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2717)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2718) When tracing is enabled, the process of modifying the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2719) tracepoints is dependent on architecture. The old method is to use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2720) stop_machine to prevent races with the CPUs executing code being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2721) modified (which can cause the CPU to do undesirable things, especially
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2722) if the modified code crosses cache (or page) boundaries). The nops are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2723) then patched back to calls. But this time, they do not call mcount
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2724) (which is just a function stub). They now call into the ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2725) infrastructure.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2726)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2727) The new method of modifying the function tracepoints is to place
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2728) a breakpoint at the location to be modified, sync all CPUs, modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2729) the rest of the instruction not covered by the breakpoint, sync
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2730) all CPUs again, and then replace the breakpoint with the finished
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2731) version of the ftrace call site.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2732)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2733) Some archs do not even need to monkey around with the synchronization,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2734) and can just slap the new code on top of the old without any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2735) problems with other CPUs executing it at the same time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2736)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2737) One special side-effect of the recording of the functions being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2738) traced is that we can now selectively choose which functions we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2739) wish to trace and which ones we want the mcount calls to remain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2740) as nops.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2741)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2742) Two files are used, one for enabling and one for disabling the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2743) tracing of specified functions. They are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2744)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2745) set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2746)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2747) and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2748)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2749) set_ftrace_notrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2750)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2751) A list of available functions that you can add to these files is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2752) listed in:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2753)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2754) available_filter_functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2755)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2756) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2757)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2758) # cat available_filter_functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2759) put_prev_task_idle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2760) kmem_cache_create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2761) pick_next_task_rt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2762) get_online_cpus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2763) pick_next_task_fair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2764) mutex_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2765) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2766)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2767) If I am only interested in sys_nanosleep and hrtimer_interrupt::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2768)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2769) # echo sys_nanosleep hrtimer_interrupt > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2770) # echo function > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2771) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2772) # usleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2773) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2774) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2775) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2776) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2777) # entries-in-buffer/entries-written: 5/5 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2778) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2779) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2780) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2781) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2782) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2783) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2784) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2785) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2786) usleep-2665 [001] .... 4186.475355: sys_nanosleep <-system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2787) <idle>-0 [001] d.h1 4186.475409: hrtimer_interrupt <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2788) usleep-2665 [001] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2789) <idle>-0 [003] d.h1 4186.475426: hrtimer_interrupt <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2790) <idle>-0 [002] d.h1 4186.475427: hrtimer_interrupt <-smp_apic_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2791)
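Once a filtered trace has been captured, a quick per-function tally is often enough to see what fired. A small sketch, assuming the default function tracer line format shown above, where each entry ends in ``function <-caller`` (count_traced_functions is an illustrative name only):

```shell
# Hypothetical helper: count how often each function appears in
# function tracer output read from stdin.
count_traced_functions() {
    awk '
        /^#/ { next }                 # skip the header comment lines
        {
            # The traced function is the field just before "<-caller".
            for (i = 2; i <= NF; i++)
                if (substr($i, 1, 2) == "<-") { hits[$(i - 1)]++; break }
        }
        END { for (fn in hits) print fn, hits[fn] }
    ' | sort
}
```

For the five entries in the trace above, ``count_traced_functions < trace`` would report ``hrtimer_interrupt 4`` and ``sys_nanosleep 1``.
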
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2792) To see which functions are being traced, you can cat the file:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2793) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2794)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2795) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2796) hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2797) sys_nanosleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2798)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2799)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2800) Perhaps this is not enough. The filters also allow glob(7) matching.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2801)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2802) ``<match>*``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2803) will match functions that begin with <match>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2804) ``*<match>``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2805) will match functions that end with <match>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2806) ``*<match>*``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2807) will match functions that have <match> in them
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2808) ``<match1>*<match2>``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2809) will match functions that begin with <match1> and end with <match2>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2810)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2811) .. note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2812) It is better to use quotes to enclose the wild cards,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2813) otherwise the shell may expand the parameters into names
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2814) of files in the local directory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2815)
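Before writing a glob to set_ftrace_filter, it can be handy to preview which names it would select. The sketch below uses the shell's own pattern matching in a case statement as a stand-in; the kernel has its own matcher, so treat the result as an approximation. preview_filter is a hypothetical helper, and it expects input in the format of available_filter_functions:

```shell
# Hypothetical helper: print the function names on stdin that match
# the glob pattern given as $1.
preview_filter() {
    # Only the first word of each line is used as the function name;
    # anything after it (such as a "[module]" tag) is ignored.
    while read -r name _rest; do
        case $name in
            $1) printf '%s\n' "$name" ;;   # unquoted, so $1 acts as a pattern
        esac
    done
}
```

A typical call would be ``preview_filter 'hrtimer_*' < available_filter_functions``. The pattern is quoted for the same reason as in the note above: otherwise the calling shell may expand it against file names first.
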
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2816) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2817)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2818) # echo 'hrtimer_*' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2819)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2820) Produces::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2821)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2822) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2823) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2824) # entries-in-buffer/entries-written: 897/897 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2825) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2826) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2827) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2828) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2829) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2830) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2831) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2832) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2833) <idle>-0 [003] dN.1 4228.547803: hrtimer_cancel <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2834) <idle>-0 [003] dN.1 4228.547804: hrtimer_try_to_cancel <-hrtimer_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2835) <idle>-0 [003] dN.2 4228.547805: hrtimer_force_reprogram <-__remove_hrtimer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2836) <idle>-0 [003] dN.1 4228.547805: hrtimer_forward <-tick_nohz_idle_exit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2837) <idle>-0 [003] dN.1 4228.547805: hrtimer_start_range_ns <-hrtimer_start_expires.constprop.11
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2838) <idle>-0 [003] d..1 4228.547858: hrtimer_get_next_event <-get_next_timer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2839) <idle>-0 [003] d..1 4228.547859: hrtimer_start <-__tick_nohz_idle_enter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2840) <idle>-0 [003] d..2 4228.547860: hrtimer_force_reprogram <-__rem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2841)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2842) Notice that we lost the sys_nanosleep.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2843) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2844)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2845) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2846) hrtimer_run_queues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2847) hrtimer_run_pending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2848) hrtimer_init
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2849) hrtimer_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2850) hrtimer_try_to_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2851) hrtimer_forward
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2852) hrtimer_start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2853) hrtimer_reprogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2854) hrtimer_force_reprogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2855) hrtimer_get_next_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2856) hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2857) hrtimer_nanosleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2858) hrtimer_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2859) hrtimer_get_remaining
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2860) hrtimer_get_res
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2861) hrtimer_init_sleeper
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2862)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2863)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2864) This is because '>' and '>>' act just like they do in bash.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2865) To rewrite the filters, use '>'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2866) To append to the filters, use '>>'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2867)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2868) To clear out a filter so that all functions will be recorded
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2869) again::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2870)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2871) # echo > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2872) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2873) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2874)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2875) Again, now we want to append.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2876)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2877) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2878)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2879) # echo sys_nanosleep > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2880) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2881) sys_nanosleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2882) # echo 'hrtimer_*' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2883) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2884) hrtimer_run_queues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2885) hrtimer_run_pending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2886) hrtimer_init
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2887) hrtimer_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2888) hrtimer_try_to_cancel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2889) hrtimer_forward
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2890) hrtimer_start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2891) hrtimer_reprogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2892) hrtimer_force_reprogram
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2893) hrtimer_get_next_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2894) hrtimer_interrupt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2895) sys_nanosleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2896) hrtimer_nanosleep
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2897) hrtimer_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2898) hrtimer_get_remaining
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2899) hrtimer_get_res
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2900) hrtimer_init_sleeper
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2901)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2902)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2903) The set_ftrace_notrace file prevents the functions listed in it from being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2904) traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2905) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2906)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2907) # echo '*preempt*' '*lock*' > set_ftrace_notrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2908)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2909) Produces::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2910)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2911) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2912) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2913) # entries-in-buffer/entries-written: 39608/39608 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2914) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2915) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2916) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2917) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2918) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2919) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2920) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2921) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2922) bash-1994 [000] .... 4342.324896: file_ra_state_init <-do_dentry_open
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2923) bash-1994 [000] .... 4342.324897: open_check_o_direct <-do_last
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2924) bash-1994 [000] .... 4342.324897: ima_file_check <-do_last
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2925) bash-1994 [000] .... 4342.324898: process_measurement <-ima_file_check
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2926) bash-1994 [000] .... 4342.324898: ima_get_action <-process_measurement
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2927) bash-1994 [000] .... 4342.324898: ima_match_policy <-ima_get_action
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2928) bash-1994 [000] .... 4342.324899: do_truncate <-do_last
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2929) bash-1994 [000] .... 4342.324899: should_remove_suid <-do_truncate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2930) bash-1994 [000] .... 4342.324899: notify_change <-do_truncate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2931) bash-1994 [000] .... 4342.324900: current_fs_time <-notify_change
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2932) bash-1994 [000] .... 4342.324900: current_kernel_time <-current_fs_time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2933) bash-1994 [000] .... 4342.324900: timespec_trunc <-current_fs_time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2934)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2935) We can see that there's no more lock or preempt tracing.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2936)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2937) Selecting function filters via index
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2938) ------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2939)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2940) Because processing of strings is expensive (the address of the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2941) needs to be looked up before comparing to the string being passed in),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2942) an index can be used as well to enable functions. This is useful in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2943) case of setting thousands of specific functions at a time. By passing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2944) in a list of numbers, no string processing will occur. Instead, the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2945) at the specific location in the internal array (which corresponds to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2946) functions in the "available_filter_functions" file), is selected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2947)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2948) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2949)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2950) # echo 1 > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2951)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2952) Will select the first function listed in "available_filter_functions"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2953)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2954) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2955)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2956) # head -1 available_filter_functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2957) trace_initcall_finish_cb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2958)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2959) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2960) trace_initcall_finish_cb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2961)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2962) # head -50 available_filter_functions | tail -1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2963) x86_pmu_commit_txn
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2964)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2965) # echo 1 50 > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2966) # cat set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2967) trace_initcall_finish_cb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2968) x86_pmu_commit_txn
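
The indices passed to set_ftrace_filter are the line numbers of the
functions in "available_filter_functions". The following is a minimal
sketch of looking them up by name. The helper name ftrace_indices and the
three-entry stand-in list are assumptions made for illustration, so that
the sketch runs without access to tracefs; on a live system the list would
be the real available_filter_functions file.

```shell
# Hypothetical helper: print the 1-based line number of each named
# function in a list laid out like available_filter_functions.
# Only the first field is compared, so an entry followed by a module
# tag such as "[ext3]" still matches by name. If a name occurs more
# than once, only the first occurrence is reported.
ftrace_indices() {
    list=$1; shift
    out=
    for fn in "$@"; do
        idx=$(awk -v fn="$fn" '$1 == fn { print NR; exit }' "$list")
        if [ -n "$idx" ]; then
            out="${out:+$out }$idx"
        fi
    done
    printf '%s\n' "$out"
}

# Stand-in for available_filter_functions (illustrative content only).
funcs=$(mktemp)
printf '%s\n' trace_initcall_finish_cb do_one_initcall x86_pmu_commit_txn > "$funcs"

indices=$(ftrace_indices "$funcs" trace_initcall_finish_cb x86_pmu_commit_txn)
echo "$indices"
rm -f "$funcs"
```

The resulting list can then be written to set_ftrace_filter in a single
echo. Because the index of a function depends on the running kernel and
the modules loaded, the indices should be computed on the same system,
and during the same boot, in which they are used.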
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2969)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2970) Dynamic ftrace with the function graph tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2971) ---------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2972)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2973) Although what has been explained above concerns both the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2974) function tracer and the function-graph tracer, there are some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2975) special features only available in the function-graph tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2976)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2977) If you want to trace only one function and all of its children,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2978) you just have to echo its name into set_graph_function::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2979)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2980) echo __do_fault > set_graph_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2981)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2982) will produce the following "expanded" trace of the __do_fault()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2983) function::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2984)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2985) 0) | __do_fault() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2986) 0) | filemap_fault() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2987) 0) | find_lock_page() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2988) 0) 0.804 us | find_get_page();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2989) 0) | __might_sleep() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2990) 0) 1.329 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2991) 0) 3.904 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2992) 0) 4.979 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2993) 0) 0.653 us | _spin_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2994) 0) 0.578 us | page_add_file_rmap();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2995) 0) 0.525 us | native_set_pte_at();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2996) 0) 0.585 us | _spin_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2997) 0) | unlock_page() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2998) 0) 0.541 us | page_waitqueue();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2999) 0) 0.639 us | __wake_up_bit();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3000) 0) 2.786 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3001) 0) + 14.237 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3002) 0) | __do_fault() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3003) 0) | filemap_fault() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3004) 0) | find_lock_page() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3005) 0) 0.698 us | find_get_page();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3006) 0) | __might_sleep() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3007) 0) 1.412 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3008) 0) 3.950 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3009) 0) 5.098 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3010) 0) 0.631 us | _spin_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3011) 0) 0.571 us | page_add_file_rmap();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3012) 0) 0.526 us | native_set_pte_at();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3013) 0) 0.586 us | _spin_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3014) 0) | unlock_page() {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3015) 0) 0.533 us | page_waitqueue();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3016) 0) 0.638 us | __wake_up_bit();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3017) 0) 2.793 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3018) 0) + 14.012 us | }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3019)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3020) You can also expand several functions at once::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3021)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3022) echo sys_open > set_graph_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3023) echo sys_close >> set_graph_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3024)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3025) Now if you want to go back to trace all functions you can clear
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3026) this special filter via::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3027)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3028) echo > set_graph_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3029)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3030)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3031) ftrace_enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3032) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3033)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3034) Note, the proc sysctl ftrace_enabled is a big on/off switch for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3035) function tracer. By default it is enabled (when function tracing is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3036) enabled in the kernel). If it is disabled, all function tracing is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3037) disabled. This includes not only the function tracers for ftrace, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3038) also for any other uses (perf, kprobes, stack tracing, profiling, etc). It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3039) cannot be disabled if there is a callback with FTRACE_OPS_FL_PERMANENT set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3040) registered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3041)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3042) Please disable this with care.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3043)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3044) This can be disabled (and enabled) with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3045)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3046) sysctl kernel.ftrace_enabled=0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3047) sysctl kernel.ftrace_enabled=1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3048)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3049) or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3050)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3051) echo 0 > /proc/sys/kernel/ftrace_enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3052) echo 1 > /proc/sys/kernel/ftrace_enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3053)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3054)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3055) Filter commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3056) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3057)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3058) A few commands are supported by the set_ftrace_filter interface.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3059) Trace commands have the following format::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3060)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3061) <function>:<command>:<parameter>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3062)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3063) The following commands are supported:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3064)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3065) - mod:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3066) This command enables function filtering per module. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3067) parameter defines the module. For example, if only the write*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3068) functions in the ext3 module are desired, run::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3069)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3070) echo 'write*:mod:ext3' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3071)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3072) This command interacts with the filter in the same way as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3073) filtering based on function names. Thus, adding more functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3074) in a different module is accomplished by appending (>>) to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3075) filter file. Remove specific module functions by prepending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3076) '!'::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3077)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3078) echo '!writeback*:mod:ext3' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3079)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3080) Mod command supports module globbing. Disable tracing for all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3081) functions except a specific module::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3082)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3083) echo '!*:mod:!ext3' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3084)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3085) Disable tracing for all modules, but still trace kernel::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3086)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3087) echo '!*:mod:*' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3088)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3089) Enable filter only for kernel::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3090)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3091) echo '*write*:mod:!*' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3092)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3093) Enable filter for module globbing::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3094)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3095) echo '*write*:mod:*snd*' >> set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3096)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3097) - traceon/traceoff:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3098) These commands turn tracing on and off when the specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3099) functions are hit. The parameter determines how many times the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3100) tracing system is turned on and off. If unspecified, there is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3101) no limit. For example, to disable tracing when a schedule bug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3102) is hit the first 5 times, run::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3104) echo '__schedule_bug:traceoff:5' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3105)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3106) To always disable tracing when __schedule_bug is hit::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3107)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3108) echo '__schedule_bug:traceoff' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3110) These commands are cumulative whether or not they are appended
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3111) to set_ftrace_filter. To remove a command, prepend it with '!'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3112) For a command registered with a counter, pass 0 as the parameter::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3114) echo '!__schedule_bug:traceoff:0' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3116) The above removes the traceoff commands for __schedule_bug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3117) that have a counter. To remove commands without counters::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3119) echo '!__schedule_bug:traceoff' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3121) - snapshot:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3122) Will cause a snapshot to be triggered when the function is hit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3123) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3125) echo 'native_flush_tlb_others:snapshot' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3127) To only snapshot once:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3128) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3130) echo 'native_flush_tlb_others:snapshot:1' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3132) To remove the above commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3133)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3134) echo '!native_flush_tlb_others:snapshot' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3135) echo '!native_flush_tlb_others:snapshot:0' > set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3137) - enable_event/disable_event:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3138) These commands can enable or disable a trace event. Note, because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3139) function tracing callbacks are very sensitive, when these commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3140) are registered, the tracepoint is activated, but disabled in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3141) a "soft" mode. That is, the tracepoint will be called, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3142) just will not be traced. The event tracepoint stays in this mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3143) as long as there's a command that triggers it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3144) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3146) echo 'try_to_wake_up:enable_event:sched:sched_switch:2' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3147) set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3149) The format is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3150)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3151) <function>:enable_event:<system>:<event>[:count]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3152) <function>:disable_event:<system>:<event>[:count]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3154) To remove the event commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3156) echo '!try_to_wake_up:enable_event:sched:sched_switch:0' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3157) set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3158) echo '!schedule:disable_event:sched:sched_switch' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3159) set_ftrace_filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3160)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3161) - dump:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3162) When the function is hit, it will dump the contents of the ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3163) ring buffer to the console. This is useful if you need to debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3164) something, and want to dump the trace when a certain function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3165) is hit. Perhaps it's a function that is called before a triple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3166) fault happens and does not allow you to get a regular dump.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3168) - cpudump:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3169) When the function is hit, it will dump the contents of the ftrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3170) ring buffer for the current CPU to the console. Unlike the "dump"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3171) command, it only prints out the contents of the ring buffer for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3172) CPU that executed the function that triggered the dump.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3173)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3174) - stacktrace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3175) When the function is hit, a stack trace is recorded.
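
The command strings in the list above all follow the
<function>:<command>:<parameter> format, with the parameter being
optional. A minimal sketch of assembling them follows. The helper name
ftrace_cmd is an assumption made for illustration and is not part of
ftrace; it only builds the string and does not write to
set_ftrace_filter.

```shell
# Hypothetical helper: build "<function>:<command>[:<parameter>]".
# The third argument is optional, as in the traceoff examples above.
ftrace_cmd() {
    if [ -n "${3:-}" ]; then
        printf '%s:%s:%s\n' "$1" "$2" "$3"
    else
        printf '%s:%s\n' "$1" "$2"
    fi
}

limited=$(ftrace_cmd __schedule_bug traceoff 5)
unlimited=$(ftrace_cmd __schedule_bug traceoff)
echo "$limited"
echo "$unlimited"
```

For enable_event and disable_event, the third argument would carry the
whole "<system>:<event>[:count]" part, for example
"sched:sched_switch:2".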
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3177) trace_pipe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3178) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3180) The trace_pipe file outputs the same content as the trace file, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3181) the effect on the tracing is different. Data read from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3182) trace_pipe is consumed, so subsequent reads return different
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3183) content. The trace is live.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3184) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3185)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3186) # echo function > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3187) # cat trace_pipe > /tmp/trace.out &
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3188) [1] 4153
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3189) # echo 1 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3190) # usleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3191) # echo 0 > tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3192) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3193) # tracer: function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3194) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3195) # entries-in-buffer/entries-written: 0/0 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3196) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3197) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3198) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3199) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3200) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3201) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3202) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3203) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3205) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3206) # cat /tmp/trace.out
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3207) bash-1994 [000] .... 5281.568961: mutex_unlock <-rb_simple_write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3208) bash-1994 [000] .... 5281.568963: __mutex_unlock_slowpath <-mutex_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3209) bash-1994 [000] .... 5281.568963: __fsnotify_parent <-fsnotify_modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3210) bash-1994 [000] .... 5281.568964: fsnotify <-fsnotify_modify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3211) bash-1994 [000] .... 5281.568964: __srcu_read_lock <-fsnotify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3212) bash-1994 [000] .... 5281.568964: add_preempt_count <-__srcu_read_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3213) bash-1994 [000] ...1 5281.568965: sub_preempt_count <-__srcu_read_lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3214) bash-1994 [000] .... 5281.568965: __srcu_read_unlock <-fsnotify
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3215) bash-1994 [000] .... 5281.568967: sys_dup2 <-system_call_fastpath
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3216)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3217)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3218) Note, reading the trace_pipe file will block until more input is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3219) added. This is contrary to the trace file. If any process opened
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3220) the trace file for reading, it will actually disable tracing and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3221) prevent new entries from being added. The trace_pipe file does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3222) not have this limitation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3223)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3224) trace entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3225) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3226)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3227) Having too much or not enough data can be troublesome in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3228) diagnosing an issue in the kernel. The file buffer_size_kb is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3229) used to modify the size of the internal trace buffers. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3230) number listed is the size, in kilobytes, of the buffer for each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3231) CPU. To know the full size, multiply the number of possible CPUs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3232) by the per-CPU size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3233) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3234)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3235) # cat buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3236) 1408 (units kilobytes)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3237)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3238) Or simply read buffer_total_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3239) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3240)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3241) # cat buffer_total_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3242) 5632
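
As a quick check of the multiplication described above, the two values
shown agree if the machine has 4 CPUs. The CPU count is an assumption
here, taken from the "#P:4" field in the trace headers of the other
examples.

```shell
per_cpu_kb=1408   # value reported by buffer_size_kb above
cpus=4            # assumed from "#P:4" in the trace headers
total_kb=$((per_cpu_kb * cpus))
echo "$total_kb"
```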
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3243)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3244) To modify the buffer, simply echo in a number (in 1024 byte segments).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3245) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3246)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3247) # echo 10000 > buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3248) # cat buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3249) 10000 (units kilobytes)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3250)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3251) It will try to allocate as much as possible. If you allocate too
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3252) much, it can cause Out-Of-Memory to trigger.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3253) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3254)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3255) # echo 1000000000000 > buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3256) -bash: echo: write error: Cannot allocate memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3257) # cat buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3258) 85
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3259)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3260) The per_cpu buffers can be changed individually as well:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3261) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3263) # echo 10000 > per_cpu/cpu0/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3264) # echo 100 > per_cpu/cpu1/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3265)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3266) When the per_cpu buffers are not the same, the buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3267) at the top level will just show an X
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3268) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3269)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3270) # cat buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3271) X
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3272)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3273) This is where the buffer_total_size_kb is useful:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3274) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3275)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3276) # cat buffer_total_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3277) 12916
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3278)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3279) Writing to the top level buffer_size_kb will reset all the buffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3280) to be the same again.
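
As a rough cross-check of the numbers above, the per_cpu sizes can be added
up by hand. The sketch below is only an illustration: total_buffer_kb is a
hypothetical helper name, and the default path /sys/kernel/tracing is an
assumption about where tracefs is mounted. It reads only the leading number
of each file, so trailing text such as "(units kilobytes)" is ignored.

```shell
# total_buffer_kb [TRACING_DIR]
# Hypothetical helper: sum the leading number found in every
# per_cpu/cpuN/buffer_size_kb file under TRACING_DIR.
# The default directory is an assumption; adjust it to your tracefs mount.
total_buffer_kb() {
    dir=${1:-/sys/kernel/tracing}
    total=0
    for f in "$dir"/per_cpu/cpu*/buffer_size_kb; do
        # Skip the unexpanded glob pattern and unreadable files.
        [ -r "$f" ] || continue
        # Take the first word only; anything after it is ignored.
        read -r kb _rest < "$f"
        case $kb in
            ''|*[!0-9]*) continue ;;   # not a plain number, skip it
        esac
        total=$((total + kb))
    done
    echo "$total"
}
```

With the per_cpu values used in the example above (10000, 100 and two CPUs
left at 1408), this adds up to 12916, the same value that
buffer_total_size_kb reported.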
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3281)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3282) Snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3283) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3284) CONFIG_TRACER_SNAPSHOT makes a generic snapshot feature
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3285) available to all non-latency tracers. (Latency tracers which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3286) record max latency, such as "irqsoff" or "wakeup", can't use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3287) this feature, since those are already using the snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3288) mechanism internally.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3289)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3290) Snapshot preserves a current trace buffer at a particular point
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3291) in time without stopping tracing. Ftrace swaps the current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3292) buffer with a spare buffer, and tracing continues in the new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3293) current (=previous spare) buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3294)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3295) The following tracefs files in "tracing" are related to this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3296) feature:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3298) snapshot:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3299)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3300) This is used to take a snapshot and to read the output
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3301) of the snapshot. Echo 1 into this file to allocate a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3302) spare buffer and to take a snapshot (swap), then read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3303) the snapshot from this file in the same format as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3304) "trace" (described above in the section "The File
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3305) System"). Reading the snapshot and tracing can both run
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3306) in parallel. When the spare buffer is allocated, echoing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3307) 0 frees it, and echoing any other (positive) value clears
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3308) the snapshot contents.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3309) More details are shown in the table below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3311) +--------------+------------+------------+------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3312) |status\\input | 0 | 1 | else |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3313) +==============+============+============+============+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3314) |not allocated |(do nothing)| alloc+swap |(do nothing)|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3315) +--------------+------------+------------+------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3316) |allocated | free | swap | clear |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3317) +--------------+------------+------------+------------+
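
The table can also be read as a small lookup. The function below is only a
model of the table for illustration; snapshot_action and its state names
("allocated", "not-allocated") are made up here and are not a kernel
interface.

```shell
# snapshot_action STATE INPUT
# Hypothetical model of the table above. STATE is "allocated" or
# "not-allocated"; INPUT is the value that would be echoed into "snapshot".
# Prints the action the table lists for that combination.
snapshot_action() {
    case "$1:$2" in
        not-allocated:1) echo "alloc+swap" ;;
        not-allocated:*) echo "do nothing" ;;
        allocated:0)     echo "free" ;;
        allocated:1)     echo "swap" ;;
        allocated:*)     echo "clear" ;;
        *)               echo "unknown state: $1" >&2; return 1 ;;
    esac
}
```

For example, "snapshot_action allocated 2" prints "clear", matching the
"else" column of the "allocated" row.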
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3318)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3319) Here is an example of using the snapshot feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3320) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3321)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3322) # echo 1 > events/sched/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3323) # echo 1 > snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3324) # cat snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3325) # tracer: nop
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3326) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3327) # entries-in-buffer/entries-written: 71/71 #P:8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3328) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3329) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3330) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3331) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3332) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3333) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3334) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3335) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3336) <idle>-0 [005] d... 2440.603828: sched_switch: prev_comm=swapper/5 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2242 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3337) sleep-2242 [005] d... 2440.603846: sched_switch: prev_comm=snapshot-test-2 prev_pid=2242 prev_prio=120 prev_state=R ==> next_comm=kworker/5:1 next_pid=60 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3338) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3339) <idle>-0 [002] d... 2440.707230: sched_switch: prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2229 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3340)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3341) # cat trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3342) # tracer: nop
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3343) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3344) # entries-in-buffer/entries-written: 77/77 #P:8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3345) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3346) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3347) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3348) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3349) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3350) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3351) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3352) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3353) <idle>-0 [007] d... 2440.707395: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=snapshot-test-2 next_pid=2243 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3354) snapshot-test-2-2229 [002] d... 2440.707438: sched_switch: prev_comm=snapshot-test-2 prev_pid=2229 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3355) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3356)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3357)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3358) If you try to use this snapshot feature when the current tracer is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3359) one of the latency tracers, you will get the following results.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3360) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3361)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3362) # echo wakeup > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3363) # echo 1 > snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3364) bash: echo: write error: Device or resource busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3365) # cat snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3366) cat: snapshot: Device or resource busy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3367)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3368)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3369) Instances
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3370) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3371) In the tracefs tracing directory, there is a directory called "instances".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3372) New directories can be created inside it with mkdir and removed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3373) with rmdir. A directory created with mkdir in this directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3374) will already contain files and other directories after it is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3375) created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3376) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3377)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3378) # mkdir instances/foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3379) # ls instances/foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3380) buffer_size_kb buffer_total_size_kb events free_buffer per_cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3381) set_event snapshot trace trace_clock trace_marker trace_options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3382) trace_pipe tracing_on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3383)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3384) As you can see, the new directory looks similar to the tracing directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3385) itself. In fact, it is very similar, except that the buffer and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3386) events are independent of the main directory, and of any other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3387) instances that are created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3388)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3389) The files in the new directory work just like the files with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3390) same name in the tracing directory except the buffer that is used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3391) is a separate and new buffer. The files affect that buffer but do not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3392) affect the main buffer with the exception of trace_options. Currently,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3393) the trace_options affect all instances and the top level buffer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3394) the same, but this may change in future releases. That is, options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3395) may become specific to the instance they reside in.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3396)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3397) Notice that none of the function tracer files are there, nor are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3398) current_tracer and available_tracers. This is because the buffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3399) can currently only have events enabled for them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3400) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3401)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3402) # mkdir instances/foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3403) # mkdir instances/bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3404) # mkdir instances/zoot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3405) # echo 100000 > buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3406) # echo 1000 > instances/foo/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3407) # echo 5000 > instances/bar/per_cpu/cpu1/buffer_size_kb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3408) # echo function > current_tracer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3409) # echo 1 > instances/foo/events/sched/sched_wakeup/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3410) # echo 1 > instances/foo/events/sched/sched_wakeup_new/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3411) # echo 1 > instances/foo/events/sched/sched_switch/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3412) # echo 1 > instances/bar/events/irq/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3413) # echo 1 > instances/zoot/events/syscalls/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3414) # cat trace_pipe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3415) CPU:2 [LOST 11745 EVENTS]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3416) bash-2044 [002] .... 10594.481032: _raw_spin_lock_irqsave <-get_page_from_freelist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3417) bash-2044 [002] d... 10594.481032: add_preempt_count <-_raw_spin_lock_irqsave
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3418) bash-2044 [002] d..1 10594.481032: __rmqueue <-get_page_from_freelist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3419) bash-2044 [002] d..1 10594.481033: _raw_spin_unlock <-get_page_from_freelist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3420) bash-2044 [002] d..1 10594.481033: sub_preempt_count <-_raw_spin_unlock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3421) bash-2044 [002] d... 10594.481033: get_pageblock_flags_group <-get_pageblock_migratetype
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3422) bash-2044 [002] d... 10594.481034: __mod_zone_page_state <-get_page_from_freelist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3423) bash-2044 [002] d... 10594.481034: zone_statistics <-get_page_from_freelist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3424) bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3425) bash-2044 [002] d... 10594.481034: __inc_zone_state <-zone_statistics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3426) bash-2044 [002] .... 10594.481035: arch_dup_task_struct <-copy_process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3427) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3428)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3429) # cat instances/foo/trace_pipe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3430) bash-1998 [000] d..4 136.676759: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3431) bash-1998 [000] dN.4 136.676760: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3432) <idle>-0 [003] d.h3 136.676906: sched_wakeup: comm=rcu_preempt pid=9 prio=120 success=1 target_cpu=003
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3433) <idle>-0 [003] d..3 136.676909: sched_switch: prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=9 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3434) rcu_preempt-9 [003] d..3 136.676916: sched_switch: prev_comm=rcu_preempt prev_pid=9 prev_prio=120 prev_state=S ==> next_comm=swapper/3 next_pid=0 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3435) bash-1998 [000] d..4 136.677014: sched_wakeup: comm=kworker/0:1 pid=59 prio=120 success=1 target_cpu=000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3436) bash-1998 [000] dN.4 136.677016: sched_wakeup: comm=bash pid=1998 prio=120 success=1 target_cpu=000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3437) bash-1998 [000] d..3 136.677018: sched_switch: prev_comm=bash prev_pid=1998 prev_prio=120 prev_state=R+ ==> next_comm=kworker/0:1 next_pid=59 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3438) kworker/0:1-59 [000] d..4 136.677022: sched_wakeup: comm=sshd pid=1995 prio=120 success=1 target_cpu=001
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3439) kworker/0:1-59 [000] d..3 136.677025: sched_switch: prev_comm=kworker/0:1 prev_pid=59 prev_prio=120 prev_state=S ==> next_comm=bash next_pid=1998 next_prio=120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3440) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3441)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3442) # cat instances/bar/trace_pipe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3443) migration/1-14 [001] d.h3 138.732674: softirq_raise: vec=3 [action=NET_RX]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3444) <idle>-0 [001] dNh3 138.732725: softirq_raise: vec=3 [action=NET_RX]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3445) bash-1998 [000] d.h1 138.733101: softirq_raise: vec=1 [action=TIMER]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3446) bash-1998 [000] d.h1 138.733102: softirq_raise: vec=9 [action=RCU]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3447) bash-1998 [000] ..s2 138.733105: softirq_entry: vec=1 [action=TIMER]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3448) bash-1998 [000] ..s2 138.733106: softirq_exit: vec=1 [action=TIMER]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3449) bash-1998 [000] ..s2 138.733106: softirq_entry: vec=9 [action=RCU]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3450) bash-1998 [000] ..s2 138.733109: softirq_exit: vec=9 [action=RCU]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3451) sshd-1995 [001] d.h1 138.733278: irq_handler_entry: irq=21 name=uhci_hcd:usb4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3452) sshd-1995 [001] d.h1 138.733280: irq_handler_exit: irq=21 ret=unhandled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3453) sshd-1995 [001] d.h1 138.733281: irq_handler_entry: irq=21 name=eth0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3454) sshd-1995 [001] d.h1 138.733283: irq_handler_exit: irq=21 ret=handled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3455) [...]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3456)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3457) # cat instances/zoot/trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3458) # tracer: nop
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3459) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3460) # entries-in-buffer/entries-written: 18996/18996 #P:4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3461) #
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3462) # _-----=> irqs-off
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3463) # / _----=> need-resched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3464) # | / _---=> hardirq/softirq
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3465) # || / _--=> preempt-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3466) # ||| / delay
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3467) # TASK-PID CPU# |||| TIMESTAMP FUNCTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3468) # | | | |||| | |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3469) bash-1998 [000] d... 140.733501: sys_write -> 0x2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3470) bash-1998 [000] d... 140.733504: sys_dup2(oldfd: a, newfd: 1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3471) bash-1998 [000] d... 140.733506: sys_dup2 -> 0x1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3472) bash-1998 [000] d... 140.733508: sys_fcntl(fd: a, cmd: 1, arg: 0)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3473) bash-1998 [000] d... 140.733509: sys_fcntl -> 0x1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3474) bash-1998 [000] d... 140.733510: sys_close(fd: a)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3475) bash-1998 [000] d... 140.733510: sys_close -> 0x0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3476) bash-1998 [000] d... 140.733514: sys_rt_sigprocmask(how: 0, nset: 0, oset: 6e2768, sigsetsize: 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3477) bash-1998 [000] d... 140.733515: sys_rt_sigprocmask -> 0x0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3478) bash-1998 [000] d... 140.733516: sys_rt_sigaction(sig: 2, act: 7fff718846f0, oact: 7fff71884650, sigsetsize: 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3479) bash-1998 [000] d... 140.733516: sys_rt_sigaction -> 0x0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3480)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3481) You can see that the trace of the topmost trace buffer shows only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3482) the function tracing. The foo instance displays wakeups and task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3483) switches.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3484)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3485) To remove the instances, simply delete their directories:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3486) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3487)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3488) # rmdir instances/foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3489) # rmdir instances/bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3490) # rmdir instances/zoot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3491)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3492) Note, if a process has a trace file open in one of the instance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3493) directories, the rmdir will fail with EBUSY.
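
The create, enable, read and remove steps above can be combined into one
helper. This is a minimal sketch under stated assumptions:
trace_event_in_instance is a hypothetical name, the default path
/sys/kernel/tracing assumes that is where tracefs is mounted, the one second
delay is arbitrary, and the commands need root on a kernel with instances
support.

```shell
# trace_event_in_instance NAME EVENT [TRACING_DIR]
# Hypothetical helper: create instances/NAME, enable EVENT (for example
# sched/sched_switch) in it for about one second, print the instance's
# trace, then remove the instance again.
trace_event_in_instance() {
    name=$1
    event=$2
    dir=${3:-/sys/kernel/tracing}   # assumed tracefs mount point
    inst=$dir/instances/$name

    if [ ! -d "$dir/instances" ]; then
        echo "no instances directory under $dir" >&2
        return 1
    fi

    mkdir "$inst" || return 1

    # Only read the buffer if the event could actually be enabled.
    if echo 1 > "$inst/events/$event/enable"; then
        sleep 1
        echo 0 > "$inst/events/$event/enable"
        cat "$inst/trace"
    fi

    # rmdir fails with EBUSY if another process still has a trace file
    # of this instance open, as noted above.
    rmdir "$inst"
}
```

A call such as "trace_event_in_instance foo sched/sched_switch" would then
stand in for the mkdir, echo, cat and rmdir sequence shown earlier for a
single event.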
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3494)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3495)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3496) Stack trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3497) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3498) Since the kernel has a fixed sized stack, it is important not to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3499) waste it in functions. A kernel developer must be conscious of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3500) what they allocate on the stack. If they add too much, the system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3501) can be in danger of a stack overflow, and corruption will occur,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3502) usually leading to a system panic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3503)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3504) There are some tools that check this, usually with interrupts
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3505) periodically checking usage. But it is very useful to be able to perform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3506) a check at every function call. As ftrace provides
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3507) a function tracer, it makes it convenient to check the stack size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3508) at every function call. This is enabled via the stack tracer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3509)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3510) CONFIG_STACK_TRACER enables the ftrace stack tracing functionality.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3511) To enable it, write a '1' into /proc/sys/kernel/stack_tracer_enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3512) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3513)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3514) # echo 1 > /proc/sys/kernel/stack_tracer_enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3515)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3516) You can also enable it from the kernel command line to trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3517) the stack size of the kernel during boot up, by adding "stacktrace"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3518) to the kernel command line parameter.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3519)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3520) After running it for a few minutes, the output looks like:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3521) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3522)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3523) # cat stack_max_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3524) 2928
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3525)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3526) # cat stack_trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3527) Depth Size Location (18 entries)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3528) ----- ---- --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3529) 0) 2928 224 update_sd_lb_stats+0xbc/0x4ac
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3530) 1) 2704 160 find_busiest_group+0x31/0x1f1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3531) 2) 2544 256 load_balance+0xd9/0x662
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3532) 3) 2288 80 idle_balance+0xbb/0x130
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3533) 4) 2208 128 __schedule+0x26e/0x5b9
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3534) 5) 2080 16 schedule+0x64/0x66
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3535) 6) 2064 128 schedule_timeout+0x34/0xe0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3536) 7) 1936 112 wait_for_common+0x97/0xf1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3537) 8) 1824 16 wait_for_completion+0x1d/0x1f
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3538) 9) 1808 128 flush_work+0xfe/0x119
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3539) 10) 1680 16 tty_flush_to_ldisc+0x1e/0x20
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3540) 11) 1664 48 input_available_p+0x1d/0x5c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3541) 12) 1616 48 n_tty_poll+0x6d/0x134
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3542) 13) 1568 64 tty_poll+0x64/0x7f
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3543) 14) 1504 880 do_select+0x31e/0x511
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3544) 15) 624 400 core_sys_select+0x177/0x216
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3545) 16) 224 96 sys_select+0x91/0xb9
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3546) 17) 128 128 system_call_fastpath+0x16/0x1b
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3547)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3548) Note, if -mfentry is being used by gcc, functions get traced before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3549) they set up the stack frame. This means that leaf level functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3550) are not tested by the stack tracer when -mfentry is used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3551)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3552) Currently, -mfentry is used by gcc 4.6.0 and above on x86 only.
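
When reading stack_trace output like the listing above, the Size column
shows how much stack each function contributes. The sketch below picks out
the largest contributor. largest_frame is a hypothetical helper, and it
assumes the column layout shown above (index, Depth, Size, Location); other
kernel versions may format the file differently.

```shell
# largest_frame
# Hypothetical helper: read stack_trace text on standard input and print
# "SIZE LOCATION" for the entry with the largest Size column.
largest_frame() {
    awk '
        # Entry lines start with an index followed by ")", such as "14)".
        # Header and separator lines do not match and are skipped.
        $1 ~ /^[0-9]+\)$/ && NF >= 4 {
            if ($3 + 0 > max) { max = $3 + 0; loc = $4 }
        }
        END { if (loc != "") print max, loc }
    '
}
```

Run as "largest_frame < stack_trace" from the tracing directory. For the
listing above it would report do_select with 880 bytes.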
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3553)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3554) More
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3555) ----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3556) More details can be found in the source code, in the `kernel/trace/*.c` files.