Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

3 Commits   0 Branches   0 Tags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    1) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    2) Event Tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    3) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    5) :Author: Theodore Ts'o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    6) :Updated: Li Zefan and Tom Zanussi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    7) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    8) 1. Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300    9) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   11) Tracepoints (see Documentation/trace/tracepoints.rst) can be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   12) without creating custom kernel modules to register probe functions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   13) using the event tracing infrastructure.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   15) Not all tracepoints can be traced using the event tracing system;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   16) the kernel developer must provide code snippets which define how the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   17) tracing information is saved into the tracing buffer, and how the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   18) tracing information should be printed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   19) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   20) 2. Using Event Tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   21) ======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   22) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   23) 2.1 Via the 'set_event' interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   24) ---------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   25) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   26) The events which are available for tracing can be found in the file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   27) /sys/kernel/debug/tracing/available_events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   29) To enable a particular event, such as 'sched_wakeup', simply echo it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   30) to /sys/kernel/debug/tracing/set_event. For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   31) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   32) 	# echo sched_wakeup >> /sys/kernel/debug/tracing/set_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   34) .. Note:: '>>' is necessary, otherwise it will firstly disable all the events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   35) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   36) To disable an event, echo the event name to the set_event file prefixed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   37) with an exclamation point::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   38) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   39) 	# echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   41) To disable all events, echo an empty line to the set_event file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   43) 	# echo > /sys/kernel/debug/tracing/set_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   44) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   45) To enable all events, echo ``*:*`` or ``*:`` to the set_event file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   47) 	# echo *:* > /sys/kernel/debug/tracing/set_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   48) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   49) The events are organized into subsystems, such as ext4, irq, sched,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   50) etc., and a full event name looks like this: <subsystem>:<event>.  The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   51) subsystem name is optional, but it is displayed in the available_events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   52) file.  All of the events in a subsystem can be specified via the syntax
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   53) ``<subsystem>:*``; for example, to enable all irq events, you can use the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   54) command::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   55) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   56) 	# echo 'irq:*' > /sys/kernel/debug/tracing/set_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   58) 2.2 Via the 'enable' toggle
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   59) ---------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   60) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   61) The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   62) of directories.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   63) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   64) To enable event 'sched_wakeup'::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   65) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   66) 	# echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   67) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   68) To disable it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   69) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   70) 	# echo 0 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   71) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   72) To enable all events in sched subsystem::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   73) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   74) 	# echo 1 > /sys/kernel/debug/tracing/events/sched/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   75) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   76) To enable all events::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   77) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   78) 	# echo 1 > /sys/kernel/debug/tracing/events/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   79) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   80) When reading one of these enable files, there are four results:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   81) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   82)  - 0 - all events this file affects are disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   83)  - 1 - all events this file affects are enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   84)  - X - there is a mixture of events enabled and disabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   85)  - ? - this file does not affect any event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   86) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   87) 2.3 Boot option
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   88) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   89) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   90) In order to facilitate early boot debugging, use boot option::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   92) 	trace_event=[event-list]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   93) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   94) event-list is a comma separated list of events. See section 2.1 for event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   95) format.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   97) 3. Defining an event-enabled tracepoint
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   98) =======================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  100) See The example provided in samples/trace_events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  101) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  102) 4. Event formats
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  103) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  104) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  105) Each trace event has a 'format' file associated with it that contains
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  106) a description of each field in a logged event.  This information can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  107) be used to parse the binary trace stream, and is also the place to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  108) find the field names that can be used in event filters (see section 5).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  109) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  110) It also displays the format string that will be used to print the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  111) event in text mode, along with the event name and ID used for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  112) profiling.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  113) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  114) Every event has a set of ``common`` fields associated with it; these are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  115) the fields prefixed with ``common_``.  The other fields vary between
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  116) events and correspond to the fields defined in the TRACE_EVENT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  117) definition for that event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  118) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  119) Each field in the format has the form::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  120) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  121)      field:field-type field-name; offset:N; size:N;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  122) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  123) where offset is the offset of the field in the trace record and size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  124) is the size of the data item, in bytes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  125) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  126) For example, here's the information displayed for the 'sched_wakeup'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  127) event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  128) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  129) 	# cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  130) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  131) 	name: sched_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  132) 	ID: 60
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  133) 	format:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  134) 		field:unsigned short common_type;	offset:0;	size:2;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  135) 		field:unsigned char common_flags;	offset:2;	size:1;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  136) 		field:unsigned char common_preempt_count;	offset:3;	size:1;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  137) 		field:int common_pid;	offset:4;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  138) 		field:int common_tgid;	offset:8;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  139) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  140) 		field:char comm[TASK_COMM_LEN];	offset:12;	size:16;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  141) 		field:pid_t pid;	offset:28;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  142) 		field:int prio;	offset:32;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  143) 		field:int success;	offset:36;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  144) 		field:int cpu;	offset:40;	size:4;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  145) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  146) 	print fmt: "task %s:%d [%d] success=%d [%03d]", REC->comm, REC->pid,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  147) 		   REC->prio, REC->success, REC->cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  148) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  149) This event contains 10 fields, the first 5 common and the remaining 5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  150) event-specific.  All the fields for this event are numeric, except for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  151) 'comm' which is a string, a distinction important for event filtering.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  152) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  153) 5. Event filtering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  154) ==================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  155) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  156) Trace events can be filtered in the kernel by associating boolean
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  157) 'filter expressions' with them.  As soon as an event is logged into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  158) the trace buffer, its fields are checked against the filter expression
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  159) associated with that event type.  An event with field values that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  160) 'match' the filter will appear in the trace output, and an event whose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  161) values don't match will be discarded.  An event with no filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  162) associated with it matches everything, and is the default when no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  163) filter has been set for an event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  164) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  165) 5.1 Expression syntax
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  166) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  167) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  168) A filter expression consists of one or more 'predicates' that can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  169) combined using the logical operators '&&' and '||'.  A predicate is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  170) simply a clause that compares the value of a field contained within a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  171) logged event with a constant value and returns either 0 or 1 depending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  172) on whether the field value matched (1) or didn't match (0)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  173) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  174) 	  field-name relational-operator value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  175) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  176) Parentheses can be used to provide arbitrary logical groupings and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  177) double-quotes can be used to prevent the shell from interpreting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  178) operators as shell metacharacters.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  180) The field-names available for use in filters can be found in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  181) 'format' files for trace events (see section 4).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  182) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  183) The relational-operators depend on the type of the field being tested:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  184) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  185) The operators available for numeric fields are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  186) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  187) ==, !=, <, <=, >, >=, &
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  189) And for string fields they are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  190) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  191) ==, !=, ~
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  192) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  193) The glob (~) accepts a wild card character (\*,?) and character classes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  194) ([). For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  195) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  196)   prev_comm ~ "*sh"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  197)   prev_comm ~ "sh*"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  198)   prev_comm ~ "*sh*"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  199)   prev_comm ~ "ba*sh"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  200) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  201) If the field is a pointer that points into user space (for example
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  202) "filename" from sys_enter_openat), then you have to append ".ustring" to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  203) field name::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  204) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  205)   filename.ustring ~ "password"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  206) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  207) As the kernel will have to know how to retrieve the memory that the pointer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  208) is at from user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  210) 5.2 Setting filters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  211) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  212) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  213) A filter for an individual event is set by writing a filter expression
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  214) to the 'filter' file for the given event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  215) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  216) For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  217) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  218) 	# cd /sys/kernel/debug/tracing/events/sched/sched_wakeup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  219) 	# echo "common_preempt_count > 4" > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  220) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  221) A slightly more involved example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  222) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  223) 	# cd /sys/kernel/debug/tracing/events/signal/signal_generate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  224) 	# echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  225) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  226) If there is an error in the expression, you'll get an 'Invalid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  227) argument' error when setting it, and the erroneous string along with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  228) an error message can be seen by looking at the filter e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  229) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  230) 	# cd /sys/kernel/debug/tracing/events/signal/signal_generate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  231) 	# echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  232) 	-bash: echo: write error: Invalid argument
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  233) 	# cat filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  234) 	((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  235) 	^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  236) 	parse_error: Field not found
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  238) Currently the caret ('^') for an error always appears at the beginning of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  239) the filter string; the error message should still be useful though
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  240) even without more accurate position info.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  241) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  242) 5.2.1 Filter limitations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  243) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  244) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  245) If a filter is placed on a string pointer ``(char *)`` that does not point
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  246) to a string on the ring buffer, but instead points to kernel or user space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  247) memory, then, for safety reasons, at most 1024 bytes of the content is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  248) copied onto a temporary buffer to do the compare. If the copy of the memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  249) faults (the pointer points to memory that should not be accessed), then the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  250) string compare will be treated as not matching.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  251) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  252) 5.3 Clearing filters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  253) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  254) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  255) To clear the filter for an event, write a '0' to the event's filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  256) file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  257) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  258) To clear the filters for all events in a subsystem, write a '0' to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  259) subsystem's filter file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  260) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  261) 5.3 Subsystem filters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  262) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  263) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  264) For convenience, filters for every event in a subsystem can be set or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  265) cleared as a group by writing a filter expression into the filter file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  266) at the root of the subsystem.  Note however, that if a filter for any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  267) event within the subsystem lacks a field specified in the subsystem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  268) filter, or if the filter can't be applied for any other reason, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  269) filter for that event will retain its previous setting.  This can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  270) result in an unintended mixture of filters which could lead to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  271) confusing (to the user who might think different filters are in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  272) effect) trace output.  Only filters that reference just the common
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  273) fields can be guaranteed to propagate successfully to all events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  274) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  275) Here are a few subsystem filter examples that also illustrate the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  276) above points:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  277) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  278) Clear the filters on all events in the sched subsystem::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  279) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  280) 	# cd /sys/kernel/debug/tracing/events/sched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  281) 	# echo 0 > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  282) 	# cat sched_switch/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  283) 	none
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  284) 	# cat sched_wakeup/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  285) 	none
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  286) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  287) Set a filter using only common fields for all events in the sched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  288) subsystem (all events end up with the same filter)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  289) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  290) 	# cd /sys/kernel/debug/tracing/events/sched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  291) 	# echo common_pid == 0 > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  292) 	# cat sched_switch/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  293) 	common_pid == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  294) 	# cat sched_wakeup/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  295) 	common_pid == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  296) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  297) Attempt to set a filter using a non-common field for all events in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  298) sched subsystem (all events but those that have a prev_pid field retain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  299) their old filters)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  300) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  301) 	# cd /sys/kernel/debug/tracing/events/sched
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  302) 	# echo prev_pid == 0 > filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  303) 	# cat sched_switch/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  304) 	prev_pid == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  305) 	# cat sched_wakeup/filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  306) 	common_pid == 0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  307) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  308) 5.4 PID filtering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  309) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  310) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  311) The set_event_pid file in the same directory as the top events directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  312) exists, will filter all events from tracing any task that does not have the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  313) PID listed in the set_event_pid file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  314) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  315) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  316) 	# cd /sys/kernel/debug/tracing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  317) 	# echo $$ > set_event_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  318) 	# echo 1 > events/enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  319) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  320) Will only trace events for the current task.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  321) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  322) To add more PIDs without losing the PIDs already included, use '>>'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  323) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  324) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  325) 	# echo 123 244 1 >> set_event_pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  326) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  327) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  328) 6. Event triggers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  329) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  330) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  331) Trace events can be made to conditionally invoke trigger 'commands'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  332) which can take various forms and are described in detail below;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  333) examples would be enabling or disabling other trace events or invoking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  334) a stack trace whenever the trace event is hit.  Whenever a trace event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  335) with attached triggers is invoked, the set of trigger commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  336) associated with that event is invoked.  Any given trigger can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  337) additionally have an event filter of the same form as described in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  338) section 5 (Event filtering) associated with it - the command will only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  339) be invoked if the event being invoked passes the associated filter.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  340) If no filter is associated with the trigger, it always passes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  341) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  342) Triggers are added to and removed from a particular event by writing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  343) trigger expressions to the 'trigger' file for the given event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  344) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  345) A given event can have any number of triggers associated with it,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  346) subject to any restrictions that individual commands may have in that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  347) regard.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  348) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  349) Event triggers are implemented on top of "soft" mode, which means that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  350) whenever a trace event has one or more triggers associated with it,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  351) the event is activated even if it isn't actually enabled, but is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  352) disabled in a "soft" mode.  That is, the tracepoint will be called,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  353) but just will not be traced, unless of course it's actually enabled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  354) This scheme allows triggers to be invoked even for events that aren't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  355) enabled, and also allows the current event filter implementation to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  356) used for conditionally invoking triggers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  357) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  358) The syntax for event triggers is roughly based on the syntax for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  359) set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  360) section of Documentation/trace/ftrace.rst), but there are major
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  361) differences and the implementation isn't currently tied to it in any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  362) way, so beware about making generalizations between the two.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  363) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  364) .. Note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  365)      Writing into trace_marker (See Documentation/trace/ftrace.rst)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  366)      can also enable triggers that are written into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  367)      /sys/kernel/tracing/events/ftrace/print/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  368) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  369) 6.1 Expression syntax
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  370) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  371) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  372) Triggers are added by echoing the command to the 'trigger' file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  373) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  374)   # echo 'command[:count] [if filter]' > trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  375) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  376) Triggers are removed by echoing the same command but starting with '!'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  377) to the 'trigger' file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  378) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  379)   # echo '!command[:count] [if filter]' > trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  380) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  381) The [if filter] part isn't used in matching commands when removing, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  382) leaving that off in a '!' command will accomplish the same thing as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  383) having it in.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  384) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  385) The filter syntax is the same as that described in the 'Event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  386) filtering' section above.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  387) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  388) For ease of use, writing to the trigger file using '>' currently just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  389) adds or removes a single trigger and there's no explicit '>>' support
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  390) ('>' actually behaves like '>>') or truncation support to remove all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  391) triggers (you have to use '!' for each one added.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  392) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  393) 6.2 Supported trigger commands
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  394) ------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  395) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  396) The following commands are supported:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  397) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  398) - enable_event/disable_event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  399) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  400)   These commands can enable or disable another trace event whenever
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  401)   the triggering event is hit.  When these commands are registered,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  402)   the other trace event is activated, but disabled in a "soft" mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  403)   That is, the tracepoint will be called, but just will not be traced.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  404)   The event tracepoint stays in this mode as long as there's a trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  405)   in effect that can trigger it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  406) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  407)   For example, the following trigger causes kmalloc events to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  408)   traced when a read system call is entered, and the :1 at the end
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  409)   specifies that this enablement happens only once::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  410) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  411) 	  # echo 'enable_event:kmem:kmalloc:1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  412) 	      /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  413) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  414)   The following trigger causes kmalloc events to stop being traced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  415)   when a read system call exits.  This disablement happens on every
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  416)   read system call exit::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  417) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  418) 	  # echo 'disable_event:kmem:kmalloc' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  419) 	      /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  420) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  421)   The format is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  422) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  423)       enable_event:<system>:<event>[:count]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  424)       disable_event:<system>:<event>[:count]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  425) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  426)   To remove the above commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  427) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  428) 	  # echo '!enable_event:kmem:kmalloc:1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  429) 	      /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  430) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  431) 	  # echo '!disable_event:kmem:kmalloc' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  432) 	      /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  433) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  434)   Note that there can be any number of enable/disable_event triggers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  435)   per triggering event, but there can only be one trigger per
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  436)   triggered event. e.g. sys_enter_read can have triggers enabling both
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  437)   kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  438)   versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  439)   bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  440)   could be combined into a single filter on kmem:kmalloc though).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  441) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  442) - stacktrace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  443) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  444)   This command dumps a stacktrace in the trace buffer whenever the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  445)   triggering event occurs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  446) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  447)   For example, the following trigger dumps a stacktrace every time the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  448)   kmalloc tracepoint is hit::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  449) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  450) 	  # echo 'stacktrace' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  451) 		/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  452) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  453)   The following trigger dumps a stacktrace the first 5 times a kmalloc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  454)   request happens with a size >= 64K::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  455) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  456) 	  # echo 'stacktrace:5 if bytes_req >= 65536' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  457) 		/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  458) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  459)   The format is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  460) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  461)       stacktrace[:count]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  462) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  463)   To remove the above commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  464) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  465) 	  # echo '!stacktrace' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  466) 		/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  467) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  468) 	  # echo '!stacktrace:5 if bytes_req >= 65536' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  469) 		/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  470) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  471)   The latter can also be removed more simply by the following (without
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  472)   the filter)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  473) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  474) 	  # echo '!stacktrace:5' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  475) 		/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  476) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  477)   Note that there can be only one stacktrace trigger per triggering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  478)   event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  479) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  480) - snapshot
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  481) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  482)   This command causes a snapshot to be triggered whenever the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  483)   triggering event occurs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  484) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  485)   The following command creates a snapshot every time a block request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  486)   queue is unplugged with a depth > 1.  If you were tracing a set of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  487)   events or functions at the time, the snapshot trace buffer would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  488)   capture those events when the trigger event occurred::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  489) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  490) 	  # echo 'snapshot if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  491) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  492) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  493)   To only snapshot once::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  494) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  495) 	  # echo 'snapshot:1 if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  496) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  497) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  498)   To remove the above commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  499) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  500) 	  # echo '!snapshot if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  501) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  502) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  503) 	  # echo '!snapshot:1 if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  504) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  505) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  506)   Note that there can be only one snapshot trigger per triggering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  507)   event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  508) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  509) - traceon/traceoff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  510) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  511)   These commands turn tracing on and off when the specified events are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  512)   hit. The parameter determines how many times the tracing system is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  513)   turned on and off. If unspecified, there is no limit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  514) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  515)   The following command turns tracing off the first time a block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  516)   request queue is unplugged with a depth > 1.  If you were tracing a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  517)   set of events or functions at the time, you could then examine the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  518)   trace buffer to see the sequence of events that led up to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  519)   trigger event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  520) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  521) 	  # echo 'traceoff:1 if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  522) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  523) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  524)   To always disable tracing when nr_rq  > 1::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  525) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  526) 	  # echo 'traceoff if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  527) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  528) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  529)   To remove the above commands::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  530) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  531) 	  # echo '!traceoff:1 if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  532) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  533) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  534) 	  # echo '!traceoff if nr_rq > 1' > \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  535) 		/sys/kernel/debug/tracing/events/block/block_unplug/trigger
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  536) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  537)   Note that there can be only one traceon or traceoff trigger per
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  538)   triggering event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  539) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  540) - hist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  541) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  542)   This command aggregates event hits into a hash table keyed on one or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  543)   more trace event format fields (or stacktrace) and a set of running
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  544)   totals derived from one or more trace event format fields and/or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  545)   event counts (hitcount).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  546) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  547)   See Documentation/trace/histogram.rst for details and examples.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  548) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  549) 7. In-kernel trace event API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  550) ============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  551) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  552) In most cases, the command-line interface to trace events is more than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  553) sufficient.  Sometimes, however, applications might find the need for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  554) more complex relationships than can be expressed through a simple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  555) series of linked command-line expressions, or putting together sets of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  556) commands may be simply too cumbersome.  An example might be an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  557) application that needs to 'listen' to the trace stream in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  558) maintain an in-kernel state machine detecting, for instance, when an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  559) illegal kernel state occurs in the scheduler.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  560) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  561) The trace event subsystem provides an in-kernel API allowing modules
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  562) or other kernel code to generate user-defined 'synthetic' events at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  563) will, which can be used to either augment the existing trace stream
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  564) and/or signal that a particular important state has occurred.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  565) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  566) A similar in-kernel API is also available for creating kprobe and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  567) kretprobe events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  568) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  569) Both the synthetic event and k/ret/probe event APIs are built on top
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  570) of a lower-level "dynevent_cmd" event command API, which is also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  571) available for more specialized applications, or as the basis of other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  572) higher-level trace event APIs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  573) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  574) The API provided for these purposes is describe below and allows the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  575) following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  576) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  577)   - dynamically creating synthetic event definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  578)   - dynamically creating kprobe and kretprobe event definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  579)   - tracing synthetic events from in-kernel code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  580)   - the low-level "dynevent_cmd" API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  581) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  582) 7.1 Dyamically creating synthetic event definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  583) ---------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  584) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  585) There are a couple ways to create a new synthetic event from a kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  586) module or other kernel code.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  587) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  588) The first creates the event in one step, using synth_event_create().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  589) In this method, the name of the event to create and an array defining
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  590) the fields is supplied to synth_event_create().  If successful, a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  591) synthetic event with that name and fields will exist following that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  592) call.  For example, to create a new "schedtest" synthetic event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  593) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  594)   ret = synth_event_create("schedtest", sched_fields,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  595)                            ARRAY_SIZE(sched_fields), THIS_MODULE);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  596) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  597) The sched_fields param in this example points to an array of struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  598) synth_field_desc, each of which describes an event field by type and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  599) name::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  600) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  601)   static struct synth_field_desc sched_fields[] = {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  602)         { .type = "pid_t",              .name = "next_pid_field" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  603)         { .type = "char[16]",           .name = "next_comm_field" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  604)         { .type = "u64",                .name = "ts_ns" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  605)         { .type = "u64",                .name = "ts_ms" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  606)         { .type = "unsigned int",       .name = "cpu" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  607)         { .type = "char[64]",           .name = "my_string_field" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  608)         { .type = "int",                .name = "my_int_field" },
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  609)   };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  610) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  611) See synth_field_size() for available types.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  612) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  613) If field_name contains [n], the field is considered to be a static array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  614) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  615) If field_names contains[] (no subscript), the field is considered to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  616) be a dynamic array, which will only take as much space in the event as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  617) is required to hold the array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  618) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  619) Because space for an event is reserved before assigning field values
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  620) to the event, using dynamic arrays implies that the piecewise
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  621) in-kernel API described below can't be used with dynamic arrays.  The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  622) other non-piecewise in-kernel APIs can, however, be used with dynamic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  623) arrays.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  624) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  625) If the event is created from within a module, a pointer to the module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  626) must be passed to synth_event_create().  This will ensure that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  627) trace buffer won't contain unreadable events when the module is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  628) removed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  629) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  630) At this point, the event object is ready to be used for generating new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  631) events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  632) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  633) In the second method, the event is created in several steps.  This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  634) allows events to be created dynamically and without the need to create
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  635) and populate an array of fields beforehand.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  636) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  637) To use this method, an empty or partially empty synthetic event should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  638) first be created using synth_event_gen_cmd_start() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  639) synth_event_gen_cmd_array_start().  For synth_event_gen_cmd_start(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  640) the name of the event along with one or more pairs of args each pair
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  641) representing a 'type field_name;' field specification should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  642) supplied.  For synth_event_gen_cmd_array_start(), the name of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  643) event along with an array of struct synth_field_desc should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  644) supplied. Before calling synth_event_gen_cmd_start() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  645) synth_event_gen_cmd_array_start(), the user should create and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  646) initialize a dynevent_cmd object using synth_event_cmd_init().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  647) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  648) For example, to create a new "schedtest" synthetic event with two
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  649) fields::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  650) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  651)   struct dynevent_cmd cmd;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  652)   char *buf;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  653) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  654)   /* Create a buffer to hold the generated command */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  655)   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  656) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  657)   /* Before generating the command, initialize the cmd object */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  658)   synth_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  659) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  660)   ret = synth_event_gen_cmd_start(&cmd, "schedtest", THIS_MODULE,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  661)                                   "pid_t", "next_pid_field",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  662)                                   "u64", "ts_ns");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  663) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  664) Alternatively, using an array of struct synth_field_desc fields
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  665) containing the same information::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  666) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  667)   ret = synth_event_gen_cmd_array_start(&cmd, "schedtest", THIS_MODULE,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  668)                                         fields, n_fields);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  669) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  670) Once the synthetic event object has been created, it can then be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  671) populated with more fields.  Fields are added one by one using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  672) synth_event_add_field(), supplying the dynevent_cmd object, a field
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  673) type, and a field name.  For example, to add a new int field named
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  674) "intfield", the following call should be made::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  675) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  676)   ret = synth_event_add_field(&cmd, "int", "intfield");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  677) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  678) See synth_field_size() for available types. If field_name contains [n]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  679) the field is considered to be an array.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  680) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  681) A group of fields can also be added all at once using an array of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  682) synth_field_desc with add_synth_fields().  For example, this would add
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  683) just the first four sched_fields::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  684) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  685)   ret = synth_event_add_fields(&cmd, sched_fields, 4);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  686) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  687) If you already have a string of the form 'type field_name',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  688) synth_event_add_field_str() can be used to add it as-is; it will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  689) also automatically append a ';' to the string.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  690) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  691) Once all the fields have been added, the event should be finalized and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  692) registered by calling the synth_event_gen_cmd_end() function::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  693) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  694)   ret = synth_event_gen_cmd_end(&cmd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  695) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  696) At this point, the event object is ready to be used for tracing new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  697) events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  698) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  699) 7.2 Tracing synthetic events from in-kernel code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  700) ------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  701) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  702) To trace a synthetic event, there are several options.  The first
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  703) option is to trace the event in one call, using synth_event_trace()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  704) with a variable number of values, or synth_event_trace_array() with an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  705) array of values to be set.  A second option can be used to avoid the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  706) need for a pre-formed array of values or list of arguments, via
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  707) synth_event_trace_start() and synth_event_trace_end() along with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  708) synth_event_add_next_val() or synth_event_add_val() to add the values
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  709) piecewise.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  710) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  711) 7.2.1 Tracing a synthetic event all at once
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  712) -------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  713) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  714) To trace a synthetic event all at once, the synth_event_trace() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  715) synth_event_trace_array() functions can be used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  716) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  717) The synth_event_trace() function is passed the trace_event_file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  718) representing the synthetic event (which can be retrieved using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  719) trace_get_event_file() using the synthetic event name, "synthetic" as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  720) the system name, and the trace instance name (NULL if using the global
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  721) trace array)), along with an variable number of u64 args, one for each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  722) synthetic event field, and the number of values being passed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  723) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  724) So, to trace an event corresponding to the synthetic event definition
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  725) above, code like the following could be used::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  726) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  727)   ret = synth_event_trace(create_synth_test, 7, /* number of values */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  728)                           444,             /* next_pid_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  729)                           (u64)"clackers", /* next_comm_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  730)                           1000000,         /* ts_ns */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  731)                           1000,            /* ts_ms */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  732)                           smp_processor_id(),/* cpu */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  733)                           (u64)"Thneed",   /* my_string_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  734)                           999);            /* my_int_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  735) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  736) All vals should be cast to u64, and string vals are just pointers to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  737) strings, cast to u64.  Strings will be copied into space reserved in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  738) the event for the string, using these pointers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  739) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  740) Alternatively, the synth_event_trace_array() function can be used to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  741) accomplish the same thing.  It is passed the trace_event_file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  742) representing the synthetic event (which can be retrieved using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  743) trace_get_event_file() using the synthetic event name, "synthetic" as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  744) the system name, and the trace instance name (NULL if using the global
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  745) trace array)), along with an array of u64, one for each synthetic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  746) event field.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  747) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  748) To trace an event corresponding to the synthetic event definition
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  749) above, code like the following could be used::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  750) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  751)   u64 vals[7];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  752) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  753)   vals[0] = 777;                  /* next_pid_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  754)   vals[1] = (u64)"tiddlywinks";   /* next_comm_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  755)   vals[2] = 1000000;              /* ts_ns */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  756)   vals[3] = 1000;                 /* ts_ms */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  757)   vals[4] = smp_processor_id();   /* cpu */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  758)   vals[5] = (u64)"thneed";        /* my_string_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  759)   vals[6] = 398;                  /* my_int_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  760) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  761) The 'vals' array is just an array of u64, the number of which must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  762) match the number of field in the synthetic event, and which must be in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  763) the same order as the synthetic event fields.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  764) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  765) All vals should be cast to u64, and string vals are just pointers to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  766) strings, cast to u64.  Strings will be copied into space reserved in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  767) the event for the string, using these pointers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  768) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  769) In order to trace a synthetic event, a pointer to the trace event file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  770) is needed.  The trace_get_event_file() function can be used to get
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  771) it - it will find the file in the given trace instance (in this case
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  772) NULL since the top trace array is being used) while at the same time
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  773) preventing the instance containing it from going away::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  774) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  775)        schedtest_event_file = trace_get_event_file(NULL, "synthetic",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  776)                                                    "schedtest");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  777) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  778) Before tracing the event, it should be enabled in some way, otherwise
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  779) the synthetic event won't actually show up in the trace buffer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  780) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  781) To enable a synthetic event from the kernel, trace_array_set_clr_event()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  782) can be used (which is not specific to synthetic events, so does need
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  783) the "synthetic" system name to be specified explicitly).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  784) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  785) To enable the event, pass 'true' to it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  786) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  787)        trace_array_set_clr_event(schedtest_event_file->tr,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  788)                                  "synthetic", "schedtest", true);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  789) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  790) To disable it pass false::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  791) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  792)        trace_array_set_clr_event(schedtest_event_file->tr,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  793)                                  "synthetic", "schedtest", false);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  794) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  795) Finally, synth_event_trace_array() can be used to actually trace the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  796) event, which should be visible in the trace buffer afterwards::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  797) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  798)        ret = synth_event_trace_array(schedtest_event_file, vals,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  799)                                      ARRAY_SIZE(vals));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  800) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  801) To remove the synthetic event, the event should be disabled, and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  802) trace instance should be 'put' back using trace_put_event_file()::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  803) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  804)        trace_array_set_clr_event(schedtest_event_file->tr,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  805)                                  "synthetic", "schedtest", false);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  806)        trace_put_event_file(schedtest_event_file);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  807) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  808) If those have been successful, synth_event_delete() can be called to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  809) remove the event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  810) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  811)        ret = synth_event_delete("schedtest");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  812) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  813) 7.2.2 Tracing a synthetic event piecewise
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  814) -----------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  815) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  816) To trace a synthetic using the piecewise method described above, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  817) synth_event_trace_start() function is used to 'open' the synthetic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  818) event trace::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  819) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  820)        struct synth_trace_state trace_state;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  821) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  822)        ret = synth_event_trace_start(schedtest_event_file, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  823) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  824) It's passed the trace_event_file representing the synthetic event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  825) using the same methods as described above, along with a pointer to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  826) struct synth_trace_state object, which will be zeroed before use and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  827) used to maintain state between this and following calls.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  828) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  829) Once the event has been opened, which means space for it has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  830) reserved in the trace buffer, the individual fields can be set.  There
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  831) are two ways to do that, either one after another for each field in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  832) the event, which requires no lookups, or by name, which does.  The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  833) tradeoff is flexibility in doing the assignments vs the cost of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  834) lookup per field.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  835) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  836) To assign the values one after the other without lookups,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  837) synth_event_add_next_val() should be used.  Each call is passed the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  838) same synth_trace_state object used in the synth_event_trace_start(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  839) along with the value to set the next field in the event.  After each
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  840) field is set, the 'cursor' points to the next field, which will be set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  841) by the subsequent call, continuing until all the fields have been set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  842) in order.  The same sequence of calls as in the above examples using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  843) this method would be (without error-handling code)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  844) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  845)        /* next_pid_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  846)        ret = synth_event_add_next_val(777, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  847) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  848)        /* next_comm_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  849)        ret = synth_event_add_next_val((u64)"slinky", &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  850) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  851)        /* ts_ns */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  852)        ret = synth_event_add_next_val(1000000, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  853) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  854)        /* ts_ms */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  855)        ret = synth_event_add_next_val(1000, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  856) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  857)        /* cpu */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  858)        ret = synth_event_add_next_val(smp_processor_id(), &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  859) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  860)        /* my_string_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  861)        ret = synth_event_add_next_val((u64)"thneed_2.01", &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  862) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  863)        /* my_int_field */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  864)        ret = synth_event_add_next_val(395, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  865) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  866) To assign the values in any order, synth_event_add_val() should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  867) used.  Each call is passed the same synth_trace_state object used in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  868) the synth_event_trace_start(), along with the field name of the field
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  869) to set and the value to set it to.  The same sequence of calls as in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  870) the above examples using this method would be (without error-handling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  871) code)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  872) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  873)        ret = synth_event_add_val("next_pid_field", 777, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  874)        ret = synth_event_add_val("next_comm_field", (u64)"silly putty",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  875)                                  &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  876)        ret = synth_event_add_val("ts_ns", 1000000, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  877)        ret = synth_event_add_val("ts_ms", 1000, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  878)        ret = synth_event_add_val("cpu", smp_processor_id(), &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  879)        ret = synth_event_add_val("my_string_field", (u64)"thneed_9",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  880)                                  &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  881)        ret = synth_event_add_val("my_int_field", 3999, &trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  882) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  883) Note that synth_event_add_next_val() and synth_event_add_val() are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  884) incompatible if used within the same trace of an event - either one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  885) can be used but not both at the same time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  886) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  887) Finally, the event won't be actually traced until it's 'closed',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  888) which is done using synth_event_trace_end(), which takes only the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  889) struct synth_trace_state object used in the previous calls::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  890) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  891)        ret = synth_event_trace_end(&trace_state);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  892) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  893) Note that synth_event_trace_end() must be called at the end regardless
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  894) of whether any of the add calls failed (say due to a bad field name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  895) being passed in).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  896) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  897) 7.3 Dyamically creating kprobe and kretprobe event definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  898) --------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  899) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  900) To create a kprobe or kretprobe trace event from kernel code, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  901) kprobe_event_gen_cmd_start() or kretprobe_event_gen_cmd_start()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  902) functions can be used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  903) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  904) To create a kprobe event, an empty or partially empty kprobe event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  905) should first be created using kprobe_event_gen_cmd_start().  The name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  906) of the event and the probe location should be specfied along with one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  907) or args each representing a probe field should be supplied to this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  908) function.  Before calling kprobe_event_gen_cmd_start(), the user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  909) should create and initialize a dynevent_cmd object using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  910) kprobe_event_cmd_init().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  911) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  912) For example, to create a new "schedtest" kprobe event with two fields::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  913) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  914)   struct dynevent_cmd cmd;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  915)   char *buf;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  916) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  917)   /* Create a buffer to hold the generated command */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  918)   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  919) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  920)   /* Before generating the command, initialize the cmd object */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  921)   kprobe_event_cmd_init(&cmd, buf, MAX_DYNEVENT_CMD_LEN);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  922) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  923)   /*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  924)    * Define the gen_kprobe_test event with the first 2 kprobe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  925)    * fields.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  926)    */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  927)   ret = kprobe_event_gen_cmd_start(&cmd, "gen_kprobe_test", "do_sys_open",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  928)                                    "dfd=%ax", "filename=%dx");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  929) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  930) Once the kprobe event object has been created, it can then be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  931) populated with more fields.  Fields can be added using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  932) kprobe_event_add_fields(), supplying the dynevent_cmd object along
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  933) with a variable arg list of probe fields.  For example, to add a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  934) couple additional fields, the following call could be made::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  935) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  936)   ret = kprobe_event_add_fields(&cmd, "flags=%cx", "mode=+4($stack)");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  937) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  938) Once all the fields have been added, the event should be finalized and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  939) registered by calling the kprobe_event_gen_cmd_end() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  940) kretprobe_event_gen_cmd_end() functions, depending on whether a kprobe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  941) or kretprobe command was started::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  942) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  943)   ret = kprobe_event_gen_cmd_end(&cmd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  944) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  945) or::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  946) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  947)   ret = kretprobe_event_gen_cmd_end(&cmd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  948) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  949) At this point, the event object is ready to be used for tracing new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  950) events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  951) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  952) Similarly, a kretprobe event can be created using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  953) kretprobe_event_gen_cmd_start() with a probe name and location and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  954) additional params such as $retval::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  955) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  956)   ret = kretprobe_event_gen_cmd_start(&cmd, "gen_kretprobe_test",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  957)                                       "do_sys_open", "$retval");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  958) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  959) Similar to the synthetic event case, code like the following can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  960) used to enable the newly created kprobe event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  961) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  962)   gen_kprobe_test = trace_get_event_file(NULL, "kprobes", "gen_kprobe_test");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  963) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  964)   ret = trace_array_set_clr_event(gen_kprobe_test->tr,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  965)                                   "kprobes", "gen_kprobe_test", true);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  966) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  967) Finally, also similar to synthetic events, the following code can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  968) used to give the kprobe event file back and delete the event::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  969) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  970)   trace_put_event_file(gen_kprobe_test);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  971) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  972)   ret = kprobe_event_delete("gen_kprobe_test");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  973) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  974) 7.4 The "dynevent_cmd" low-level API
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  975) ------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  976) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  977) Both the in-kernel synthetic event and kprobe interfaces are built on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  978) top of a lower-level "dynevent_cmd" interface.  This interface is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  979) meant to provide the basis for higher-level interfaces such as the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  980) synthetic and kprobe interfaces, which can be used as examples.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  981) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  982) The basic idea is simple and amounts to providing a general-purpose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  983) layer that can be used to generate trace event commands.  The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  984) generated command strings can then be passed to the command-parsing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  985) and event creation code that already exists in the trace event
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  986) subystem for creating the corresponding trace events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  987) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  988) In a nutshell, the way it works is that the higher-level interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  989) code creates a struct dynevent_cmd object, then uses a couple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  990) functions, dynevent_arg_add() and dynevent_arg_pair_add() to build up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  991) a command string, which finally causes the command to be executed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  992) using the dynevent_create() function.  The details of the interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  993) are described below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  994) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  995) The first step in building a new command string is to create and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  996) initialize an instance of a dynevent_cmd.  Here, for instance, we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  997) create a dynevent_cmd on the stack and initialize it::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  998) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  999)   struct dynevent_cmd cmd;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1000)   char *buf;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1001)   int ret;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1002) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1003)   buf = kzalloc(MAX_DYNEVENT_CMD_LEN, GFP_KERNEL);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1004) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1005)   dynevent_cmd_init(cmd, buf, maxlen, DYNEVENT_TYPE_FOO,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1006)                     foo_event_run_command);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1007) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1008) The dynevent_cmd initialization needs to be given a user-specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1009) buffer and the length of the buffer (MAX_DYNEVENT_CMD_LEN can be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1010) for this purpose - at 2k it's generally too big to be comfortably put
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1011) on the stack, so is dynamically allocated), a dynevent type id, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1012) is meant to be used to check that further API calls are for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1013) correct command type, and a pointer to an event-specific run_command()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1014) callback that will be called to actually execute the event-specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1015) command function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1016) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1017) Once that's done, the command string can by built up by successive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1018) calls to argument-adding functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1019) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1020) To add a single argument, define and initialize a struct dynevent_arg
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1021) or struct dynevent_arg_pair object.  Here's an example of the simplest
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1022) possible arg addition, which is simply to append the given string as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1023) a whitespace-separated argument to the command::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1024) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1025)   struct dynevent_arg arg;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1026) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1027)   dynevent_arg_init(&arg, NULL, 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1028) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1029)   arg.str = name;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1030) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1031)   ret = dynevent_arg_add(cmd, &arg);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1032) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1033) The arg object is first initialized using dynevent_arg_init() and in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1034) this case the parameters are NULL or 0, which means there's no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1035) optional sanity-checking function or separator appended to the end of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1036) the arg.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1037) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1038) Here's another more complicated example using an 'arg pair', which is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1039) used to create an argument that consists of a couple components added
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1040) together as a unit, for example, a 'type field_name;' arg or a simple
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1041) expression arg e.g. 'flags=%cx'::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1042) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1043)   struct dynevent_arg_pair arg_pair;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1044) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1045)   dynevent_arg_pair_init(&arg_pair, dynevent_foo_check_arg_fn, 0, ';');
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1046) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1047)   arg_pair.lhs = type;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1048)   arg_pair.rhs = name;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1049) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1050)   ret = dynevent_arg_pair_add(cmd, &arg_pair);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1051) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1052) Again, the arg_pair is first initialized, in this case with a callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1053) function used to check the sanity of the args (for example, that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1054) neither part of the pair is NULL), along with a character to be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1055) to add an operator between the pair (here none) and a separator to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1056) appended onto the end of the arg pair (here ';').
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1057) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1058) There's also a dynevent_str_add() function that can be used to simply
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1059) add a string as-is, with no spaces, delimeters, or arg check.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1060) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1061) Any number of dynevent_*_add() calls can be made to build up the string
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1062) (until its length surpasses cmd->maxlen).  When all the arguments have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1063) been added and the command string is complete, the only thing left to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1064) do is run the command, which happens by simply calling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1065) dynevent_create()::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1066) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1067)   ret = dynevent_create(&cmd);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1068) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1069) At that point, if the return value is 0, the dynamic event has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1070) created and is ready to use.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1071) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1072) See the dynevent_cmd function definitions themselves for the details
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1073) of the API.