^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) perf-c2c(1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) NAME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ----
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) perf-c2c - Shared Data C2C/HITM Analyzer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) SYNOPSIS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) [verse]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) 'perf c2c record' [<options>] <command>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) 'perf c2c record' [<options>] -- [<record command options>] <command>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) 'perf c2c report' [<options>]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) DESCRIPTION
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) C2C stands for Cache To Cache.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) The perf c2c tool provides the means for Shared Data C2C/HITM analysis. It allows
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) you to track down cacheline contention.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) On x86, the tool is based on load latency and precise store facility events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) with the thresholding feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) These events provide:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) - memory address of the access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) - type of the access (load and store details)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) - latency (in cycles) of the load access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) The c2c tool provides the means to record this data and report back access details
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) for the cachelines with the highest contention, that is, the highest number of HITM accesses.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) The basic workflow with this tool follows the standard record/report phases.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) The user runs the record command to record event data and the report command to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) display it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) RECORD OPTIONS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) -e::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) --event=::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) Select the PMU event. Use 'perf c2c record -e list'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) to list available events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) -v::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) --verbose::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) Be more verbose (show counter open errors, etc).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) -l::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) --ldlat::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) Configure mem-loads latency. (x86 only)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) -k::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) --all-kernel::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) Configure all used events to run in kernel space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) -u::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) --all-user::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) Configure all used events to run in user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) REPORT OPTIONS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) -k::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) --vmlinux=<file>::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) vmlinux pathname
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) -v::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) --verbose::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) Be more verbose (show counter open errors, etc).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) -i::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) --input::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) Specify the input file to process.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) -N::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) --node-info::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Show extra node info in report (see NODE INFO section)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) -c::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) --coalesce::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) Specify sorting fields for single cacheline display.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) The following fields are available: tid,pid,iaddr,dso
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) (see COALESCE)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) -g::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) --call-graph::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) Set up callchain parameters.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) Please refer to perf-report man page for details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) --stdio::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) Force the stdio output (see STDIO OUTPUT)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) --stats::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) Display only statistic tables and force stdio mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) --full-symbols::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) Display full length of symbols.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) --no-source::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) Do not display Source:Line column.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) --show-all::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) Show all captured HITM lines, regardless of the HITM % 0.0005 limit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) -f::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) --force::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) Don't do ownership validation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) -d::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) --display::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) Switch to the HITM type (rmt, lcl) to display and sort on. Total HITMs are used by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) --stitch-lbr::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) Show callgraph with stitched LBRs, which may have more complete
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) callgraph. The perf.data file must have been obtained using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) perf c2c record --call-graph lbr.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) Disabled by default. In common cases with call stack overflows,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) it can recreate better call stacks than the default lbr call stack
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) output. But this approach is not foolproof. There can be cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) where it creates incorrect call stacks from incorrect matches.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) The known limitations include exception handling, such as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) setjmp/longjmp, where calls and returns do not match.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) C2C RECORD
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) The perf c2c record command sets up options related to HITM cacheline analysis
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) and calls the standard perf record command.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) The following perf record options are configured by default:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) (check perf record man page for details)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) -W,-d,--phys-data,--sample-cpu
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) Unless specified otherwise with the '-e' option, the following events are monitored by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) default on x86:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) cpu/mem-loads,ldlat=30/P
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) cpu/mem-stores/P
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) and the following on PowerPC:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) cpu/mem-loads/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) cpu/mem-stores/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) The user can pass any 'perf record' option after the '--' mark. For example, to enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) callchains and system-wide monitoring:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) $ perf c2c record -- -g -a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) Please check the RECORD OPTIONS section for specific c2c record options.
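
The argument layout described above can be sketched as a small Python helper
that assembles a 'perf c2c record' command line. The helper name
build_c2c_record_argv and its parameters are hypothetical and exist only for
this illustration; only the '--' separator and the -g and -a options come from
the text above.

```python
def build_c2c_record_argv(c2c_options, record_options, command):
    """Assemble argv in the order:
    perf c2c record [<options>] -- [<record command options>] <command>

    Hypothetical helper for illustration; it is not part of perf.
    """
    argv = ["perf", "c2c", "record"]
    argv += list(c2c_options)     # options handled by 'perf c2c record' itself
    argv.append("--")             # everything after this is passed to 'perf record'
    argv += list(record_options)  # plain 'perf record' options
    argv += list(command)         # optional workload to run
    return argv


# Mirrors the example above: callchains (-g) and system-wide monitoring (-a).
print(" ".join(build_c2c_record_argv([], ["-g", "-a"], [])))
```

On a machine where perf is installed, the returned list could be handed to
subprocess.run() unchanged.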
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) C2C REPORT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) The perf c2c report command displays shared data analysis. It comes in two
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) display modes: stdio and tui (default).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) The report command workflow is as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) - sort all the data based on the cacheline address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) - store access details for each cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) - sort all cachelines based on user settings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) - display data
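
The steps above can be illustrated with a minimal Python sketch. It is not the
perf implementation: the sample format (data address, HITM flag), the function
name cacheline_report and the 64-byte cacheline size are all assumptions made
for this example.

```python
from collections import defaultdict

CACHELINE_SIZE = 64  # assumption: 64-byte cachelines


def cacheline_report(samples):
    """Group samples by cacheline and rank cachelines by HITM count.

    samples: iterable of (data_addr, is_hitm) pairs (hypothetical format).
    """
    lines = defaultdict(lambda: {"records": 0, "hitm": 0,
                                 "offsets": defaultdict(int)})
    for addr, is_hitm in samples:
        base = addr & ~(CACHELINE_SIZE - 1)   # cacheline address
        entry = lines[base]
        entry["records"] += 1                 # every access to this cacheline
        entry["hitm"] += int(is_hitm)         # HITM accesses only
        entry["offsets"][addr - base] += 1    # per-offset access details
    # Rank the cachelines; here the "user setting" is the HITM count.
    return sorted(lines.items(), key=lambda kv: kv[1]["hitm"], reverse=True)


samples = [(0x1000, True), (0x1008, True), (0x1040, False), (0x1044, True)]
for base, entry in cacheline_report(samples):
    print(hex(base), entry["hitm"], entry["records"], dict(entry["offsets"]))
```

The first element of the result is the most contended cacheline under this
ranking, which corresponds to the top row of the cacheline list.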
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) In general, perf c2c report output consists of 2 basic views:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) 1) most expensive cachelines list
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) 2) offsets details for each cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) For each cacheline in the 1) list, the following data is displayed:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) (Both stdio and TUI modes follow the same fields output)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) Index
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) - zero-based index to identify the cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) Cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) - cacheline address (hex number)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) Rmt/Lcl Hitm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) - cacheline percentage of all Remote/Local HITM accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) LLC Load Hitm - Total, LclHitm, RmtHitm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) - count of Total/Local/Remote load HITMs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) Total records
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) - sum of all cacheline accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) Total loads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) - sum of all load accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) Total stores
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) - sum of all store accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) Store Reference - L1Hit, L1Miss
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) L1Hit - store accesses that hit L1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) L1Miss - store accesses that missed L1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) Core Load Hit - FB, L1, L2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) - count of load hits in FB (Fill Buffer), L1 and L2 cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) LLC Load Hit - LlcHit, LclHitm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) - count of LLC load accesses, includes LLC hits and LLC HITMs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) RMT Load Hit - RmtHit, RmtHitm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) - count of remote load accesses, includes remote hits and remote HITMs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) Load Dram - Lcl, Rmt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) - count of local and remote DRAM accesses
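
As a worked example of the 'Rmt/Lcl Hitm' percentage, the sketch below takes
one plausible reading of that column: each cacheline's HITM count expressed as
a percentage of the HITM total over all cachelines. The function name
hitm_share and the input dictionary are hypothetical, and the exact formula
used by perf is not stated in this page.

```python
def hitm_share(hitm_counts):
    """Return each cacheline's share of all HITM accesses, in percent.

    hitm_counts: dict mapping cacheline address -> HITM count
    (hypothetical input, e.g. the remote HITM counts per cacheline).
    """
    total = sum(hitm_counts.values())
    if total == 0:
        # No HITM accesses at all: every share is zero.
        return {addr: 0.0 for addr in hitm_counts}
    return {addr: 100.0 * count / total for addr, count in hitm_counts.items()}


shares = hitm_share({0x1000: 30, 0x1040: 10})
for addr, pct in shares.items():
    print(hex(addr), pct)
```

The same calculation would be applied separately to the remote and the local
HITM counts to obtain the two percentages.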
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) For each offset in the 2) list, the following data is displayed:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) HITM - Rmt, Lcl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) - % of Remote/Local HITM accesses for given offset within cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) Store Refs - L1 Hit, L1 Miss
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) - % of store accesses that hit/missed L1 for given offset within cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) Data address - Offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) - offset address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) Pid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) - pid of the process responsible for the accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) Tid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) - tid of the thread responsible for the accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) Code address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) - code address responsible for the accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) cycles - rmt hitm, lcl hitm, load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) - sum of cycles for given accesses - Remote/Local HITM and generic load
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) cpu cnt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) - number of cpus that participated in the access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) Symbol
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) - code symbol related to the 'Code address' value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) Shared Object
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) - shared object name related to the 'Code address' value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) Source:Line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) - source information related to the 'Code address' value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) Node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) - nodes participating in the access (see NODE INFO section)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) NODE INFO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) The 'Node' field displays the nodes that access the given cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) offset. Its output comes in 3 flavors:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) - node IDs separated by ','
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) - node IDs with stats for each ID, in the following format:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) Node{cpus %hitms %stores}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) - node IDs with a list of affected CPUs, in the following format:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) Node{cpu list}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) The user can switch between the above flavors with the -N option or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) use the 'n' key to switch interactively in TUI mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) COALESCE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) The user can specify how to sort offsets for a cacheline.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) The following fields are available and govern the final
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) set of output fields for the cacheline offsets output:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) tid - coalesced by process TIDs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) pid - coalesced by process PIDs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) iaddr - coalesced by code address, following fields are displayed:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) Code address, Code symbol, Shared Object, Source line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) dso - coalesced by shared object
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) By default, coalescing is set up with 'pid,iaddr'.
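
To make the idea of coalescing concrete, here is a minimal Python sketch that
merges access records sharing the same values for the selected fields. The
record layout (dicts with tid, pid, iaddr and dso keys) and the function name
coalesce are assumptions for illustration; the sketch only counts merged
accesses and does not reproduce the real report columns.

```python
from collections import Counter


def coalesce(accesses, fields=("pid", "iaddr")):
    """Count accesses that share the same values for the given fields.

    accesses: list of dicts with keys tid, pid, iaddr, dso (hypothetical).
    fields:   coalescing fields; defaults to the documented 'pid,iaddr'.
    """
    return Counter(tuple(access[f] for f in fields) for access in accesses)


accesses = [
    {"tid": 11, "pid": 10, "iaddr": 0x400500, "dso": "app"},
    {"tid": 12, "pid": 10, "iaddr": 0x400500, "dso": "app"},
    {"tid": 12, "pid": 10, "iaddr": 0x400620, "dso": "app"},
]

# Default 'pid,iaddr': the first two accesses collapse into one row.
print(coalesce(accesses))
# Coalescing by 'tid' instead groups the rows per thread.
print(coalesce(accesses, ("tid",)))
```

Fewer coalescing fields give fewer, broader rows; more fields give a finer
breakdown of who touches each offset.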
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) STDIO OUTPUT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) The stdio output displays data on standard output.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) The following tables are displayed:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) Trace Event Information
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) - overall statistics of memory accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) Global Shared Cache Line Event Information
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) - overall statistics on shared cachelines
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) Shared Data Cache Line Table
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) - list of most expensive cachelines
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) Shared Cache Line Distribution Pareto
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) - list of all accessed offsets for each cacheline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) TUI OUTPUT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) The TUI output provides an interactive interface to navigate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) through the cachelines list and to display offset details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) For details, please refer to the help window by pressing the '?' key.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) CREDITS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) Although Don Zickus, Dick Fowles and Joe Mario worked together
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) to get this implemented, we got lots of early help from Arnaldo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) Carvalho de Melo, Stephane Eranian, Jiri Olsa and Andi Kleen.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) C2C BLOG
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) Check Joe Mario's blog on the c2c tool for a detailed use case explanation:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) https://joemario.github.io/blog/2016/09/01/c2c-blog/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) SEE ALSO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) --------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) linkperf:perf-record[1], linkperf:perf-mem[1]