^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) =============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Cavium ThunderX2 SoC Performance Monitoring Unit (PMU UNCORE)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) The ThunderX2 SoC PMU consists of independent, system-wide, per-socket
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) PMUs such as the Level 3 Cache (L3C), DDR4 Memory Controller (DMC) and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Cavium Coherent Processor Interconnect (CCPI2).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) The DMC has 8 interleaved channels and the L3C has 16 interleaved tiles.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) Events are counted for the default channel (i.e. channel 0) and prorated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) to the total number of channels/tiles.
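
The prorating described above can be sketched as a simple multiplication. This is a minimal illustration, assuming the reported value is the channel 0 (or tile 0) count scaled by the number of channels or tiles given above. ``prorate`` is a hypothetical helper, not part of the driver or of perf, and the driver source remains the authority on the exact arithmetic:

```shell
# Hypothetical helper: scale a single-channel (or single-tile) count to an
# estimate for the whole device. Not part of the driver or of perf.
# Usage: prorate RAW_COUNT FACTOR
prorate() {
    echo $(( $1 * $2 ))
}

DMC_CHANNELS=8    # interleaved DMC channels, as stated above
L3C_TILES=16      # interleaved L3C tiles, as stated above

prorate 1000 "$DMC_CHANNELS"   # prints 8000
prorate 1000 "$L3C_TILES"      # prints 16000
```

The estimate is only as good as the interleaving: it presumes traffic is spread evenly, so channel 0 is representative of the others.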
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) The DMC and L3C support up to 4 counters, while the CCPI2 supports up to 8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) counters. Counters are independently programmable to different events and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) can be started and stopped individually. None of the counters support an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) overflow interrupt. DMC and L3C counters are 32-bit and are polled every 2 seconds
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) to catch wraparound. The CCPI2 counters are 64-bit and assumed not to overflow in normal operation.
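
Because the 32-bit counters have no overflow interrupt, a periodic reader has to accumulate deltas between successive reads. The following is a sketch of that arithmetic only, not the driver's code. ``counter_delta`` is a hypothetical helper, and it assumes a shell with 64-bit arithmetic and at most one wrap between two reads:

```shell
# Hypothetical helper: events counted between two reads of a 32-bit counter.
# Masking to 32 bits keeps the result correct when the counter has wrapped
# once between PREV and NOW.
# Usage: counter_delta PREV NOW
counter_delta() {
    echo $(( ($2 - $1) & 0xFFFFFFFF ))
}

counter_delta 100 250          # no wrap: prints 150
counter_delta 4294967290 5     # wrapped past 2^32: prints 11
```

If the counter could wrap more than once between reads the delta would be ambiguous, which is why the read interval has to be short relative to the event rate.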
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) PMU UNCORE (perf) driver:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) The thunderx2_pmu driver registers per-socket perf PMUs for the DMC, L3C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) and CCPI2 devices. Each PMU can be used to count up to 4 (DMC/L3C) or up to 8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) (CCPI2) events simultaneously. The PMUs provide a description of their
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) available events and configuration options under sysfs; see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) /sys/devices/uncore_<l3c_S|dmc_S|ccpi2_S>/, where S is the socket id.
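
One way to see which events a given system exposes is to walk those sysfs directories. This is a minimal sketch, assuming the usual perf PMU layout with an ``events`` subdirectory under each PMU. ``list_uncore_events`` is a hypothetical helper, and it prints nothing on a machine without these PMUs:

```shell
# Hypothetical helper: print every event advertised by the ThunderX2 uncore
# PMUs, in the PMU/EVENT/ form accepted by "perf stat -e", together with the
# encoding string stored in the sysfs file.
# Usage: list_uncore_events [SYSFS_DEVICES_ROOT]
list_uncore_events() {
    root=${1:-/sys/devices}
    for pmu in "$root"/uncore_l3c_* "$root"/uncore_dmc_* "$root"/uncore_ccpi2_*; do
        # An unmatched glob stays literal, so skip anything that is not a PMU.
        [ -d "$pmu/events" ] || continue
        for ev in "$pmu"/events/*; do
            [ -e "$ev" ] || continue
            printf '%s/%s/ = %s\n' "${pmu##*/}" "${ev##*/}" "$(cat "$ev")"
        done
    done
}

list_uncore_events
```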
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) The driver does not support sampling, therefore "perf record" will not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) work. Per-task perf sessions are also not supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) Examples::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) # perf stat -a -e uncore_dmc_0/cnt_cycles/ sleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) # perf stat -a -e \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) uncore_dmc_0/cnt_cycles/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) uncore_dmc_0/data_transfers/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) uncore_dmc_0/read_txns/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) uncore_dmc_0/write_txns/ sleep 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) # perf stat -a -e \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) uncore_l3c_0/read_request/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) uncore_l3c_0/read_hit/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) uncore_l3c_0/inv_request/,\
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) uncore_l3c_0/inv_hit/ sleep 1
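
The counts from the last example could be combined into an L3C read hit percentage. This is a sketch only, assuming read_hit counts the subset of read_request that hit in the L3C. ``hit_pct`` is a hypothetical helper, and the numbers below are placeholders rather than measured values:

```shell
# Hypothetical helper: integer hit percentage, or "n/a" when there were no
# requests to avoid dividing by zero.
# Usage: hit_pct HITS REQUESTS
hit_pct() {
    if [ "$2" -eq 0 ]; then
        echo "n/a"
    else
        echo $(( $1 * 100 / $2 ))
    fi
}

hit_pct 750 1000   # prints 75
hit_pct 0 0        # prints n/a
```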