^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) Overhead calculation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) The overhead can be shown in two columns as 'Children' and 'Self' when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) perf collects callchains. The 'self' overhead is simply calculated by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) adding all period values of the entry - usually a function (symbol).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) This is the value that perf shows traditionally, and the sum of all the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) 'self' overhead values should be 100%.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) The 'children' overhead is calculated by adding all period values of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) the child functions so that it can show the total overhead of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) higher level functions even if they don't directly execute much.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) 'Children' here means functions that are called from another (parent)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) It might be confusing that the sum of all the 'children' overhead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) values exceeds 100% since each of them is already an accumulation of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) 'self' overhead of its child functions. But with this enabled, users
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) can find which function has the most overhead even if samples are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) spread over the children.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) Consider the following example; there are three functions like below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) void foo(void) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)     /* do something */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) void bar(void) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)     /* do something */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)     foo();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) int main(void) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)     bar();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)     return 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) -----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) In this case 'foo' is a child of 'bar', and 'bar' is an immediate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) child of 'main' so 'foo' also is a child of 'main'. In other words,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) 'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) Suppose all samples are recorded in 'foo' and 'bar' only. When the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) profile is recorded with callchains, the usual (self-overhead-only)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) output of perf report will look something like this:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) ----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) Overhead  Symbol
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) ........  .....................
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)   60.00%  foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)           --- foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)               bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)               main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)               __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)   40.00%  bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)           --- bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)               main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61)               __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) ----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) When the --children option is enabled, the 'self' overhead values of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) child functions (i.e. 'foo' and 'bar') are added to the parents to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) calculate the 'children' overhead. In this case the report could be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) displayed as:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) -------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) Children      Self  Symbol
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) ........  ........  ....................
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)  100.00%     0.00%  __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)           --- __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)  100.00%     0.00%  main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)           --- main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79)               __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)  100.00%    40.00%  bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)           --- bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)               main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)               __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)   60.00%    60.00%  foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)           |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)           --- foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)               bar
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)               main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)               __libc_start_main
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) -------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) In the above output, the 'self' overhead of 'foo' (60%) was added to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) 'children' overhead of 'bar', 'main' and '\_\_libc_start_main'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) Likewise, the 'self' overhead of 'bar' (40%) was added to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) 'children' overhead of 'main' and '\_\_libc_start_main'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) So '\_\_libc_start_main' and 'main' are shown first since they have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) the same (100%) 'children' overhead (even though they have zero 'self'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) overhead) and they are the parents of 'foo' and 'bar'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) Since v3.16 the 'children' overhead is shown by default and the output
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) is sorted by its values. The 'children' overhead can be disabled by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) specifying the --no-children option on the command line or by adding
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 'report.children = false' or 'top.children = false' to the perf config
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) file.
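
As a sketch, the two config settings could be written as follows, assuming
the per-user config file is ~/.perfconfig and that it uses the section
syntax described in perf-config(1), where 'report.children' is the
'children' key in the '[report]' section:

-----------------------
[report]
	children = false

[top]
	children = false
-----------------------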