=========================
CPU Accounting Controller
=========================

The CPU accounting controller groups tasks using cgroups and accounts
for the CPU usage of each group.

The CPU accounting controller supports multi-hierarchy groups. An accounting
group accumulates the CPU usage of all of its child groups and of the tasks
directly present in the group.

Accounting groups can be created by first mounting the cgroup filesystem::

  # mount -t cgroup -ocpuacct none /sys/fs/cgroup

After this step, the initial (parent) accounting group becomes visible at
/sys/fs/cgroup. At bootup, this group includes all the tasks in the system.
/sys/fs/cgroup/tasks lists the tasks in this cgroup.
/sys/fs/cgroup/cpuacct.usage gives the CPU time (in nanoseconds) obtained
by this group, which is essentially the CPU time obtained by all the tasks
in the system.

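Because cpuacct.usage is a raw nanosecond counter, it is often easier to read
as seconds. The following is a minimal sketch; the helper name usage_seconds
is hypothetical and not part of the controller interface.

```shell
# usage_seconds FILE: print the nanosecond counter stored in FILE as seconds.
# (Hypothetical helper; FILE is expected to hold a single integer.)
usage_seconds() {
    # awk performs the floating-point division; LC_ALL=C keeps "." as the
    # decimal separator regardless of the caller's locale.
    LC_ALL=C awk '{ printf "%.3f\n", $1 / 1e9 }' "$1"
}
```

With the mount point used above, it could be invoked as
`usage_seconds /sys/fs/cgroup/cpuacct.usage`.
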
New accounting groups can be created under the parent group /sys/fs/cgroup::

  # cd /sys/fs/cgroup
  # mkdir g1
  # echo $$ > g1/tasks

The above steps create a new group g1 and move the current shell
process (bash) into it. The CPU time consumed by this shell and its children
can be read from g1/cpuacct.usage; the same time is also accumulated in
/sys/fs/cgroup/cpuacct.usage.

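Since children of the shell stay in g1, one way to estimate the CPU time a
command consumes is to sample g1/cpuacct.usage before and after running it.
This is only a sketch: the helper name cpu_ns_during is hypothetical, and the
difference also includes CPU time used by any other task in the same group
during the interval.

```shell
# cpu_ns_during USAGE_FILE CMD [ARGS...]: run CMD and print how much the
# counter in USAGE_FILE grew while it ran (in nanoseconds for cpuacct.usage).
# (Hypothetical helper; the calling shell must already be in the group.)
cpu_ns_during() {
    usage_file=$1
    shift
    before=$(cat "$usage_file") || return 1
    # Send the command's own stdout to stderr so that only the delta is
    # printed on stdout.
    "$@" >&2
    after=$(cat "$usage_file") || return 1
    echo $((after - before))
}
```

For example, after the steps above, `cpu_ns_during g1/cpuacct.usage make`
would print the growth of the g1 counter across the build.
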
The cpuacct.stat file lists statistics that further divide the CPU time
obtained by the cgroup into user and system times. Currently the following
statistics are supported:

- user: Time spent by tasks of the cgroup in user mode.
- system: Time spent by tasks of the cgroup in kernel mode.

Both user and system are reported in USER_HZ units.

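Values in USER_HZ units can be converted to seconds by dividing by the tick
rate. The sketch below assumes that cpuacct.stat holds one "name value" pair
per line and that `getconf CLK_TCK` reports USER_HZ on the system; the helper
name stat_seconds is hypothetical.

```shell
# stat_seconds STAT_FILE [HZ]: print each "name ticks" line of STAT_FILE as
# "name seconds". HZ defaults to the value reported by getconf CLK_TCK.
# (Hypothetical helper; assumes one "name value" pair per line.)
stat_seconds() {
    hz=${2:-$(getconf CLK_TCK)}
    LC_ALL=C awk -v hz="$hz" '{ printf "%s %.2f\n", $1, $2 / hz }' "$1"
}
```

For the group created earlier, this could be called as
`stat_seconds /sys/fs/cgroup/g1/cpuacct.stat`.
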
The cpuacct controller uses the percpu_counter interface to collect user and
system times. This has two side effects:

- It is theoretically possible to see wrong values for user and system times.
  This is because percpu_counter_read() on 32-bit systems isn't safe
  against concurrent writes.
- It is possible to see slightly outdated values for user and system times
  due to the batch processing nature of percpu_counter.
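
The second effect can be illustrated with a toy model of batching. This is
not kernel code: it assumes only that each CPU accumulates ticks in a local
delta, folds that delta into a shared count once it reaches a batch size, and
that a cheap read returns the shared count alone. The function name
approx_vs_exact and its arguments are hypothetical.

```shell
# approx_vs_exact BATCH CPU...: apply one tick per listed CPU index, then
# print the shared count alone (approx) and the shared count plus all
# still-unfolded local deltas (exact). Toy model only.
approx_vs_exact() {
    batch=$1
    shift
    printf '%s\n' "$@" | awk -v batch="$batch" '
        {
            delta[$1]++                    # tick lands in the CPU-local delta
            if (delta[$1] >= batch) {      # batch reached: fold into shared count
                shared += delta[$1]
                delta[$1] = 0
            }
        }
        END {
            exact = shared
            for (c in delta) exact += delta[c]
            printf "approx=%d exact=%d\n", shared, exact
        }'
}
```

With a batch of 4, four ticks on CPU 0 and two on CPU 1
(`approx_vs_exact 4 0 0 0 0 1 1`) leave the two CPU 1 ticks unfolded, so the
cheap read lags the exact total by 2.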