=========================
Process Number Controller
=========================

Abstract
--------

The process number controller is used to allow a cgroup hierarchy to stop any
new tasks from being fork()'d or clone()'d after a certain limit is reached.

Since it is trivial to hit the task limit without hitting any kmemcg limits in
place, PIDs are a fundamental resource. As such, PID exhaustion must be
preventable in the scope of a cgroup hierarchy by allowing resource limiting of
the number of tasks in a cgroup.

Usage
-----

In order to use the `pids` controller, set the maximum number of tasks in
pids.max (this is not available in the root cgroup for obvious reasons). The
number of processes currently in the cgroup is given by pids.current.

Organisational operations are not blocked by cgroup policies, so it is possible
to have pids.current > pids.max. This can be done by either setting the limit to
be smaller than pids.current, or attaching enough processes to the cgroup such
that pids.current > pids.max. However, it is not possible to violate a cgroup
policy through fork() or clone(). fork() and clone() will return -EAGAIN if the
creation of a new process would cause a cgroup policy to be violated.
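
Because pids.current may exceed pids.max, a script that computes the remaining
headroom of a cgroup has to clamp the result at zero. The following is a minimal
sketch under stated assumptions, not part of the controller interface:
pids_headroom is a hypothetical helper name, and it consults only the cgroup's
own pids.max, ignoring any ancestor limits.

```shell
# pids_headroom DIR (hypothetical helper): print how many more tasks the
# cgroup at DIR may hold according to its own pids.max. Ancestor limits
# are NOT consulted here.
pids_headroom() {
	dir=$1
	# "read" is a shell builtin, so reading the files does not fork.
	read -r max < "$dir/pids.max" || return 1
	read -r cur < "$dir/pids.current" || return 1
	if [ "$max" = "max" ]; then
		echo "unlimited"
	elif [ "$cur" -ge "$max" ]; then
		# pids.current > pids.max is legal, so clamp at zero.
		echo 0
	else
		echo $((max - cur))
	fi
}
```

The helper takes the cgroup directory as its only argument, for instance
``pids_headroom /sys/fs/cgroup/pids/parent``.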

To set a cgroup to have no limit, set pids.max to "max". This is the default for
all new cgroups. Note that PID limits are hierarchical, so the most stringent
limit in the hierarchy is followed.
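
Since the most stringent limit in the hierarchy applies, the limit that
effectively governs a cgroup can be found by walking up towards the mount point
and taking the smallest pids.max seen. A minimal sketch of that walk follows;
pids_effective_max is a hypothetical helper name, and the second argument is
assumed to be the directory where the controller is mounted.

```shell
# pids_effective_max DIR ROOT (hypothetical helper): walk from DIR up to
# ROOT (assumed to be the controller's mount point) and print the smallest
# pids.max found, or "max" if no cgroup on the path sets a limit.
pids_effective_max() {
	dir=${1%/}
	root=${2%/}
	best=max
	while [ "$dir" != "$root" ] && [ -r "$dir/pids.max" ]; do
		read -r val < "$dir/pids.max" || return 1
		if [ "$val" != "max" ]; then
			if [ "$best" = "max" ] || [ "$val" -lt "$best" ]; then
				best=$val
			fi
		fi
		# Strip the last path component to move to the parent cgroup.
		parent=${dir%/*}
		[ "$parent" = "$dir" ] && break
		dir=$parent
	done
	echo "$best"
}
```

The walk stops at ROOT because pids.max is not available in the root cgroup.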

pids.current tracks all child cgroup hierarchies, so parent/pids.current is
always greater than or equal to parent/child/pids.current.
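
One way to observe this relationship is to add up pids.current over the direct
children of a cgroup and compare the total with the parent's own value. The
sketch below assumes a hypothetical helper name, pids_children_sum. The files
are read one after another rather than atomically, so on a busy system the
numbers may be momentarily inconsistent.

```shell
# pids_children_sum DIR (hypothetical helper): print the sum of
# pids.current over the direct child cgroups of DIR.
pids_children_sum() {
	total=0
	for f in "$1"/*/pids.current; do
		# An unmatched glob stays literal; skip it and unreadable files.
		[ -r "$f" ] || continue
		read -r n < "$f" || continue
		total=$((total + n))
	done
	echo "$total"
}
```

The parent's pids.current is expected to be at least this sum, since it also
counts tasks attached to the parent itself.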

The pids.events file contains event counters:

  - max: Number of times fork() failed because the limit was hit.
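
A monitoring script can pick the counter out of pids.events by key. The sketch
below assumes the file consists of "key value" lines such as ``max 3``;
pids_events_max is a hypothetical helper name.

```shell
# pids_events_max DIR (hypothetical helper): print the value of the "max"
# counter from DIR/pids.events, assuming "key value" lines.
pids_events_max() {
	while read -r key value; do
		if [ "$key" = "max" ]; then
			echo "$value"
			return 0
		fi
	done < "$1/pids.events"
	# No "max" line was found.
	return 1
}
```

Comparing two readings taken some time apart shows whether forks were rejected
in that interval.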

Example
-------

First, we mount the pids controller::

	# mkdir -p /sys/fs/cgroup/pids
	# mount -t cgroup -o pids none /sys/fs/cgroup/pids

Then we create a hierarchy, set limits and attach processes to it::

	# mkdir -p /sys/fs/cgroup/pids/parent/child
	# echo 2 > /sys/fs/cgroup/pids/parent/pids.max
	# echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
	# cat /sys/fs/cgroup/pids/parent/pids.current
	2
	#

Attempts to exceed the set limit (2 in this case) will fail::

	# cat /sys/fs/cgroup/pids/parent/pids.current
	2
	# ( /bin/echo "Here's some processes for you." | cat )
	sh: fork: Resource temporarily unavailable
	#

Even if we migrate to a child cgroup (which doesn't have a set limit), we will
not be able to overcome the most stringent limit in the hierarchy (in this case,
parent's)::

	# echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
	# cat /sys/fs/cgroup/pids/parent/pids.current
	2
	# cat /sys/fs/cgroup/pids/parent/child/pids.current
	2
	# cat /sys/fs/cgroup/pids/parent/child/pids.max
	max
	# ( /bin/echo "Here's some processes for you." | cat )
	sh: fork: Resource temporarily unavailable
	#

We can set a limit that is smaller than pids.current, which will stop any new
processes from being forked at all (note that the shell itself counts towards
pids.current)::

	# echo 1 > /sys/fs/cgroup/pids/parent/pids.max
	# /bin/echo "We can't even spawn a single process now."
	sh: fork: Resource temporarily unavailable
	# echo 0 > /sys/fs/cgroup/pids/parent/pids.max
	# /bin/echo "We can't even spawn a single process now."
	sh: fork: Resource temporarily unavailable
	#