Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) .. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) BPF_PROG_TYPE_CGROUP_SYSCTL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) This document describes the ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program type,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) which provides a cgroup-bpf hook for sysctl.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) The hook has to be attached to a cgroup and will be called every time a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) process inside that cgroup tries to read from or write to a sysctl knob in proc.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 1. Attach type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) **************
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) The ``BPF_CGROUP_SYSCTL`` attach type has to be used to attach a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program to a cgroup.
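
As an illustration only, the attach step can be sketched with the raw
``bpf(2)`` syscall. The function name ``attach_sysctl_prog`` is hypothetical,
and both file descriptors are assumed to exist already: ``prog_fd`` from
loading the program and ``cgroup_fd`` from ``open(2)`` on a cgroup v2
directory. Real tools would more likely go through libbpf.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Attach an already loaded BPF_PROG_TYPE_CGROUP_SYSCTL program to a cgroup.
 * cgroup_fd: fd from open(2) on a cgroup v2 directory (assumed to exist);
 * prog_fd:   fd returned when the program was loaded (assumed to exist).
 * Returns 0 on success, or -1 with errno set by bpf(2).
 */
static int attach_sysctl_prog(int cgroup_fd, int prog_fd)
{
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.target_fd = cgroup_fd;
        attr.attach_bpf_fd = prog_fd;
        attr.attach_type = BPF_CGROUP_SYSCTL;
        attr.attach_flags = 0;  /* no BPF_F_ALLOW_OVERRIDE / BPF_F_ALLOW_MULTI */

        return (int)syscall(SYS_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr));
}
```

With invalid descriptors, or without sufficient privileges, the call is
expected to fail with ``-1`` and leave the reason in ``errno``.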
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 2. Context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) **********
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) ``BPF_PROG_TYPE_CGROUP_SYSCTL`` provides access to the following context from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) a BPF program::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25)     struct bpf_sysctl {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26)         __u32 write;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27)         __u32 file_pos;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28)     };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) * ``write`` indicates whether the sysctl value is being read (``0``) or written
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31)   (``1``). This field is read-only.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) * ``file_pos`` indicates the file position at which the sysctl is being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34)   accessed (read or written). This field is read-write. Writing to the field
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35)   sets the starting position in the sysctl proc file that ``read(2)`` reads
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36)   from or ``write(2)`` writes to. Writing zero can be used, e.g., to override
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37)   the whole sysctl value with ``bpf_sysctl_set_new_value()`` on ``write(2)``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38)   even when user space called it with ``file_pos > 0``. Writing a non-zero
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39)   value can be used to access the part of the sysctl value starting at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40)   specified ``file_pos``. Not all sysctls support access with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41)   ``file_pos != 0``; e.g. writes to numeric sysctl entries must always be at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42)   file position ``0``. See also the ``kernel.sysctl_writes_strict`` sysctl.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) See `linux/bpf.h`_ for more details on how context fields can be accessed.
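
The ``file_pos`` rewind described above might look roughly like the sketch
below. It is not taken from the kernel tree: the program name
``sysctl_force_value`` and the replacement string are hypothetical, and
``SEC()`` and the helper pointer are declared by hand (normally libbpf's
``bpf_helpers.h`` supplies them) so that the sketch needs only the UAPI header.

```c
#include <linux/bpf.h>

/*
 * SEC() and the helper declaration normally come from libbpf's
 * <bpf/bpf_helpers.h>. They are spelled out by hand here (an assumption of
 * this sketch) so that it depends only on the UAPI header.
 */
#define SEC(name) __attribute__((section(name), used))

static long (*bpf_sysctl_set_new_value)(struct bpf_sysctl *ctx,
                                        const char *buf,
                                        unsigned long buf_len) =
        (void *)BPF_FUNC_sysctl_set_new_value;

SEC("cgroup/sysctl")
int sysctl_force_value(struct bpf_sysctl *ctx)
{
        /* Hypothetical replacement; any string user space could write. */
        char forced[] = "0";

        if (!ctx->write)
                return 1;       /* reads proceed untouched */

        /* Rewind so the override replaces the value from its first byte. */
        ctx->file_pos = 0;

        /* Pass the string length without the trailing NUL. */
        if (bpf_sysctl_set_new_value(ctx, forced, sizeof(forced) - 1) < 0)
                return 0;       /* could not override: reject the write */

        return 1;               /* proceed, now with the forced value */
}

char _license[] SEC("license") = "GPL";
```

On the read path the context is left alone, so ``file_pos`` keeps whatever
offset user space asked for.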
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 3. Return code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) **************
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) A ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program must return one of the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) return codes:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) * ``0`` means "reject access to sysctl";
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) * ``1`` means "proceed with access".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) If the program returns ``0``, user space will get ``-1`` from ``read(2)`` or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) ``write(2)`` and ``errno`` will be set to ``EPERM``.
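
Seen from user space, a rejection could be observed with something like the
following sketch. The helper name ``read_sysctl`` is hypothetical. Because
the hook runs at ``read(2)`` / ``write(2)`` time, ``open(2)`` is expected to
succeed and the ``EPERM`` to surface only from the subsequent ``read(2)``.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Read a sysctl through proc into buf as a NUL-terminated string.
 * Returns the number of bytes read, or a negative errno value.
 * -EPERM from the read(2) step is what a caller sees when an attached
 * BPF_PROG_TYPE_CGROUP_SYSCTL program returned 0.
 */
static long read_sysctl(const char *path, char *buf, size_t len)
{
        ssize_t n;
        int fd, err;

        if (len == 0)
                return -EINVAL;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return -errno;

        n = read(fd, buf, len - 1);
        err = errno;            /* save before close(2) can change it */
        close(fd);

        if (n < 0)
                return -err;

        buf[n] = '\0';
        return n;
}
```

A caller would compare the result against ``-EPERM`` to tell a BPF rejection
apart from other failures such as ``-ENOENT`` for a missing knob.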
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) 4. Helpers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) **********
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) Since a sysctl knob is represented by a name and a value, sysctl-specific BPF
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) helpers focus on providing access to these properties:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) * ``bpf_sysctl_get_name()`` to get the sysctl name, as it is visible in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65)   ``/proc/sys``, into a buffer provided by the BPF program;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) * ``bpf_sysctl_get_current_value()`` to get the string value currently held by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68)   the sysctl into a buffer provided by the BPF program. This helper is available
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69)   on both ``read(2)`` from and ``write(2)`` to the sysctl;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) * ``bpf_sysctl_get_new_value()`` to get the new string value being written to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72)   the sysctl before the actual write happens. This helper can be used only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73)   when ``ctx->write == 1``;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) * ``bpf_sysctl_set_new_value()`` to override the new string value being
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76)   written to the sysctl before the actual write happens. The sysctl value is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77)   overridden starting from the current ``ctx->file_pos``. If the whole value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78)   has to be overridden, the BPF program can set ``file_pos`` to zero before
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79)   calling the helper. This helper can be used only when ``ctx->write == 1``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80)   The new string value set by the helper is treated and verified by the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81)   in the same way as an equivalent string passed by user space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) A BPF program sees the sysctl value the same way user space does in the proc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) filesystem, i.e. as a string. Since many sysctl values represent an integer or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) a vector of integers, the following helpers can be used to get a numeric value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) from the string:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) * ``bpf_strtol()`` to convert the initial part of the string to a long integer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89)   similar to user space `strtol(3)`_;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) * ``bpf_strtoul()`` to convert the initial part of the string to an unsigned
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91)   long integer, similar to user space `strtoul(3)`_.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) See `linux/bpf.h`_ for more details on helpers described here.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 5. Examples
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) ***********
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) See `test_sysctl_prog.c`_ for an example of a BPF program in C that accesses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) the sysctl name and value, parses the string value to get a vector of integers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) and uses the result to decide whether to allow or deny access to the sysctl.
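
Separately from that selftest, a much smaller sketch of the same idea is
given here for orientation. It is not taken from the kernel tree: the name
``sysctl_cap_writes`` and the limit ``MAX_ALLOWED`` are hypothetical, and
``SEC()`` and the helper pointers are declared by hand instead of coming from
libbpf's ``bpf_helpers.h``. It lets every read through and rejects any write
whose new value does not parse as a number within the limit.

```c
#include <linux/bpf.h>

/*
 * SEC() and the helper declarations normally come from libbpf's
 * <bpf/bpf_helpers.h>. They are spelled out by hand here (an assumption of
 * this sketch) so that it depends only on the UAPI header.
 */
#define SEC(name) __attribute__((section(name), used))

static long (*bpf_sysctl_get_new_value)(struct bpf_sysctl *ctx, char *buf,
                                        unsigned long buf_len) =
        (void *)BPF_FUNC_sysctl_get_new_value;
static long (*bpf_strtoul)(const char *buf, unsigned long buf_len,
                           __u64 flags, unsigned long *res) =
        (void *)BPF_FUNC_strtoul;

#define MAX_ALLOWED 65536UL     /* hypothetical policy limit */

SEC("cgroup/sysctl")
int sysctl_cap_writes(struct bpf_sysctl *ctx)
{
        char value[32] = {0};
        unsigned long num = 0;

        if (!ctx->write)
                return 1;       /* reads always proceed */

        /* Copy the string user space is trying to write. */
        if (bpf_sysctl_get_new_value(ctx, value, sizeof(value)) < 0)
                return 0;       /* e.g. value too long for the buffer */

        /* Base 0 lets the helper detect the base, like strtoul(3). */
        if (bpf_strtoul(value, sizeof(value), 0, &num) < 0)
                return 0;       /* not a number: reject */

        return num <= MAX_ALLOWED ? 1 : 0;
}

char _license[] SEC("license") = "GPL";
```

As written, the sketch applies to every sysctl written from the cgroup, so
non-numeric knobs are rejected too. A real program would probably filter by
name with ``bpf_sysctl_get_name()`` first.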
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 6. Notes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) ********
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) ``BPF_PROG_TYPE_CGROUP_SYSCTL`` is intended to be used in a **trusted** root
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) environment, for example to monitor sysctl usage or to catch unreasonable values
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) that an application running as root in a separate cgroup is trying to set.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) Since ``task_dfl_cgroup(current)`` is called at ``sys_read`` / ``sys_write``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) time, it may return results different from those at ``sys_open`` time, i.e. the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) process that opened the sysctl file in the proc filesystem may differ from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) process that is trying to read from / write to it, and the two processes may run
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) in different cgroups. This means ``BPF_PROG_TYPE_CGROUP_SYSCTL`` should not be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) used as a security mechanism to limit sysctl usage.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) As with any cgroup-bpf program, additional care should be taken if an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) application running as root in a cgroup should not be allowed to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) detach/replace a BPF program attached by the administrator.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) .. Links
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) .. _linux/bpf.h: ../../include/uapi/linux/bpf.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) .. _strtol(3): http://man7.org/linux/man-pages/man3/strtol.3p.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) .. _strtoul(3): http://man7.org/linux/man-pages/man3/strtoul.3p.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) .. _test_sysctl_prog.c:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)    ../../tools/testing/selftests/bpf/progs/test_sysctl_prog.c