Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) SafeSetID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) SafeSetID is an LSM that gates the setid family of syscalls to restrict
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) UID/GID transitions from a given UID/GID to only those approved by a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) allowing a user to set up user namespace UID/GID mappings.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) Background
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) In the absence of file capabilities, processes spawned on a Linux system that need
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) to switch to a different user must be spawned with CAP_SETUID privileges.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) CAP_SETUID is granted to programs running as root or those running as a non-root
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) user that have been explicitly given the CAP_SETUID runtime capability. It is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) often preferable to use Linux runtime capabilities rather than file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) capabilities, because using file capabilities to run a program with elevated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) privileges opens up possible security holes: any user with access to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) file can exec() that program to gain the elevated privileges.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) While it is possible to implement a tree of processes by giving full
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) tree of processes under non-root user(s) in the first place. Specifically,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) since CAP_SETUID allows changing to any user on the system, including the root
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) user, it is an overpowered capability for what is needed in this scenario,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) especially since programs often only call setuid() to drop privileges to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) lesser-privileged user -- not elevate privileges. Unfortunately, there is no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) generally feasible way in Linux to restrict the potential UIDs that a user can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) switch to through setuid() beyond allowing a switch to any user on the system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) This SafeSetID LSM seeks to provide a solution for restricting setid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) capabilities in such a way.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) The main use case for this LSM is to allow a non-root program to transition to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) other untrusted UIDs without full-blown CAP_SETUID capabilities. The non-root
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) program would still need CAP_SETUID to do any kind of transition, but the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) additional restrictions imposed by this LSM would mean it is a "safer" version
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) do any unapproved actions (e.g. setuid to UID 0 or create/enter a new user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) namespace). The higher-level goal is to allow for UID-based sandboxing of system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) services without having to give out CAP_SETUID all over the place just so that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) non-root programs can drop to even-lesser-privileged UIDs. This is especially
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) relevant when one non-root daemon on the system should be allowed to spawn other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) processes as different UIDs, but it's undesirable to give the daemon a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) basically-root-equivalent CAP_SETUID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) Other Approaches Considered
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) Solve this problem in userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) -------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) For candidate applications that would like to have restricted setid capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) as implemented in this LSM, an alternative option would be to simply take away
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) setid capabilities from the application completely and refactor the process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) spawning semantics in the application (e.g. by using a privileged helper program
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) to do process spawning and UID/GID transitions). Unfortunately, there are a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) number of semantics around process spawning that would be affected by this, such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) as fork() calls where the program doesn't immediately call exec() after the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) fork(), parent processes specifying custom environment variables or command line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) args for spawned child processes, or inheritance of file handles across a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) fork()/exec(). Because of this, a solution that uses a privileged helper in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) userspace would likely be less appealing to incorporate into existing projects
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) that rely on certain process-spawning semantics in Linux.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) Use user namespaces
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) Another possible approach would be to run a given process tree in its own user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) namespace and give programs in the tree setid capabilities. In this way,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) programs in the tree could change to any desired UID/GID in the context of their
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) own user namespace, and only approved UIDs/GIDs could be mapped back to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) initial system user namespace, effectively preventing privilege escalation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) Unfortunately, it is not generally feasible to use user namespaces in isolation,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) without pairing them with other namespace types, which is not always an option.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) Linux checks for capabilities based on the user namespace that "owns" some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) entity. For example, Linux has the notion that network namespaces are owned by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) the user namespace in which they were created. A consequence of this is that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) capability checks for access to a given network namespace are done by checking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) whether a task has the given capability in the context of the user namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) that owns the network namespace -- not necessarily the user namespace under
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) which the given task runs. Therefore, spawning a process in a new user namespace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) effectively prevents it from accessing the network namespace owned by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) initial user namespace. This is a deal-breaker for any application that expects to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) retain the CAP_NET_ADMIN capability for the purpose of adjusting network
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) configurations. Using user namespaces in isolation also causes problems with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) other system interactions, including use of PID namespaces and device creation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) Use an existing LSM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) None of the other in-tree LSMs have the capability to gate setid transitions, or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) "Since setuid only affects the current process, and since the SELinux controls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) are not based on the Linux identity attributes, SELinux does not need to control
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) this operation."
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) Directions for use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) ==================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) This LSM hooks the setid syscalls to verify that transitions are allowed when an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) applicable restriction policy is in place. Policies are configured through
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) securityfs by writing to the safesetid/uid_allowlist_policy and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) safesetid/gid_allowlist_policy files at the location where securityfs is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) mounted. The format for adding a policy is '<UID>:<UID>' or '<GID>:<GID>',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) using literal numbers, and ending with a newline character such as '123:456\n'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) Writing an empty string "" will flush the policy. Again, configuring a policy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) for a UID/GID will prevent that UID/GID from obtaining auxiliary setid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) privileges, such as allowing a user to set up user namespace UID/GID mappings.
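
As an illustration of the policy format described above, the following is a
minimal Python sketch, not a definitive tool. The helper names
``format_policy`` and ``write_policy`` are hypothetical, the securityfs mount
point ``/sys/kernel/security`` is an assumed default, and the sketch assumes
that the first ID in each pair is the restricted ID, the second is the approved
target, and that several rules may be submitted in a single write.

```python
from pathlib import Path

# Assumed default securityfs mount point; adjust if it is mounted elsewhere.
SECURITYFS = Path("/sys/kernel/security")


def format_policy(rules):
    """Render (from_id, to_id) pairs as newline-terminated '<ID>:<ID>' lines."""
    lines = []
    for src, dst in rules:
        src, dst = int(src), int(dst)  # the format uses literal numbers
        if src < 0 or dst < 0:
            raise ValueError(f"IDs must be non-negative: {src}:{dst}")
        lines.append(f"{src}:{dst}\n")
    # An empty rule list yields "", which the text above says flushes the policy.
    return "".join(lines)


def write_policy(rules, kind="uid", root=SECURITYFS):
    """Write the rendered policy to the UID or GID allowlist file.

    Writing to securityfs normally requires elevated privileges.
    """
    if kind not in ("uid", "gid"):
        raise ValueError("kind must be 'uid' or 'gid'")
    path = Path(root) / "safesetid" / f"{kind}_allowlist_policy"
    path.write_text(format_policy(rules))
    return path
```

For example, ``format_policy([(123, 456)])`` produces the string
``'123:456\n'`` quoted above, and ``write_policy([], kind="gid")`` would write
an empty string to the GID policy file.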
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) Note on GID policies and setgroups()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) ====================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) In v5.9, support was added for limiting CAP_SETGID privileges, as was done
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) previously for CAP_SETUID. However, for compatibility with common
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) sandboxing-related code conventions in userspace, we currently allow arbitrary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) setgroups() calls for processes with CAP_SETGID restrictions. Until we add
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) support in a future release for restricting setgroups() calls, these GID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) policies add no meaningful security. setgroups() restrictions will be enforced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) once we have the policy checking code in place, which will rely on GID policy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) configuration code added in v5.9.