^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Credentials in Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) By: David Howells <dhowells@redhat.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) .. contents:: :local:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) Overview
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) There are several parts to the security check performed by Linux when one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) object acts upon another:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) 1. Objects.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) Objects are things in the system that may be acted upon directly by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) userspace programs. Linux has a variety of actionable objects, including:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) - Tasks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) - Files/inodes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) - Sockets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) - Message queues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) - Shared memory segments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) - Semaphores
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) - Keys
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) As a part of the description of all these objects there is a set of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) credentials. What's in the set depends on the type of object.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) 2. Object ownership.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) Amongst the credentials of most objects, there will be a subset that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) indicates the ownership of that object. This is used for resource
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) accounting and limitation (disk quotas and task rlimits for example).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) In a standard UNIX filesystem, for instance, this will be defined by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) UID marked on the inode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) 3. The objective context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) Also amongst the credentials of those objects, there will be a subset that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) indicates the 'objective context' of that object. This may or may not be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) the same set as in (2) - in standard UNIX files, for instance, this is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) defined by the UID and the GID marked on the inode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) The objective context is used as part of the security calculation that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) carried out when an object is acted upon.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) 4. Subjects.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) A subject is an object that is acting upon another object.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) Most of the objects in the system are inactive: they don't act on other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) objects within the system. Processes/tasks are the obvious exception:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) they do stuff; they access and manipulate things.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) Objects other than tasks may under some circumstances also be subjects.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) For instance an open file may send SIGIO to a task using the UID and EUID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) given to it by a task that called ``fcntl(F_SETOWN)`` upon it. In this case,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) the file struct will have a subjective context too.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) 5. The subjective context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) A subject has an additional interpretation of its credentials. A subset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) of its credentials forms the 'subjective context'. The subjective context
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) is used as part of the security calculation that is carried out when a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) subject acts.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) A Linux task, for example, has the FSUID, FSGID and the supplementary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) group list for when it is acting upon a file - which are quite separate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) from the real UID and GID that normally form the objective context of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) task.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) 6. Actions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) Linux has a number of actions available that a subject may perform upon an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) object. The set of actions available depends on the nature of the subject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) and the object.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) Actions include reading, writing, creating and deleting files, and forking,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) signalling and tracing tasks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) 7. Rules, access control lists and security calculations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) When a subject acts upon an object, a security calculation is made. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) involves taking the subjective context, the objective context and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) action, and searching one or more sets of rules to see whether the subject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) is granted or denied permission to act in the desired manner on the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) object, given those contexts.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) There are two main sources of rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) a. Discretionary access control (DAC):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) Sometimes the object will include sets of rules as part of its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) description. This is an 'Access Control List' or 'ACL'. A Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) file may supply more than one ACL.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) A traditional UNIX file, for example, includes a permissions mask that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) is an abbreviated ACL with three fixed classes of subject ('user',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 'group' and 'other'), each of which may be granted certain privileges
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) ('read', 'write' and 'execute' - whatever those map to for the object
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) in question). UNIX file permissions do not allow the arbitrary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) specification of subjects, however, and so are of limited use.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) A Linux file might also sport a POSIX ACL. This is a list of rules
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) that grants various permissions to arbitrary subjects.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) b. Mandatory access control (MAC):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) The system as a whole may have one or more sets of rules that get
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) applied to all subjects and objects, regardless of their source.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) SELinux and Smack are examples of this.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) In the case of SELinux and Smack, each object is given a label as part
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) of its credentials. When an action is requested, they take the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) subject label, the object label and the action and look for a rule
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) that says that this action is either granted or denied.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)
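The security calculation described in (7) can be sketched for the traditional
UNIX permissions mask. This is a simplified user-space model, not the kernel's
implementation: ``struct subj_ctx``, ``struct obj_ctx``, ``in_group()`` and
``dac_permits()`` are hypothetical names invented for illustration, and the
model ignores capabilities, POSIX ACLs and any LSM. It picks exactly one of
the three classes ('user', 'group' or 'other') and tests the requested action
against that class's bits:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical model types; these are not kernel structures. */
struct subj_ctx {		/* subjective context of the acting task */
	uid_t fsuid;
	gid_t fsgid;
	const gid_t *groups;	/* supplementary group list */
	size_t ngroups;
};

struct obj_ctx {		/* objective context of a file */
	uid_t uid;
	gid_t gid;
	mode_t mode;		/* rwxrwxrwx permission bits */
};

/* Requested actions, encoded like one rwx triplet of the mode. */
enum { MAY_EXEC = 1, MAY_WRITE = 2, MAY_READ = 4 };

/* Does the subject's FSGID or supplementary group list contain gid? */
bool in_group(const struct subj_ctx *s, gid_t gid)
{
	if (s->fsgid == gid)
		return true;
	for (size_t i = 0; i < s->ngroups; i++)
		if (s->groups[i] == gid)
			return true;
	return false;
}

/* Select a single class for the subject, then test its bits. */
bool dac_permits(const struct subj_ctx *s, const struct obj_ctx *o, int want)
{
	mode_t bits;

	if (s->fsuid == o->uid)
		bits = (o->mode >> 6) & 7;	/* 'user' class */
	else if (in_group(s, o->gid))
		bits = (o->mode >> 3) & 7;	/* 'group' class */
	else
		bits = o->mode & 7;		/* 'other' class */

	return ((int)bits & want) == want;
}
```

For a file owned by UID 1000 and GID 200 with mode 0640, a subject with FSUID
1000 may read and write, a subject that only has GID 200 in its supplementary
groups may read but not write, and any other subject is denied.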
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) Types of Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) The Linux kernel supports the following types of credentials:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) 1. Traditional UNIX credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) - Real User ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) - Real Group ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) The UID and GID are carried by most, if not all, Linux objects, even if in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) some cases they have to be invented (FAT or CIFS files for example, which are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) derived from Windows). These (mostly) define the objective context of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) that object, with tasks being slightly different in some cases.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) - Effective, Saved and FS User ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) - Effective, Saved and FS Group ID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) - Supplementary groups
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) These are additional credentials used by tasks only. Usually, an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) EUID/EGID/GROUPS will be used as the subjective context, and real UID/GID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) will be used as the objective. For tasks, it should be noted that this is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) not always true.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) 2. Capabilities.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) - Set of permitted capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) - Set of inheritable capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) - Set of effective capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) - Capability bounding set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) These are only carried by tasks. They indicate superior capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) granted piecemeal to a task that an ordinary task wouldn't otherwise have.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) These are manipulated implicitly by changes to the traditional UNIX
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) credentials, but can also be manipulated directly by the ``capset()``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) system call.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) The permitted capabilities are those caps that the process might grant
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) itself to its effective or permitted sets through ``capset()``. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) inheritable set might also be so constrained.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) The effective capabilities are the ones that a task is actually allowed to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) make use of itself.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) The inheritable capabilities are the ones that may get passed across
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) ``execve()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) The bounding set limits the capabilities that may be inherited across
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) ``execve()``, especially when a binary is executed that will execute as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) UID 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) 3. Secure management flags (securebits).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) These are only carried by tasks. These govern the way the above
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) credentials are manipulated and inherited over certain operations such as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) ``execve()``. They aren't used directly as objective or subjective
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) 4. Keys and keyrings.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) These are only carried by tasks. They carry and cache security tokens
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) that don't fit into the other standard UNIX credentials. They are for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) making such things as network filesystem keys available to the file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) accesses performed by processes, without the necessity of ordinary
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) programs having to know about security details involved.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) Keyrings are a special type of key. They carry sets of other keys and can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) be searched for the desired key. Each process may subscribe to a number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) of keyrings:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192)         - Per-thread keyring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)         - Per-process keyring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)         - Per-session keyring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) When a process accesses a key, if not already present, it will normally be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) cached on one of these keyrings for future accesses to find.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) For more information on using keys, see ``Documentation/security/keys/*``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) 5. LSM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) The Linux Security Module framework allows extra controls to be placed over the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) operations that a task may do. Currently Linux supports several LSM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) options.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) Some work by labelling the objects in a system and then applying sets of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) rules (policies) that say what operations a task with one label may do to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) an object with another label.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) 6. AF_KEY
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) This is a socket-based approach to credential management for networking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) stacks [RFC 2367]. It isn't discussed by this document as it doesn't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) interact directly with task and file credentials; rather it keeps system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) level credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217)
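From userspace, a process can inspect part of its own traditional UNIX
credentials with standard POSIX calls. The following is a minimal sketch;
``struct unix_creds`` and ``read_unix_creds()`` are hypothetical names, and
only the real and effective IDs and the supplementary group list are
collected:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the IDs a process can query about itself. */
struct unix_creds {
	uid_t ruid, euid;	/* real and effective user ID */
	gid_t rgid, egid;	/* real and effective group ID */
	gid_t *groups;		/* supplementary groups; caller frees */
	int ngroups;
};

/* Returns 0 on success, -1 on failure. */
int read_unix_creds(struct unix_creds *c)
{
	int n;

	c->ruid = getuid();
	c->euid = geteuid();
	c->rgid = getgid();
	c->egid = getegid();
	c->groups = NULL;
	c->ngroups = 0;

	n = getgroups(0, NULL);		/* size 0: only report the count */
	if (n < 0)
		return -1;
	if (n == 0)
		return 0;

	c->groups = malloc((size_t)n * sizeof(*c->groups));
	if (!c->groups)
		return -1;

	n = getgroups(n, c->groups);	/* now fetch the list itself */
	if (n < 0) {
		free(c->groups);
		c->groups = NULL;
		return -1;
	}
	c->ngroups = n;
	return 0;
}
```

The caller is expected to ``free()`` the ``groups`` array once it is no longer
needed.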
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) When a file is opened, part of the opening task's subjective context is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) recorded in the file struct created. This allows operations using that file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) struct to use those credentials instead of the subjective context of the task
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) that issued the operation. An example of this would be a file opened on a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) network filesystem where the credentials of the opened file should be presented
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) to the server, regardless of who is actually doing a read or a write upon it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) File Markings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) Files on disk or obtained over the network may have annotations that form the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) objective security context of that file. Depending on the type of filesystem,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) this may include one or more of the following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) * UNIX UID, GID, mode;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) * Windows user ID;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) * Access control list;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) * LSM security label;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) * UNIX exec privilege escalation bits (SUID/SGID);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) * File capabilities exec privilege escalation bits.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) These are compared to the task's subjective security context, and certain
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) operations are allowed or disallowed as a result. In the case of ``execve()``, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) privilege escalation bits come into play, and may allow the resulting process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) extra privileges, based on the annotations on the executable file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) Task Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) In Linux, all of a task's credentials are either held in (uid, gid) or referenced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) through (groups, keys, LSM security) a refcounted structure of type 'struct cred'.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) Each task points to its credentials by a pointer called 'cred' in its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) task_struct.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) Once a set of credentials has been prepared and committed, it may not be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) changed, barring the following exceptions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) 1. its reference count may be changed;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) 2. the reference count on the group_info struct it points to may be changed;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) 3. the reference count on the security data it points to may be changed;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) 4. the reference count on any keyrings it points to may be changed;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) 5. any keyrings it points to may be revoked, expired or have their security
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) attributes changed; and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) 6. the contents of any keyrings to which it points may be changed (the whole
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) point of keyrings being a shared set of credentials, modifiable by anyone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) with appropriate access).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) To alter anything in the cred struct, the copy-and-replace principle must be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) adhered to. First take a copy, then alter the copy and then use RCU to change
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) the task pointer to make it point to the new copy. There are wrappers to aid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) with this (see below).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277)
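The copy-and-replace principle can be illustrated with a small user-space
model. This is only an analogy under stated assumptions: it is
single-threaded, so it has no equivalent of the RCU step that makes the
pointer switch safe against concurrent readers in the kernel, and
``struct cred_model``, ``struct task_model``, ``model_prepare()``,
``model_commit()`` and ``model_put()`` are hypothetical names rather than
kernel interfaces:

```c
#include <stdlib.h>

/* Hypothetical, much-reduced stand-in for struct cred. */
struct cred_model {
	int usage;		/* reference count */
	unsigned int uid;
	unsigned int gid;
};

/* Stand-in for a task holding a const pointer to its credentials. */
struct task_model {
	const struct cred_model *cred;
};

/* Step 1: take a private, modifiable copy that holds one reference. */
struct cred_model *model_prepare(const struct task_model *t)
{
	struct cred_model *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = *t->cred;
	new->usage = 1;
	return new;
}

/* Drop one reference; the object is freed on the last put. */
void model_put(const struct cred_model *c)
{
	struct cred_model *m = (struct cred_model *)c;

	if (--m->usage == 0)
		free(m);
}

/*
 * Step 3: switch the task over to the altered copy (step 2 is done by
 * the caller), then drop the reference the task held on the old set.
 * The old set itself is never modified.
 */
void model_commit(struct task_model *t, struct cred_model *new)
{
	const struct cred_model *old = t->cred;

	t->cred = new;
	model_put(old);
}
```

A caller would obtain a copy with ``model_prepare()``, change for instance
``uid`` in the copy, and pass it to ``model_commit()``. Anyone still holding a
reference to the old set continues to see the old values.
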
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) A task may only alter its _own_ credentials; it is no longer permitted for a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) task to alter another's credentials. This means the ``capset()`` system call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) is no longer permitted to take any PID other than the one of the current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) process. Also ``keyctl_instantiate()`` and ``keyctl_negate()`` functions no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) longer permit attachment to process-specific keyrings in the requesting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) process as the instantiating process may need to create them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) Immutable Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) Once a set of credentials has been made public (by calling ``commit_creds()``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) for example), it must be considered immutable, barring two exceptions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) 1. The reference count may be altered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) 2. While the keyring subscriptions of a set of credentials may not be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) changed, the keyrings subscribed to may have their contents altered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) To catch accidental credential alteration at compile time, struct task_struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) has _const_ pointers to its credential sets, as does struct file. Furthermore,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) certain functions such as ``get_cred()`` and ``put_cred()`` operate on const
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) pointers, thus rendering casts unnecessary for callers, but they must temporarily
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) cast away the const qualification internally to alter the reference count.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302)
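The const-pointer convention can be sketched in plain C. This is a
single-threaded user-space illustration, not the kernel code:
``struct pub_cred``, ``pub_cred_alloc()``, ``pub_cred_get()`` and
``pub_cred_put()`` are hypothetical names. Callers only ever hold
``const`` pointers, so an accidental write such as ``c->uid = 0`` fails to
compile, while the get and put helpers cast the qualifier away solely to
adjust the reference count:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a published set of credentials. */
struct pub_cred {
	int usage;		/* the only field changed after publication */
	unsigned int uid;
};

/* Create a set holding one reference and hand it out as const. */
const struct pub_cred *pub_cred_alloc(unsigned int uid)
{
	struct pub_cred *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->usage = 1;
	c->uid = uid;
	return c;
}

/* Take a reference: const is dropped only for the counter update. */
const struct pub_cred *pub_cred_get(const struct pub_cred *c)
{
	((struct pub_cred *)c)->usage++;
	return c;
}

/* Release a reference; returns 1 if this freed the object, else 0. */
int pub_cred_put(const struct pub_cred *c)
{
	struct pub_cred *m = (struct pub_cred *)c;

	if (--m->usage > 0)
		return 0;
	free(m);
	return 1;
}
```

Casting away const is well defined here because the underlying object comes
from ``malloc()`` and was never itself declared const.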
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) Accessing Task Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) A task being able to alter only its own credentials permits the current process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) to read or replace its own credentials without the need for any form of locking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) -- which simplifies things greatly. It can just call::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311)         const struct cred *current_cred()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) to get a pointer to its credentials structure, and it doesn't have to release
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) it afterwards.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) There are convenience wrappers for retrieving specific aspects of a task's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) credentials (the value is simply returned in each case)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319)         uid_t current_uid(void)                 Current's real UID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320)         gid_t current_gid(void)                 Current's real GID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321)         uid_t current_euid(void)                Current's effective UID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322)         gid_t current_egid(void)                Current's effective GID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)         uid_t current_fsuid(void)               Current's file access UID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324)         gid_t current_fsgid(void)               Current's file access GID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)         kernel_cap_t current_cap(void)          Current's effective capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)         struct user_struct *current_user(void)  Current's user account
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) There are also convenience wrappers for retrieving specific associated pairs of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) a task's credentials::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)         void current_uid_gid(uid_t *, gid_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332)         void current_euid_egid(uid_t *, gid_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333)         void current_fsuid_fsgid(uid_t *, gid_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) which return these pairs of values through their arguments after retrieving
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) them from the current task's credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) In addition, there is a function for obtaining a reference on the current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) process's current set of credentials::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342)         const struct cred *get_current_cred(void);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) and functions for getting references to one of the credentials that don't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) actually live in struct cred::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347)         struct user_struct *get_current_user(void);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348)         struct group_info *get_current_groups(void);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) which get references to the current process's user accounting structure and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) supplementary groups list respectively.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) Once a reference has been obtained, it must be released with ``put_cred()``,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) ``free_uid()`` or ``put_group_info()`` as appropriate.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) Accessing Another Task's Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) ------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) While a task may access its own credentials without the need for locking, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) same is not true of a task wanting to access another task's credentials. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) must use the RCU read lock and ``rcu_dereference()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) The ``rcu_dereference()`` is wrapped by::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366)         const struct cred *__task_cred(struct task_struct *task);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) This should be used inside the RCU read lock, as in the following example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370)         void foo(struct task_struct *t, struct foo_data *f)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371)         {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372)                 const struct cred *tcred;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373)                 ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)                 rcu_read_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375)                 tcred = __task_cred(t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)                 f->uid = tcred->uid;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377)                 f->gid = tcred->gid;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)                 f->groups = get_group_info(tcred->groups);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379)                 rcu_read_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380)                 ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)         }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) Should it be necessary to hold another task's credentials for a long period of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) time, and possibly to sleep while doing so, then the caller should get a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) reference on them using::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387)         const struct cred *get_task_cred(struct task_struct *task);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) This does all the RCU magic inside of it. The caller must call ``put_cred()`` on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390) the credentials so obtained when they're finished with.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) .. note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393)    The result of ``__task_cred()`` should not be passed directly to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394)    ``get_cred()`` as this may race with ``commit_creds()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) There are a couple of convenience functions to access bits of another task's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) credentials, hiding the RCU magic from the caller::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) 	uid_t task_uid(task)		Task's real UID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) 	uid_t task_euid(task)		Task's effective UID
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) If the caller is holding the RCU read lock at the time anyway, then::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) __task_cred(task)->uid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) __task_cred(task)->euid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) should be used instead. Similarly, if multiple aspects of a task's credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) need to be accessed, the RCU read lock should be taken, ``__task_cred()`` called,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) the result stored in a temporary pointer and the credential aspects read through
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) that pointer before dropping the lock. This prevents the potentially expensive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) RCU magic from being invoked multiple times.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) Should some other single aspect of another task's credentials need to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) accessed, then this can be used::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) task_cred_xxx(task, member)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) where 'member' is a non-pointer member of the cred struct. For instance::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) uid_t task_cred_xxx(task, suid);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) will retrieve 'struct cred::suid' from the task, doing the appropriate RCU
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) magic. This may not be used for pointer members as what they point to may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) disappear the moment the RCU read lock is dropped.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) Altering Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) As previously mentioned, a task may only alter its own credentials, and may not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) alter those of another task. This means that it doesn't need to use any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) locking to alter its own credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) To alter the current process's credentials, a function should first prepare a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435) new set of credentials by calling::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) struct cred *prepare_creds(void);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) This locks ``current->cred_replace_mutex`` and then allocates and constructs a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) duplicate of the current process's credentials, returning with the mutex still
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) held if successful. It returns NULL if not successful (out of memory).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) The mutex prevents ``ptrace()`` from altering the ptrace state of a process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) while security checks on credentials construction and changing are taking place,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) as the ptrace state may alter the outcome, particularly in the case of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) ``execve()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) The new credentials set should be altered appropriately, and any security
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) checks and hooks done. Both the current and the proposed sets of credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450) are available for this purpose, as ``current_cred()`` still returns the current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) set at this point.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) When replacing the group list, the new list must be sorted before it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454) is added to the credential, as a binary search is used to test for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) membership. In practice, this means groups_sort() should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) called before set_groups() or set_current_groups().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) groups_sort() must not be called on a ``struct group_info`` which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) is shared as it may permute elements as part of the sorting process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459) even if the array is already sorted.
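
The ordering requirement can be illustrated outside the kernel. The following
is a minimal userspace sketch, not kernel code: ``model_gid_t``,
``model_groups_sort()`` and ``model_groups_search()`` are hypothetical
stand-ins, and the C library's ``qsort()`` and ``bsearch()`` take the place of
the kernel's own sort and search routines. It only shows why the membership
test depends on the list having been sorted first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a group ID; not the kernel's kgid_t. */
typedef unsigned int model_gid_t;

/* Three-way comparison shared by the sort and the search. */
static int model_gid_cmp(const void *a, const void *b)
{
	model_gid_t x = *(const model_gid_t *)a;
	model_gid_t y = *(const model_gid_t *)b;

	return (x > y) - (x < y);
}

/* Sort a private (unshared) array of group IDs in place. */
static void model_groups_sort(model_gid_t *gids, size_t n)
{
	assert(gids != NULL || n == 0);
	qsort(gids, n, sizeof(*gids), model_gid_cmp);
}

/* Membership test by binary search; only meaningful on a sorted array. */
static bool model_groups_search(const model_gid_t *gids, size_t n,
				model_gid_t gid)
{
	assert(gids != NULL || n == 0);
	return bsearch(&gid, gids, n, sizeof(*gids), model_gid_cmp) != NULL;
}
```

Because the sort rearranges the array in place, the model has the same
restriction as the text describes: it must only be applied to an array that no
other user can see.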
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) When the credential set is ready, it should be committed to the current process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462) by calling::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) int commit_creds(struct cred *new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466) This will alter various aspects of the credentials and the process, giving the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467) LSM a chance to do likewise. It will then use ``rcu_assign_pointer()`` to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468) actually commit the new credentials to ``current->cred``, release
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) ``current->cred_replace_mutex`` to allow ``ptrace()`` to take place, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470) notify the scheduler and others of the changes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) This function is guaranteed to return 0, so that it can be tail-called at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) end of such functions as ``sys_setresuid()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475) Note that this function consumes the caller's reference to the new credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) The caller should *not* call ``put_cred()`` on the new credentials afterwards.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) Furthermore, once this function has been called on a new set of credentials,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) those credentials may *not* be changed further.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) Should the security checks fail or some other error occur after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483) ``prepare_creds()`` has been called, then the following function should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) invoked::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) void abort_creds(struct cred *new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) This releases the lock on ``current->cred_replace_mutex`` that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) ``prepare_creds()`` got and then releases the new credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) A typical credentials alteration function would look something like this::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494) 	int alter_suid(uid_t suid)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) 	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) 		struct cred *new;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) 		int ret;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499) 		new = prepare_creds();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500) 		if (!new)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) 			return -ENOMEM;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) 		new->suid = suid;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504) 		ret = security_alter_suid(new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) 		if (ret < 0) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) 			abort_creds(new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507) 			return ret;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) 		}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) 		return commit_creds(new);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) 	}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) Managing Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) There are some functions to help manage credentials:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) - ``void put_cred(const struct cred *cred);``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) This releases a reference to the given set of credentials. If the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522) reference count reaches zero, the credentials will be scheduled for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523) destruction by the RCU system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525) - ``const struct cred *get_cred(const struct cred *cred);``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) This gets a reference on a live set of credentials, returning a pointer to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) that set of credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) - ``struct cred *get_new_cred(struct cred *cred);``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532) This gets a reference on a set of credentials that is under construction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533) and is thus still mutable, returning a pointer to that set of credentials.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534)
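The get/put pairing above can be sketched as a small userspace model. This is
an illustration under stated assumptions rather than the kernel
implementation: ``struct model_cred``, ``model_cred_alloc()``,
``model_get_cred()``, ``model_put_cred()`` and the ``model_cred_freed`` counter
are all hypothetical, the count is a plain ``int`` rather than an atomic, and
the object is freed immediately instead of being handed to RCU for deferred
destruction.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a reference-counted credential set. */
struct model_cred {
	int usage;		/* reference count (not atomic in this model) */
	unsigned int uid;
};

/* Counts destructions so that the demonstration can observe them. */
static int model_cred_freed;

/* Allocate a credential set holding one reference for the caller. */
static struct model_cred *model_cred_alloc(unsigned int uid)
{
	struct model_cred *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->usage = 1;
	c->uid = uid;
	return c;
}

/* Take an extra reference on a live set and return the same pointer. */
static const struct model_cred *model_get_cred(const struct model_cred *cred)
{
	struct model_cred *c = (struct model_cred *)cred;

	assert(c->usage > 0);
	c->usage++;
	return cred;
}

/* Drop one reference; destroy the set when the last one goes away. */
static void model_put_cred(const struct model_cred *cred)
{
	struct model_cred *c = (struct model_cred *)cred;

	assert(c->usage > 0);
	if (--c->usage == 0) {
		model_cred_freed++;
		free(c);
	}
}
```

Every successful get must be balanced by exactly one put; the set is only
destroyed when the final reference is dropped.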
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) Open File Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539) When a new file is opened, a reference is obtained on the opening task's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) credentials and this is attached to the file struct as ``f_cred`` in place of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541) ``f_uid`` and ``f_gid``. Code that used to access ``file->f_uid`` and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) ``file->f_gid`` should now access ``file->f_cred->fsuid`` and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543) ``file->f_cred->fsgid``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545) It is safe to access ``f_cred`` without the use of RCU or locking because the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546) pointer will not change over the lifetime of the file struct, nor will the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547) contents of the cred struct pointed to, barring the exceptions listed above
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548) (see the Task Credentials section).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550) To avoid "confused deputy" privilege escalation attacks, access control checks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551) during subsequent operations on an opened file should use these credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552) instead of "current"'s credentials, as the file may have been passed to a more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553) privileged process.
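
The point can be illustrated with a small userspace model. Everything in it is
hypothetical (``struct demo_cred``, ``struct demo_file``, ``demo_may_access()``
and the owner-only policy are not kernel interfaces); it only shows a check
that is decided by the opener's credentials stored in the file, with the
credentials of whichever process happens to be calling deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down credential set. */
struct demo_cred {
	unsigned int fsuid;
};

/* Hypothetical open file: remembers who opened it. */
struct demo_file {
	const struct demo_cred *f_cred;	/* opener's credentials, fixed at open */
};

/*
 * Hypothetical owner-only policy.  The decision is based on the credentials
 * captured at open time; the caller's credentials are not consulted, so a
 * more privileged process that was handed the file gains nothing extra.
 */
static bool demo_may_access(const struct demo_file *file,
			    const struct demo_cred *caller,
			    unsigned int owner_uid)
{
	(void)caller;	/* intentionally unused */
	assert(file != NULL && file->f_cred != NULL);
	return file->f_cred->fsuid == owner_uid;
}
```

In this model a file opened by UID 1000 and then passed to a UID 0 process is
still treated as UID 1000 when an object owned by UID 0 is checked.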
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555) Overriding the VFS's Use of Credentials
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556) =======================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) Under some circumstances it is desirable to override the credentials used by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559) the VFS, and that can be done by calling into functions such as ``vfs_mkdir()`` with a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) different set of credentials. This is done in the following places:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562) * ``sys_faccessat()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563) * ``do_coredump()``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564) * nfs4recover.c.