Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) Documentation for /proc/sys/fs/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ===============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) kernel version 2.2.10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) Copyright (c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) Copyright (c) 2009,        Shen Feng<shen@cn.fujitsu.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) For general info and legal blurb, please look in intro.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) ------------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) This file contains documentation for the sysctl files in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) /proc/sys/fs/ and is valid for Linux kernel version 2.2.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) The files in this directory can be used to tune and monitor
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) miscellaneous and general things in the operation of the Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) kernel. Since some of the files _can_ be used to screw up your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) system, it is advisable to read both documentation and source
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) before actually making adjustments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 1. /proc/sys/fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) Currently, these files are in /proc/sys/fs:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) - aio-max-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) - aio-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) - dentry-state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) - dquot-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) - dquot-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) - file-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) - file-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) - inode-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) - inode-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) - inode-state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) - nr_open
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) - overflowuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) - overflowgid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) - pipe-user-pages-hard
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) - pipe-user-pages-soft
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) - protected_fifos
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) - protected_hardlinks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) - protected_regular
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) - protected_symlinks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) - suid_dumpable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) - super-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) - super-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) aio-nr & aio-max-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) aio-nr is the running total of the number of events specified on the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) io_setup system call for all currently active aio contexts.  If aio-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) reaches aio-max-nr then io_setup will fail with EAGAIN.  Note that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) raising aio-max-nr does not result in the pre-allocation or re-sizing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) of any kernel data structures.
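The relationship between the two counters can be sketched in a few lines. This is only an illustration of the rule stated above, not the kernel's accounting; the helper name `aio_headroom` is hypothetical.

```python
def aio_headroom(aio_nr: int, aio_max_nr: int) -> int:
    """Events that can still be requested before aio-nr reaches aio-max-nr.

    Hypothetical helper: once the result is 0, the text above says
    io_setup fails with EAGAIN.
    """
    return max(aio_max_nr - aio_nr, 0)
```

On a live system the two inputs would be the integers read from /proc/sys/fs/aio-nr and /proc/sys/fs/aio-max-nr.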
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) dentry-state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) ------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) From linux/include/linux/dcache.h::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68)   struct dentry_stat_t dentry_stat {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69)         int nr_dentry;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70)         int nr_unused;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71)         int age_limit;         /* age in seconds */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72)         int want_pages;        /* pages requested by system */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73)         int nr_negative;       /* # of unused negative dentries */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74)         int dummy;             /* Reserved for future use */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75)   };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) Dentries are dynamically allocated and deallocated.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) nr_dentry shows the total number of dentries allocated (active
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) + unused). nr_unused shows the number of dentries that are not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) actively used, but are saved in the LRU list for future reuse.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) Age_limit is the age in seconds after which dcache entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) can be reclaimed when memory is short. Want_pages is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) nonzero when shrink_dcache_pages() has been called and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) dcache isn't pruned yet.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) nr_negative shows the number of unused dentries that are also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) negative dentries, which do not map to any files. Instead,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) they help speed up the rejection of non-existing files provided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) by the users.
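As a minimal sketch, the six fields of dentry-state can be parsed into a named tuple whose field names follow the struct above. The sample line is made up for illustration, and `parse_dentry_state` is a hypothetical helper.

```python
from typing import NamedTuple


class DentryState(NamedTuple):
    nr_dentry: int
    nr_unused: int
    age_limit: int
    want_pages: int
    nr_negative: int
    dummy: int


def parse_dentry_state(text: str) -> DentryState:
    """Parse the whitespace-separated contents of dentry-state."""
    fields = [int(tok) for tok in text.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return DentryState(*fields)


# Made-up sample contents, not output captured from a real system.
state = parse_dentry_state("81493\t60220\t45\t0\t12033\t0\n")
# Dentries in active use are the allocated ones minus the unused ones.
active = state.nr_dentry - state.nr_unused
```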
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) dquot-max & dquot-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) The file dquot-max shows the maximum number of cached disk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) quota entries.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) The file dquot-nr shows the number of allocated disk quota
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) entries and the number of free disk quota entries.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) If the number of free cached disk quotas is very low and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) you have some awesome number of simultaneous system users,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) you might want to raise the limit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) file-max & file-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) ------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The value in file-max denotes the maximum number of file-
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) handles that the Linux kernel will allocate. When you get lots
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) of error messages about running out of file handles, you might
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) want to increase this limit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) Historically, the kernel was able to allocate file handles
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) dynamically, but not to free them again. The three values in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) file-nr denote the number of allocated file handles, the number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) of allocated but unused file handles, and the maximum number of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) file handles. Linux 2.6 always reports 0 as the number of free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) file handles -- this is not an error, it just means that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) number of allocated file handles exactly matches the number of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) used file handles.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) Attempts to allocate more file descriptors than file-max are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) reported with printk; look for "VFS: file-max limit <number>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) reached" in the kernel log.
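A monitoring script might turn the three values of file-nr into a usage fraction, roughly as follows. The function names and the sample string are illustrative assumptions.

```python
def parse_file_nr(text: str) -> tuple[int, int, int]:
    """Return (allocated, unused, maximum) from the contents of file-nr."""
    allocated, unused, maximum = (int(tok) for tok in text.split())
    return allocated, unused, maximum


def file_handle_usage(text: str) -> float:
    """Fraction of file-max consumed by handles that are actually in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum
```

A value approaching 1.0 suggests that the file-max limit is about to be reached.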
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) nr_open
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) This denotes the maximum number of file-handles a process can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) allocate. Default value is 1024*1024 (1048576) which should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) enough for most machines. Actual limit depends on RLIMIT_NOFILE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) resource limit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) inode-max, inode-nr & inode-state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) ---------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) As with file handles, the kernel allocates the inode structures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) dynamically, but can't free them yet.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) The value in inode-max denotes the maximum number of inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) handlers. This value should be 3-4 times larger than the value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) in file-max, since stdin, stdout and network sockets also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) need an inode struct to handle them. When you regularly run
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) out of inodes, you need to increase this value.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) The file inode-nr contains the first two items from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) inode-state, so we'll skip to that file...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) Inode-state contains three actual numbers and four dummies.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) The actual numbers are, in order of appearance, nr_inodes,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) nr_free_inodes and preshrink.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) Nr_inodes stands for the number of inodes the system has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) allocated. This can be slightly more than inode-max because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) Linux allocates them one pageful at a time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) Nr_free_inodes represents the number of free inodes (?) and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) preshrink is nonzero when the nr_inodes > inode-max and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) system needs to prune the inode list instead of allocating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) more.
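The layout described above (three meaningful numbers followed by four dummies) could be read like this. It is a sketch only; `parse_inode_state` and the sample string are hypothetical.

```python
def parse_inode_state(text: str) -> dict[str, int]:
    """Keep the three documented values of inode-state, drop the dummies."""
    fields = [int(tok) for tok in text.split()]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    nr_inodes, nr_free_inodes, preshrink = fields[:3]
    return {
        "nr_inodes": nr_inodes,
        "nr_free_inodes": nr_free_inodes,
        "preshrink": preshrink,
    }


def parse_inode_nr(text: str) -> tuple[int, int]:
    """inode-nr holds only the first two items of inode-state."""
    nr_inodes, nr_free_inodes = (int(tok) for tok in text.split())
    return nr_inodes, nr_free_inodes
```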
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) overflowgid & overflowuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) -------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) Some filesystems only support 16-bit UIDs and GIDs, although in Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) UIDs and GIDs are 32 bits. When one of these filesystems is mounted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) with writes enabled, any UID or GID that would exceed 65535 is translated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) to a fixed value before being written to disk.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) These sysctls allow you to change the value of the fixed UID and GID.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) The default is 65534.
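The translation rule can be illustrated with a small function. This models only the behaviour described above; the name `id_for_16bit_fs` is made up for the example.

```python
DEFAULT_OVERFLOW_ID = 65534  # default value of overflowuid and overflowgid


def id_for_16bit_fs(uid_or_gid: int, overflow_id: int = DEFAULT_OVERFLOW_ID) -> int:
    """Return the ID that would be written to a filesystem with 16-bit IDs."""
    # IDs that fit in 16 bits are stored unchanged; larger ones are
    # replaced by the fixed overflow value.
    return uid_or_gid if uid_or_gid <= 0xFFFF else overflow_id
```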
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) pipe-user-pages-hard
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) Maximum total number of pages a non-privileged user may allocate for pipes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) Once this limit is reached, no new pipes may be allocated until usage goes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) below the limit again. When set to 0, no limit is applied, which is the default
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) setting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) pipe-user-pages-soft
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) Maximum total number of pages a non-privileged user may allocate for pipes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) before the pipe size gets limited to a single page. Once this limit is reached,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) new pipes will be limited to a single page in size for this user in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) limit total memory usage, and trying to increase them using fcntl() will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) denied until usage goes below the limit again. The default value allows the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) allocation of up to 1024 pipes at their default size. When set to 0, no limit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) is applied.
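The interaction of the hard and soft limits might be modelled as below. This is a simplified sketch for a non-privileged user: the default pipe size of 16 pages and the exact comparison operators are assumptions, and the kernel's real accounting differs in detail.

```python
from typing import Optional

DEFAULT_PIPE_PAGES = 16  # assumed default pipe size, in pages


def new_pipe_pages(user_pages: int, soft_limit: int, hard_limit: int) -> Optional[int]:
    """Size in pages of a newly created pipe, or None if creation is refused.

    user_pages is the number of pipe pages the user already holds.
    A limit of 0 means that the limit is not applied.
    """
    if hard_limit and user_pages >= hard_limit:
        return None  # hard limit reached: no new pipes
    if soft_limit and user_pages >= soft_limit:
        return 1  # soft limit reached: single-page pipes only
    return DEFAULT_PIPE_PAGES


# Under the 16-page assumption, 1024 default-sized pipes need this many pages.
pages_for_1024_pipes = 1024 * DEFAULT_PIPE_PAGES
```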
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) protected_fifos
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) The intent of this protection is to avoid unintentional writes to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) an attacker-controlled FIFO, where a program expected to create a regular
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) When set to "0", writing to FIFOs is unrestricted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) When set to "1" don't allow O_CREAT open on FIFOs that we don't own
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) in world writable sticky directories, unless they are owned by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) owner of the directory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) When set to "2" it also applies to group writable sticky directories.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) This protection is based on the restrictions in Openwall.
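The three settings can be read as a decision function. The following is a sketch derived from the text above, not a transcription of the kernel code; the function and parameter names are hypothetical.

```python
import stat


def fifo_open_allowed(level: int, dir_mode: int, dir_uid: int,
                      fifo_uid: int, opener_uid: int) -> bool:
    """Would an O_CREAT open of an existing FIFO be permitted?

    level is the value of protected_fifos (0, 1 or 2).
    """
    if level == 0:
        return True  # protection disabled
    if not dir_mode & stat.S_ISVTX:
        return True  # only sticky directories are affected
    if fifo_uid in (opener_uid, dir_uid):
        return True  # our own FIFO, or one owned by the directory owner
    if dir_mode & stat.S_IWOTH:
        return False  # world-writable sticky directory
    if level >= 2 and dir_mode & stat.S_IWGRP:
        return False  # group-writable sticky directory
    return True
```

For example, with protected_fifos set to 1, user 1000 opening a FIFO owned by user 1001 in a root-owned directory with mode 1777 would be refused.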
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) protected_hardlinks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) A long-standing class of security issues is the hardlink-based
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) time-of-check-time-of-use race, most commonly seen in world-writable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) directories like /tmp. The common method of exploitation of this flaw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) is to cross privilege boundaries when following a given hardlink (i.e. a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) root process follows a hardlink created by another user). Additionally,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) on systems without separated partitions, this stops unauthorized users
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) from "pinning" vulnerable setuid/setgid files against being upgraded by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) the administrator, or linking to special files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) When set to "0", hardlink creation behavior is unrestricted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) When set to "1" hardlinks cannot be created by users if they do not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) already own the source file, or do not have read/write access to it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) This protection is based on the restrictions in Openwall and grsecurity.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) protected_regular
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) This protection is similar to protected_fifos, but it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) avoids writes to an attacker-controlled regular file, where a program
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) expected to create one.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) When set to "0", writing to regular files is unrestricted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) When set to "1" don't allow O_CREAT open on regular files that we
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) don't own in world writable sticky directories, unless they are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) owned by the owner of the directory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) When set to "2" it also applies to group writable sticky directories.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) protected_symlinks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) ------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) A long-standing class of security issues is the symlink-based
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) time-of-check-time-of-use race, most commonly seen in world-writable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) directories like /tmp. The common method of exploitation of this flaw
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) is to cross privilege boundaries when following a given symlink (i.e. a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) root process follows a symlink belonging to another user). For a likely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) incomplete list of hundreds of examples across the years, please see:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) When set to "0", symlink following behavior is unrestricted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) When set to "1" symlinks are permitted to be followed only when outside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) a sticky world-writable directory, or when the uid of the symlink and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) follower match, or when the directory owner matches the symlink's owner.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) This protection is based on the restrictions in Openwall and grsecurity.
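The conditions for following a symlink can be summarized in code. This is an illustrative sketch of the rule as worded above; the names are hypothetical.

```python
import stat


def symlink_follow_allowed(enabled: bool, dir_mode: int, dir_uid: int,
                           link_uid: int, follower_uid: int) -> bool:
    """Would following a symlink be permitted under protected_symlinks?"""
    if not enabled:
        return True  # protected_symlinks is 0
    sticky_world_writable = stat.S_ISVTX | stat.S_IWOTH
    if dir_mode & sticky_world_writable != sticky_world_writable:
        return True  # not inside a sticky world-writable directory
    # Inside such a directory the symlink must belong to the follower
    # or to the owner of the directory.
    return link_uid in (follower_uid, dir_uid)
```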
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) suid_dumpable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) -------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) This value can be used to query and set the core dump mode for setuid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) or otherwise protected/tainted binaries. The modes are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) =   ==========  ===============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) 0   (default)	traditional behaviour. Any process which has changed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) 		privilege levels or is execute only will not be dumped.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) 1   (debug)	all processes dump core when possible. The core dump is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) 		owned by the current user and no security is applied. This is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) 		intended for system debugging situations only.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) 		Ptrace is unchecked.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 		This is insecure as it allows regular users to examine the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) 		memory contents of privileged processes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) 2   (suidsafe)	any binary which normally would not be dumped is dumped
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) 		anyway, but only if the "core_pattern" kernel sysctl is set to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) 		either a pipe handler or a fully qualified path. (For more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) 		details on this limitation, see CVE-2006-2451.) This mode is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) 		appropriate when administrators are attempting to debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) 		problems in a normal environment, and either have a core dump
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 		pipe handler that knows to treat privileged core dumps with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) 		care, or specific directory defined for catching core dumps.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) 		If a core dump happens without a pipe handler or fully
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) 		qualified path, a message will be emitted to syslog warning
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) 		about the lack of a correct setting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) =   ==========  ===============================================================
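The mode 2 requirement on core_pattern can be checked with a one-line test. The sketch below assumes that a pipe handler is recognizable by a leading "|" and a fully qualified path by a leading "/"; the function name and the sample patterns are hypothetical.

```python
SUID_DUMPABLE_MODES = {0: "default", 1: "debug", 2: "suidsafe"}


def core_pattern_ok_for_suidsafe(core_pattern: str) -> bool:
    """True if core_pattern is a pipe handler or a fully qualified path."""
    return core_pattern.startswith(("|", "/"))
```

A relative pattern such as "core" fails this check, which corresponds to the case where the text above says a warning is emitted to syslog.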
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) super-max & super-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) These numbers control the maximum number of superblocks, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) thus the maximum number of mounted filesystems the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) can have. You only need to increase super-max if you need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) mount more filesystems than the current value in super-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) allows you to.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) aio-nr & aio-max-nr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) aio-nr shows the current system-wide number of asynchronous io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) requests.  aio-max-nr allows you to change the maximum value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) aio-nr can grow to.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) mount-max
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) This denotes the maximum number of mounts that may exist
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) in a mount namespace.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) 2. /proc/sys/fs/binfmt_misc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) Documentation for the files in /proc/sys/fs/binfmt_misc is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) in Documentation/admin-guide/binfmt-misc.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) 3. /proc/sys/fs/mqueue - POSIX message queues filesystem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) ========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) The "mqueue" filesystem provides the necessary kernel features to enable the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) creation of a user space library that implements the POSIX message queues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) API (as noted by the MSG tag in the POSIX 1003.1-2001 version of the System
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) Interfaces specification).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) The "mqueue" filesystem contains values for determining/setting the amount of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) resources used by the file system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) /proc/sys/fs/mqueue/queues_max is a read/write file for setting/getting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) maximum number of message queues allowed on the system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) /proc/sys/fs/mqueue/msg_max is a read/write file for setting/getting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) maximum number of messages in a queue. In fact, it is the limiting value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) for another (user) limit which is set in the mq_open invocation. This attribute
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) of a queue must be less than or equal to msg_max.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) /proc/sys/fs/mqueue/msgsize_max is a read/write file for setting/getting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) maximum message size (it is an attribute of every message queue, set during
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) its creation).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) /proc/sys/fs/mqueue/msg_default is a read/write file for setting/getting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) default number of messages in a queue if the attr parameter of mq_open(2) is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) NULL. If it exceeds msg_max, the default value is initialized to msg_max.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) /proc/sys/fs/mqueue/msgsize_default is a read/write file for setting/getting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) the default message size if the attr parameter of mq_open(2) is NULL. If it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) exceeds msgsize_max, the default value is initialized to msgsize_max.
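How the defaults and maxima relate can be sketched as follows. This models only the rules stated above for an unprivileged caller; the helper names are hypothetical, and the parameter names mq_maxmsg and mq_msgsize are borrowed from the mq_attr fields of mq_open(3).

```python
def effective_default(default: int, maximum: int) -> int:
    """Default used when attr is NULL: capped at the corresponding maximum."""
    return min(default, maximum)


def mq_attr_within_limits(mq_maxmsg: int, mq_msgsize: int,
                          msg_max: int, msgsize_max: int) -> bool:
    """Check requested queue attributes against msg_max and msgsize_max."""
    return 0 < mq_maxmsg <= msg_max and 0 < mq_msgsize <= msgsize_max
```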
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) 4. /proc/sys/fs/epoll - Configuration options for the epoll interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) =====================================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) This directory contains configuration options for the epoll(7) interface.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) max_user_watches
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) Every epoll file descriptor can store a number of files to be monitored
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) for event readiness. Each one of these monitored files constitutes a "watch".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) This configuration option sets the maximum number of "watches" that are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) allowed for each user.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) Each "watch" costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) on a 64-bit one.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) The current default value for max_user_watches is 1/32 of the available
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) low memory, divided by the "watch" cost in bytes.
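The default can be estimated from the figures given above. This is a back-of-the-envelope sketch that uses the rough per-watch costs quoted in the text; the kernel derives the real cost from its own structure sizes, so actual values will differ. The function name is hypothetical.

```python
def estimated_max_user_watches(low_memory_bytes: int, is_64bit: bool) -> int:
    """Estimate the default max_user_watches from the rule described above."""
    watch_cost = 160 if is_64bit else 90  # rough bytes per watch, from the text
    return (low_memory_bytes // 32) // watch_cost


# Example input: 1 GiB of low memory.
ONE_GIB = 1 << 30
```

With 1 GiB of low memory this gives an estimate of 209715 watches per user on a 64-bit kernel.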