^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ===========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Fault injection capabilities infrastructure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ===========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) Available fault injection capabilities
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) --------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) - failslab
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) - fail_page_alloc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) injects page allocation failures. (alloc_pages(), get_free_pages(), ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) - fail_usercopy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) injects failures in user memory access functions. (copy_from_user(), get_user(), ...)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) - fail_futex
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) injects futex deadlock and uaddr fault errors.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) - fail_make_request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) injects disk IO errors on devices permitted by setting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) /sys/block/<device>/make-it-fail or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) - fail_mmc_request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) injects MMC data errors on devices permitted by setting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) - fail_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) injects error return on specific functions, which are marked by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) under /sys/kernel/debug/fail_function. No boot option supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) - NVMe fault injection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) injects NVMe status code and retry flag on devices permitted by setting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) retry flag can be set via the debugfs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) Configure fault-injection capabilities behavior
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) -----------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) debugfs entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) ^^^^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) fault-inject-debugfs kernel module provides some debugfs entries for runtime
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) configuration of fault-injection capabilities.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) - /sys/kernel/debug/fail*/probability:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) likelihood of failure injection, in percent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) Format: <percent>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) Note that one-failure-per-hundred is a very high error rate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) for some testcases. Consider setting probability=100 and configuring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) /sys/kernel/debug/fail*/interval for such testcases.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) - /sys/kernel/debug/fail*/interval:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) specifies the interval between failures, for calls to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) should_fail() that pass all the other tests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) Note that if you enable this, by setting interval>1, you will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) probably want to set probability=100.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) - /sys/kernel/debug/fail*/times:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) specifies how many times failures may happen at most.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) A value of -1 means "no limit".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) - /sys/kernel/debug/fail*/space:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) specifies an initial resource "budget", decremented by "size"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) on each call to should_fail(,size). Failure injection is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) suppressed until "space" reaches zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) - /sys/kernel/debug/fail*/verbose:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) Format: { 0 | 1 | 2 }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) specifies the verbosity of the messages when failure is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) injected. '0' means no messages; '1' will print only a single
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) log line per failure; '2' will print a call trace too -- useful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) to debug the problems revealed by fault injection.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) - /sys/kernel/debug/fail*/task-filter:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) Format: { 'Y' | 'N' }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) A value of 'N' disables filtering by process (default).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) A value of 'Y' limits failures to only those processes for which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) /proc/<pid>/make-it-fail is set to 1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) - /sys/kernel/debug/fail*/require-start,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) /sys/kernel/debug/fail*/require-end,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) /sys/kernel/debug/fail*/reject-start,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) /sys/kernel/debug/fail*/reject-end:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) specifies the range of virtual addresses tested during
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) stacktrace walking. Failure is injected only if some caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) in the walked stacktrace lies within the required range, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) none lies within the rejected range.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) Default required range is [0,ULONG_MAX) (whole of virtual address space).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) Default rejected range is [0,0).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) - /sys/kernel/debug/fail*/stacktrace-depth:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) specifies the maximum stacktrace depth walked during search
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) for a caller within [require-start,require-end) OR
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) [reject-start,reject-end).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) - /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) Format: { 'Y' | 'N' }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) default is 'N'; setting it to 'Y' prevents failures from being injected
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) into highmem/user allocations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) - /sys/kernel/debug/failslab/ignore-gfp-wait:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) - /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) Format: { 'Y' | 'N' }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) default is 'N', setting it to 'Y' will inject failures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) only into non-sleep allocations (GFP_ATOMIC allocations).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) - /sys/kernel/debug/fail_page_alloc/min-order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) specifies the minimum page allocation order at which failures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) are injected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) - /sys/kernel/debug/fail_futex/ignore-private:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) Format: { 'Y' | 'N' }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) default is 'N', setting it to 'Y' will disable failure injections
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) when dealing with private (address space) futexes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) - /sys/kernel/debug/fail_function/inject:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) Format: { 'function-name' | '!function-name' | '' }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) specifies the target function of error injection by name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) If the function name is prefixed with '!', the given function is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) removed from the injection list. If nothing is specified (''),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) the injection list is cleared.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) - /sys/kernel/debug/fail_function/injectable:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) (read only) shows error injectable functions and what type of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) error values can be specified. The error type will be one of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) the following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) - NULL: retval must be 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) - ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) - /sys/kernel/debug/fail_function/<function-name>/retval:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) specifies the "error" return value to inject into the given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) function. This file is created when the user specifies a new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) injection entry.
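
As a minimal sketch of the probability/interval advice in the list above, the
helpers below write the three generic knobs in one call and print the current
settings. The names set_fault_knobs and show_fault_knobs are hypothetical and
not part of the kernel tree. The directory argument is assumed to be a
capability directory such as /sys/kernel/debug/failslab, with debugfs mounted
and writable by the caller (normally root).

```shell
#!/bin/sh

# Hypothetical helper: set_fault_knobs DIR PROBABILITY INTERVAL TIMES
# DIR is assumed to be a capability directory, e.g. /sys/kernel/debug/failslab.
set_fault_knobs() {
    dir=$1
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Stop at the first write that fails so a partial setup is reported.
    printf '%s\n' "$2" > "$dir/probability" &&
    printf '%s\n' "$3" > "$dir/interval" &&
    printf '%s\n' "$4" > "$dir/times"
}

# Hypothetical helper: print every readable regular file in DIR as name=value.
show_fault_knobs() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

For example, "set_fault_knobs /sys/kernel/debug/failslab 100 50 -1" requests
a failure on every 50th eligible call with no upper limit on the number of
failures.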
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) Boot option
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) ^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) In order to inject faults while debugfs is not available (early boot time),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) use the boot option::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) failslab=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) fail_page_alloc=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) fail_usercopy=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) fail_make_request=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) fail_futex=
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) mmc_core.fail_request=<interval>,<probability>,<space>,<times>
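
The argument order in the boot option is easy to get wrong, because interval
comes before probability. The sketch below assembles the string from named
positions. The function name fault_bootarg is hypothetical; it only formats
text and does not validate the values or touch the kernel command line.

```shell
#!/bin/sh

# Hypothetical helper: fault_bootarg NAME INTERVAL PROBABILITY SPACE TIMES
# Prints NAME=<interval>,<probability>,<space>,<times>.
fault_bootarg() {
    if [ $# -ne 5 ]; then
        echo "usage: fault_bootarg NAME INTERVAL PROBABILITY SPACE TIMES" >&2
        return 1
    fi
    printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

For example, "fault_bootarg failslab 100 10 0 -1" prints
"failslab=100,10,0,-1", which can then be appended to the kernel command line.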
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) proc entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) ^^^^^^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) - /proc/<pid>/fail-nth,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) /proc/self/task/<tid>/fail-nth:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) Writing an integer N to this file makes the N-th call in the task fail.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) Reading from this file returns an integer value. A value of '0' indicates
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) that the fault setup with a previous write to this file was injected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) A positive integer N indicates that the fault wasn't yet injected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) Note that this file enables all types of faults (slab, futex, etc).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) This setting takes precedence over all other generic debugfs settings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) like probability, interval, times, etc. But per-capability settings
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) (e.g. fail_futex/ignore-private) take precedence over it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) This feature is intended for systematic testing of faults in a single
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) system call. See an example below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) How to add new fault injection capability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) -----------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) - #include <linux/fault-inject.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) - define the fault attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) DECLARE_FAULT_ATTR(name);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) Please see the definition of struct fault_attr in fault-inject.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) for details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) - provide a way to configure fault attributes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) - boot option
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) If you need to enable the fault injection capability from boot time, you can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) provide a boot option to configure it. There is a helper function for it:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) setup_fault_attr(attr, str);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) - debugfs entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this method.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) Helper function:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) fault_create_debugfs_attr(name, parent, attr);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) - module parameters
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) If the scope of the fault injection capability is limited to a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) single kernel module, it is better to provide module parameters to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) configure the fault attributes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) - add a hook to insert failures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) Upon should_fail() returning true, client code should inject a failure:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) should_fail(attr, size);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) Application Examples
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) - Inject slab allocation failures into module init/exit code::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) FAILTYPE=failslab
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) echo 10 > /sys/kernel/debug/$FAILTYPE/probability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) echo 100 > /sys/kernel/debug/$FAILTYPE/interval
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) echo -1 > /sys/kernel/debug/$FAILTYPE/times
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) echo 0 > /sys/kernel/debug/$FAILTYPE/space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) faulty_system()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) if [ $# -eq 0 ]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) echo "Usage: $0 modulename [ modulename ... ]"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) exit 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) fi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) for m in $*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) do
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) echo inserting $m...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) faulty_system modprobe $m
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) echo removing $m...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) faulty_system modprobe -r $m
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) done
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) ------------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) - Inject page allocation failures only for a specific module::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) FAILTYPE=fail_page_alloc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) module=$1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) if [ -z "$module" ]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) echo "Usage: $0 <modulename>"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) exit 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) fi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) modprobe $module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) if [ ! -d /sys/module/$module/sections ]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) echo Module $module is not loaded
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) exit 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) fi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) echo N > /sys/kernel/debug/$FAILTYPE/task-filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) echo 10 > /sys/kernel/debug/$FAILTYPE/probability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) echo 100 > /sys/kernel/debug/$FAILTYPE/interval
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) echo -1 > /sys/kernel/debug/$FAILTYPE/times
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) echo 0 > /sys/kernel/debug/$FAILTYPE/space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) echo "Injecting errors into the module $module... (interrupt to stop)"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) sleep 1000000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) ------------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) - Inject an open_ctree error during btrfs mount::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) #!/bin/bash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) rm -f testfile.img
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) DEVICE=$(losetup --show -f testfile.img)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) mkfs.btrfs -f $DEVICE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) mkdir -p tmpmnt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) FAILTYPE=fail_function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) FAILFUNC=open_ctree
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) echo N > /sys/kernel/debug/$FAILTYPE/task-filter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) echo 100 > /sys/kernel/debug/$FAILTYPE/probability
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) echo 0 > /sys/kernel/debug/$FAILTYPE/interval
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) echo -1 > /sys/kernel/debug/$FAILTYPE/times
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) echo 0 > /sys/kernel/debug/$FAILTYPE/space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) mount -t btrfs $DEVICE tmpmnt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) if [ $? -ne 0 ]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) echo "SUCCESS!"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) else
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) echo "FAILED!"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) umount tmpmnt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) fi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) echo > /sys/kernel/debug/$FAILTYPE/inject
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) rmdir tmpmnt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) losetup -d $DEVICE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) rm testfile.img
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) Tool to run command with failslab or fail_page_alloc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) ----------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) In order to make it easier to accomplish the tasks mentioned above, we can use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) tools/testing/fault-injection/failcmd.sh. Run the command
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) "./tools/testing/fault-injection/failcmd.sh --help" for more information and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) see the following examples.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) Examples:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) Run the command "make -C tools/testing/selftests/ run_tests" while injecting slab
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) allocation failures::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) # ./tools/testing/fault-injection/failcmd.sh \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) -- make -C tools/testing/selftests/ run_tests
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) Same as above, but allow at most 100 failures instead of the default of at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) most one::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) # ./tools/testing/fault-injection/failcmd.sh --times=100 \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) -- make -C tools/testing/selftests/ run_tests
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) Same as above, but inject page allocation failures instead of slab
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) allocation failures::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) # env FAILCMD_TYPE=fail_page_alloc \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) ./tools/testing/fault-injection/failcmd.sh --times=100 \
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) -- make -C tools/testing/selftests/ run_tests
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) Systematic faults using fail-nth
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) ---------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) The following code systematically makes the 1st, 2nd, 3rd and so on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) fault-injectable call within the socketpair() system call fail::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) #include <sys/types.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) #include <sys/stat.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) #include <sys/socket.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) #include <sys/syscall.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) #include <fcntl.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) #include <unistd.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) #include <string.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404) #include <stdlib.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) #include <stdio.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) #include <errno.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) int main(void)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) int i, err, res, fail_nth, fds[2];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) char buf[128];
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) fail_nth = open(buf, O_RDWR);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) for (i = 1;; i++) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) sprintf(buf, "%d", i);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) write(fail_nth, buf, strlen(buf));
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) err = errno;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) pread(fail_nth, buf, sizeof(buf), 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) if (res == 0) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) close(fds[0]);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) close(fds[1]);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) res, err);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) if (atoi(buf))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) break;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) return 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) An example output::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) 1-th fault Y: res=-1/23
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) 2-th fault Y: res=-1/23
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) 3-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) 4-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) 5-th fault Y: res=-1/23
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) 6-th fault Y: res=-1/23
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) 7-th fault Y: res=-1/23
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) 8-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) 9-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) 10-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) 11-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) 12-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) 13-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) 14-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450) 15-th fault Y: res=-1/12
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) 16-th fault N: res=0/12
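
The second number in each "res=" field is the raw errno value. On Linux, 12 is
ENOMEM and 23 is ENFILE. When res is 0 the call succeeded and the errno value
is stale, so it should be ignored. As a rough sketch, output in this format
can be tallied with the hypothetical helper below, which reads the lines on
standard input. The name summarize_fail_nth and the two-entry errno table are
assumptions made for illustration.

```shell
#!/bin/sh

# Hypothetical helper: count outcomes in fail-nth test output read from stdin.
# Expects lines such as "3-th fault Y: res=-1/12".
summarize_fail_nth() {
    awk '
        / fault / {
            # "res=-1/23" splits into a[1]="res", a[2]="-1", a[3]="23"
            split($NF, a, "[=/]")
            if (a[2] == "0")       key = "success"   # errno is stale here
            else if (a[3] == "12") key = "ENOMEM"
            else if (a[3] == "23") key = "ENFILE"
            else                   key = "errno " a[3]
            count[key]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}
```

Fed the example output above, this reports ENFILE 5, ENOMEM 10 and success 1,
one per line.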