.. SPDX-License-Identifier: GPL-2.0

=====================================
OCFS2 file system - online file check
=====================================

This document describes the OCFS2 online file check feature.

Introduction
============
OCFS2 is often used in high-availability systems. However, OCFS2 usually
converts the filesystem to read-only when it encounters an error. This may not
be necessary, and turning the filesystem read-only affects other running
processes as well, decreasing availability.
For this reason, a mount option (errors=continue) was introduced. With it, the
-EIO errno is returned to the calling process and further processing is
terminated so that the filesystem is not corrupted further. The filesystem is
not converted to read-only, and the problematic file's inode number is reported
in the kernel log. The user can then try to check/fix this file via the online
filecheck feature.

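As an illustration, the mount option can be passed on the mount command line.
The helper below is a minimal sketch that only prints the command instead of
running it. The function name ``ocfs2_mount_cmd`` and the device and mount
point in the example are hypothetical placeholders, not part of OCFS2:

```shell
#!/bin/sh
# Print (do not run) a mount command that enables errors=continue.
# $1: block device holding the OCFS2 volume (placeholder chosen by the caller)
# $2: mount point (placeholder chosen by the caller)
ocfs2_mount_cmd() {
    device=$1
    mountpoint=$2
    printf 'mount -t ocfs2 -o errors=continue %s %s\n' "$device" "$mountpoint"
}

# Example with a hypothetical device and mount point:
#   ocfs2_mount_cmd /dev/sdb1 /mnt/shared
```

Printing the command first lets the administrator review it before running it
as root.
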
Scope
=====
This effort is to check/fix small issues which may hinder day-to-day operations
of a cluster filesystem by turning the filesystem read-only. The scope of
checking/fixing is at the file level, initially for regular files and eventually
for all files (including system files) of the filesystem.

If a directory-to-file link is incorrect, the directory inode is
reported as erroneous.

This feature is not suited for extensive checks which depend on other
components of the filesystem, such as, but not limited to, checking whether the
bits for file blocks in the allocation have been set. In case of such an error,
the offline fsck is recommended.

Finally, such an operation/feature should not be automated, lest the filesystem
end up with more damage than before the repair attempt. So, this has to
be performed with user interaction and consent.

User interface
==============
When there are errors in the OCFS2 filesystem, they are usually accompanied
by the inode number which caused the error. This inode number is the
input to check/fix the file.

There is a sysfs directory for each mounted OCFS2 file system::

  /sys/fs/ocfs2/<devname>/filecheck

Here, <devname> indicates the name of the OCFS2 volume device which has already
been mounted. The check and fix files in this directory accept inode numbers.
They are used to communicate with kernel space and tell it which file (inode
number) is to be checked or fixed. Currently, three operations are supported:
checking an inode, fixing an inode, and setting the size of the result record
history.

1. If you want to know exactly what error happened to <inode> before fixing, do::

    # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/check
    # cat /sys/fs/ocfs2/<devname>/filecheck/check

   The output looks like this::

    INO         DONE    ERROR
    39502       1       GENERATION

   <INO> lists the inode numbers.
   <DONE> indicates whether the operation has finished.
   <ERROR> says what kind of error was found. For the detailed error numbers,
   please refer to the file linux/fs/ocfs2/filecheck.h.

2. If you decide to fix this inode, do::

    # echo "<inode>" > /sys/fs/ocfs2/<devname>/filecheck/fix
    # cat /sys/fs/ocfs2/<devname>/filecheck/fix

   The output looks like this::

    INO         DONE    ERROR
    39502       1       SUCCESS

   This time, the <ERROR> column indicates whether the fix was successful or not.

3. The record cache is used to store the history of check/fix results. Its
   default size is 10, and it can be adjusted within the range of 10 to 100.
   You can adjust the size like this::

    # echo "<size>" > /sys/fs/ocfs2/<devname>/filecheck/set

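The table printed by the ``check`` and ``fix`` files can be post-processed with
standard tools. The following is a minimal sketch, assuming the three-column
INO/DONE/ERROR layout shown above. The helper name ``filecheck_lookup`` is
hypothetical; it reads the table on standard input and prints the DONE and
ERROR fields for one inode:

```shell
#!/bin/sh
# Read a filecheck result table on stdin and print "DONE ERROR" for one inode.
# $1: inode number to look up.
# Assumes whitespace-separated INO/DONE/ERROR columns with one header line.
# If the inode is listed more than once, the last matching row is reported.
# Returns non-zero when the inode is not in the table.
filecheck_lookup() {
    awk -v ino="$1" '
        NR > 1 && $1 == ino { d = $2; e = $3; found = 1 }
        END { if (found) print d, e; else exit 1 }
    '
}
```

For the sample output of the check operation shown above,
``cat /sys/fs/ocfs2/<devname>/filecheck/check | filecheck_lookup 39502``
would print ``1 GENERATION``.
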
Fixing stuff
============
On receiving the inode, the filesystem reads the inode and the
file metadata. In case of errors, the filesystem fixes the errors
and reports the problems it fixed in the kernel log. As a precautionary measure,
the inode must first be checked for errors before performing a final fix.

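The check-before-fix rule, together with the requirement for user consent, can
be sketched as a small wrapper around the sysfs files. This is only an
illustration: the function name ``filecheck_fix_with_consent``, the prompt
text, and the y/N convention are assumptions of this sketch, not part of the
OCFS2 interface:

```shell
#!/bin/sh
# Check an inode, show the result, and fix it only after explicit consent.
# $1: filecheck sysfs directory, e.g. /sys/fs/ocfs2/<devname>/filecheck
# $2: inode number taken from the kernel log
# Returns 0 after a fix request, 1 on a write error, 2 if the user declines.
filecheck_fix_with_consent() {
    dir=$1
    ino=$2

    # Step 1: always check first and show the result table.
    echo "$ino" > "$dir/check" || return 1
    cat "$dir/check"

    # Step 2: ask the user; anything other than "y" aborts without fixing.
    printf 'Fix inode %s? [y/N] ' "$ino"
    read -r answer
    if [ "$answer" != "y" ]; then
        echo "Not fixing inode $ino."
        return 2
    fi

    # Step 3: request the fix and show its result table.
    echo "$ino" > "$dir/fix" || return 1
    cat "$dir/fix"
}
```

Taking the directory as an argument keeps the wrapper independent of the
device name and lets it be tried against a scratch directory first.
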
The inode and the result history are maintained temporarily in a
small linked list buffer which contains the last (N) inodes
fixed/checked. The detailed errors which were fixed/checked are printed in the
kernel log.
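
The number of records kept is the size written to the ``set`` file described
in the user interface section. A minimal sketch of a guard that rejects values
outside the documented range of 10 to 100 before writing is given below; the
function name ``filecheck_set_history`` and the error message are hypothetical:

```shell
#!/bin/sh
# Set the number of check/fix result records kept in the history.
# $1: filecheck sysfs directory, e.g. /sys/fs/ocfs2/<devname>/filecheck
# $2: requested size; must be a plain decimal number from 10 to 100
# Returns non-zero without writing when the size is out of range.
filecheck_set_history() {
    dir=$1
    size=$2
    case $size in
        1[0-9]|[2-9][0-9]|100)
            # 10..99 or exactly 100: within the documented range.
            ;;
        *)
            echo "size must be a number between 10 and 100" >&2
            return 1
            ;;
    esac
    echo "$size" > "$dir/set"
}
```

Matching the value with shell patterns rather than arithmetic avoids surprises
with empty, non-numeric, or very large input.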