^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) .. SPDX-License-Identifier: GPL-2.0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ==========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4) WHAT IS Flash-Friendly File System (F2FS)?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ==========================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) NAND flash memory-based storage devices, such as SSDs, eMMC, and SD cards, are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) used in a variety of systems ranging from mobile devices to servers. Since
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) they are known to have different characteristics from conventional rotating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) disks, a file system, as the layer above the storage device, should be designed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) from scratch to adapt to these differences.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) F2FS is a file system designed for NAND flash memory-based storage devices. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) is based on the Log-structured File System (LFS). The design focuses on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) addressing the fundamental issues in LFS: the snowball effect of the wandering
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) tree and high cleaning overhead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) Since a NAND flash memory-based storage device shows different characteristics
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) depending on its internal geometry or flash memory management scheme, namely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) the FTL, F2FS and its tools support various parameters not only for configuring
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) the on-disk layout, but also for selecting allocation and cleaning algorithms.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) The following git tree provides the file system formatting tool (mkfs.f2fs),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) a consistency checking tool (fsck.f2fs), and a debugging tool (dump.f2fs).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) - git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) For reporting bugs and sending patches, please use the following mailing list:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) - linux-f2fs-devel@lists.sourceforge.net
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) Background and Design issues
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) ============================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) Log-structured File System (LFS)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) --------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) "A log-structured file system writes all modifications to disk sequentially in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) a log-like structure, thereby speeding up both file writing and crash recovery.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) The log is the only structure on disk; it contains indexing information so that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) files can be read back from the log efficiently. In order to maintain large free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) areas on disk for fast writing, we divide the log into segments and use a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) segment cleaner to compress the live information from heavily fragmented
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) segments." from Rosenblum, M. and Ousterhout, J. K., 1992, "The design and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) implementation of a log-structured file system", ACM Trans. Computer Systems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) 10, 1, 26–52.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) Wandering Tree Problem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) In LFS, when file data is updated and written to the end of the log, its direct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) pointer block is updated due to the changed location. Then the indirect pointer
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) block is also updated due to the direct pointer block update. In this manner,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) upper index structures such as the inode, inode map, and checkpoint block are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) also updated recursively. This is called the wandering tree problem [1]. To
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) improve performance, a file system should eliminate or limit this update
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) propagation as much as possible.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) [1] Bityutskiy, A. 2005. JFFS3 design issues. http://www.linux-mtd.infradead.org/
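
The update propagation can be illustrated with a toy model. This is only a
sketch, not LFS or F2FS code: the ``Block`` class and the ``rewrite`` helper
are hypothetical names, and the model assumes that every parent stores the log
address of its child, so moving a child forces the parent to be rewritten too.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    name: str
    parent: Optional["Block"] = None
    addr: int = 0  # current position in the log


def rewrite(block, log_end):
    """Write `block` out of place; return (names rewritten, new log end)."""
    rewritten = []
    while block is not None:
        # An out-of-place write moves the block to the end of the log ...
        block.addr = log_end
        log_end += 1
        rewritten.append(block.name)
        # ... so the parent, which records that address, is rewritten next.
        block = block.parent
    return rewritten, log_end


# Index chain named in the text, from the top structure down to the data.
checkpoint = Block("checkpoint")
inode_map = Block("inode map", checkpoint)
inode = Block("inode", inode_map)
indirect = Block("indirect pointer block", inode)
direct = Block("direct pointer block", indirect)
data = Block("data", direct)

rewritten, log_end = rewrite(data, log_end=100)
print(rewritten)  # one data update rewrites every structure above it
```

In this model a single data block update costs six block writes, which is the
cost that a wandering-tree-aware design tries to avoid.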
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) Cleaning Overhead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) Since LFS is based on out-of-place writes, it produces many obsolete blocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) scattered across the whole storage. To provide new empty log space, it needs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) to reclaim these obsolete blocks transparently to users. This job is called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) the cleaning process.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) The process consists of four operations as follows.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) 1. A victim segment is selected by referencing the segment usage table.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) 2. It loads the parent index structures of all the data in the victim,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) identified by segment summary blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) 3. It checks the cross-reference between the data and its parent index structure.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) 4. It moves valid data selectively.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) This cleaning job may cause unexpectedly long delays, so the most important goal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) is to hide the latencies from users. It should also reduce the amount of valid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) data to be moved, and move that data quickly.
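
The steps of the cleaning process can be sketched as follows. This is a
simplified model rather than the in-kernel cleaner: ``clean_one_segment`` and
its dictionary-based tables are hypothetical, the victim is assumed to be the
in-use segment with the fewest valid blocks, and "moving" a block only means
giving it a new address at the end of the log.

```python
def clean_one_segment(usage, summaries, index, log_end):
    """Clean one segment and return (victim, moved blocks, new log end).

    usage:     {segno: number of valid blocks}            (segment usage table)
    summaries: {segno: [(block_addr, owner, offset)]}     (segment summaries)
    index:     {(owner, offset): current block address}   (parent index)
    """
    # 1. Select a victim by referencing the segment usage table.
    victim = min(usage, key=usage.get)

    moved = []
    for addr, owner, offset in summaries[victim]:
        # 2. The summary entry identifies the parent index entry to load.
        current = index.get((owner, offset))
        # 3. Cross-reference check: the block is valid only if its parent
        #    still points at it. Otherwise it is obsolete and is skipped.
        if current != addr:
            continue
        # 4. Move the valid block to the end of the log.
        index[(owner, offset)] = log_end
        moved.append((addr, log_end))
        log_end += 1

    # The victim no longer holds valid data, so it leaves the in-use table.
    del usage[victim]
    return victim, moved, log_end


usage = {0: 3, 1: 1, 2: 2}
summaries = {1: [(512, "ino7", 0), (513, "ino7", 1), (514, "ino9", 4)]}
index = {("ino7", 0): 900, ("ino7", 1): 513, ("ino9", 4): 777}

victim, moved, log_end = clean_one_segment(usage, summaries, index, log_end=2000)
print(victim, moved, log_end)
```

Only one of the three blocks in the victim is still referenced by its parent,
so only that block is copied. The fewer valid blocks a victim holds, the
cheaper the cleaning pass.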
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Key Features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) Flash Awareness
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) - Enlarge the random write area for better performance, while providing high
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) spatial locality
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) - Align FS data structures to the operational units in the FTL as a best effort
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) Wandering Tree Problem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) - Use a term, “node”, that represents inodes as well as various pointer blocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) - Introduce Node Address Table (NAT) containing the locations of all the “node”
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) blocks; this will cut off the update propagation.
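
A minimal sketch of how an address table can cut off update propagation is
shown below. It is an illustration only: the ``NodeAddressTable`` class, the
``rewrite_node`` helper and the dictionary used for the inode are hypothetical,
and the sketch assumes that a parent refers to a child node by node ID rather
than by block address.

```python
class NodeAddressTable:
    """Maps a node ID to the current block address of that node."""

    def __init__(self):
        self._addr = {}

    def set(self, nid, addr):
        self._addr[nid] = addr

    def lookup(self, nid):
        return self._addr[nid]


def rewrite_node(nat, nid, log_end):
    """Write node `nid` out of place and record its new address in the table."""
    nat.set(nid, log_end)
    return log_end + 1


nat = NodeAddressTable()
nat.set(nid=1, addr=10)  # inode
nat.set(nid=2, addr=11)  # direct pointer block

# The inode refers to its child by node ID, not by block address.
inode = {"nid": 1, "children": [2]}

log_end = rewrite_node(nat, nid=2, log_end=100)

# The child moved, but the inode's content is unchanged, so the inode does
# not need to be rewritten. Only the table entry for node 2 was updated.
print(inode["children"], nat.lookup(1), nat.lookup(2))
```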
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) Cleaning Overhead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) - Support a background cleaning process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) - Support greedy and cost-benefit algorithms for victim selection policies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) - Support multi-head logs for static/dynamic hot and cold data separation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) - Introduce adaptive logging for efficient block allocation
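
The difference between the two victim selection policies can be sketched as
below. The cost-benefit score uses the ratio from the LFS paper quoted earlier,
``(1 - u) * age / (1 + u)``, where ``u`` is the fraction of valid blocks in a
segment. The exact calculation used by F2FS may differ; the function names and
the segment records here are hypothetical.

```python
def utilization(seg):
    """Fraction of blocks in the segment that are still valid."""
    return seg["valid"] / seg["total"]


def greedy(segments):
    """Pick the segment with the lowest utilization."""
    return min(segments, key=utilization)["segno"]


def cost_benefit(segments):
    """Pick the segment with the highest (1 - u) * age / (1 + u) score."""

    def score(seg):
        u = utilization(seg)
        return (1 - u) * seg["age"] / (1 + u)

    return max(segments, key=score)["segno"]


segments = [
    # Recently written and mostly empty: its data may soon become obsolete.
    {"segno": 0, "valid": 100, "total": 512, "age": 1},
    # Fuller but old: its remaining data is likely to stay valid.
    {"segno": 1, "valid": 200, "total": 512, "age": 50},
]

print(greedy(segments), cost_benefit(segments))
```

Greedy selection minimizes the copy work of the current pass, while the
cost-benefit score also takes the age of the data into account.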
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) Mount Options
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) ======================== ============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) background_gc=%s Turn on/off cleaning operations, namely garbage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) collection, triggered in background when I/O subsystem is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) idle. If background_gc=on, it will turn on the garbage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) collection and if background_gc=off, garbage collection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) will be turned off. If background_gc=sync, it will turn
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) on synchronous garbage collection running in background.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The default value for this option is on, so background
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) garbage collection is enabled by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) gc_merge When background_gc is on, this option can be enabled to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) let the background GC thread handle foreground GC requests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) This can eliminate the sluggishness caused by slow foreground
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) GC operation when GC is triggered from a process with limited
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) I/O and CPU resources.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) nogc_merge Disable GC merge feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) disable_roll_forward Disable the roll-forward recovery routine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) norecovery Disable the roll-forward recovery routine and mount
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) read-only (i.e., -o ro,disable_roll_forward)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) discard/nodiscard Enable/disable real-time discard in f2fs. If discard is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) enabled, f2fs will issue discard/TRIM commands when a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) segment is cleaned.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) no_heap Disable heap-style segment allocation which finds free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) segments for data from the beginning of main area, while
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) for node from the end of main area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) nouser_xattr Disable Extended User Attributes. Note: xattr is enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) by default if CONFIG_F2FS_FS_XATTR is selected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) noacl Disable POSIX Access Control List. Note: acl is enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) by default if CONFIG_F2FS_FS_POSIX_ACL is selected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) active_logs=%u Support configuring the number of active logs. In the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) current design, f2fs supports only 2, 4, and 6 logs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) Default number is 6.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) disable_ext_identify Disable the extension list configured by mkfs, so f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) is not aware of cold files such as media files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) inline_xattr Enable the inline xattrs feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) noinline_xattr Disable the inline xattrs feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) inline_xattr_size=%u Support configuring the inline xattr size. It depends on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) the flexible inline xattr feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) inline_data Enable the inline data feature: Newly created small (<~3.4k)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) files can be written into inode block.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) inline_dentry Enable the inline dir feature: data in newly created
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) directory entries can be written into inode block. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) space of inode block which is used to store inline
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) dentries is limited to ~3.4k.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) noinline_dentry Disable the inline dentry feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) flush_merge Merge concurrent cache_flush commands as much as possible
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) to eliminate redundant command issues. If the underlying
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) device handles the cache_flush command relatively slowly,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) enabling this option is recommended.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) nobarrier This option can be used if the underlying storage guarantees
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) that its cached data is written to the nonvolatile area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) If this option is set, no cache_flush commands are issued
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) but f2fs still guarantees the write ordering of all the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) data writes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) fastboot This option is used when a system wants to reduce mount
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) time as much as possible, even though normal performance
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) can be sacrificed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) extent_cache Enable an extent cache based on rb-tree. It can cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) as many extents as possible, each mapping contiguous logical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) addresses to physical addresses per inode, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) increases the cache hit ratio. Set by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) noextent_cache Disable an extent cache based on rb-tree explicitly, see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) the above extent_cache mount option.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) noinline_data Disable the inline data feature. The inline data feature is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) enabled by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) data_flush Enable data flushing before checkpoint in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) persist data of regular files and symlinks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) reserve_root=%d Support configuring reserved space which is used for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) allocation by a privileged user with the specified uid or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) gid. Unit: 4KB. The default limit is 0.2% of user blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) resuid=%d The user ID which may use the reserved blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) resgid=%d The group ID which may use the reserved blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) fault_injection=%d Enable fault injection in all supported types with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) specified injection rate.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) fault_type=%d Support configuring the fault injection type. It should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) enabled with the fault_injection option. Fault type values
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) are shown below; single or combined types are supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) =================== ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) Type_Name Type_Value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) =================== ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) FAULT_KMALLOC 0x000000001
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) FAULT_KVMALLOC 0x000000002
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) FAULT_PAGE_ALLOC 0x000000004
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) FAULT_PAGE_GET 0x000000008
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) FAULT_ALLOC_NID 0x000000020
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) FAULT_ORPHAN 0x000000040
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) FAULT_BLOCK 0x000000080
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) FAULT_DIR_DEPTH 0x000000100
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) FAULT_EVICT_INODE 0x000000200
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) FAULT_TRUNCATE 0x000000400
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) FAULT_READ_IO 0x000000800
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) FAULT_CHECKPOINT 0x000001000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) FAULT_DISCARD 0x000002000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) FAULT_WRITE_IO 0x000004000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) =================== ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) mode=%s Control block allocation mode which supports "adaptive"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) and "lfs". In "lfs" mode, there should be no random
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) writes towards main area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) io_bits=%u Set the bit size of write IO requests. It should be set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) with "mode=lfs".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) usrquota Enable plain user disk quota accounting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) grpquota Enable plain group disk quota accounting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) prjquota Enable plain project quota accounting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) usrjquota=<file> Appoint specified file and type during mount, so that quota
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) grpjquota=<file> information can be properly updated during recovery flow,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) prjjquota=<file> <quota file>: must be in root directory;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) jqfmt=<quota type> <quota type>: [vfsold,vfsv0,vfsv1].
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) offusrjquota Turn off user journalled quota.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) offgrpjquota Turn off group journalled quota.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) offprjjquota Turn off project journalled quota.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) quota Enable plain user disk quota accounting.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) noquota Disable all plain disk quota options.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) whint_mode=%s Control which write hints are passed down to block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) layer. This supports "off", "user-based", and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) "fs-based". In "off" mode (default), f2fs does not pass
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) down hints. In "user-based" mode, f2fs tries to pass
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) down hints given by users. And in "fs-based" mode, f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) passes down hints with its policy.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) alloc_mode=%s Adjust block allocation policy, which supports "reuse"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) and "default".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) fsync_mode=%s Control the policy of fsync. Currently supports "posix",
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) "strict", and "nobarrier". In "posix" mode, which is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) default, fsync will follow POSIX semantics and does a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) light operation to improve the filesystem performance.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) In "strict" mode, fsync will be heavy and behaves in line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) with xfs, ext4 and btrfs, where xfstest generic/342 will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) pass, but the performance will regress. "nobarrier" is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) based on "posix", but doesn't issue flush commands for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) non-atomic files, like the "nobarrier" mount option.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) test_dummy_encryption
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) test_dummy_encryption=%s
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) Enable dummy encryption, which provides a fake fscrypt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) context. The fake fscrypt context is used by xfstests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) The argument may be either "v1" or "v2", in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) select the corresponding fscrypt policy version.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) checkpoint=%s[:%u[%]] Set to "disable" to turn off checkpointing. Set to "enable"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) to re-enable checkpointing. It is enabled by default. While
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) disabled, any unmounting or unexpected shutdowns will cause
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) the filesystem contents to appear as they did when the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) filesystem was mounted with that option.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) While mounting with checkpoint=disable, the filesystem must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) run garbage collection to ensure that all available space can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) be used. If this takes too much time, the mount may return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) EAGAIN. You may optionally add a value to indicate how much
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) of the disk you would be willing to temporarily give up to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) avoid additional garbage collection. This can be given as a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) number of blocks, or as a percent. For instance, mounting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) with checkpoint=disable:100% would always succeed, but it may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) hide up to all remaining free space. The actual space that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) would be unusable can be viewed at /sys/fs/f2fs/<disk>/unusable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) This space is reclaimed once checkpoint=enable.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) checkpoint_merge When checkpoint is enabled, this can be used to create a kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) daemon that merges concurrent checkpoint requests as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) much as possible to eliminate redundant checkpoint issues. It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) can also eliminate the sluggishness caused by a slow checkpoint
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) operation when the checkpoint is done in a process context in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) a cgroup having a low I/O budget and CPU shares. To improve
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) this further, the default I/O priority of the kernel daemon is set
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) to "3", giving it one higher priority than other kernel threads.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) This is the same way the jbd2 journaling thread of the ext4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) filesystem is given its I/O priority.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) nocheckpoint_merge Disable checkpoint merge feature.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) compress_algorithm=%s Control the compression algorithm. Currently f2fs supports
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) the "lzo", "lz4", "zstd" and "lzo-rle" algorithms.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) compress_algorithm=%s:%d Control the compression algorithm and its compression level.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) Currently only "lz4" and "zstd" support a configurable level.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) algorithm level range
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) lz4 3 - 16
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) zstd 1 - 22
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) compress_log_size=%u Support configuring the compress cluster size. The size is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 4KB * (1 << %u). 16KB is the minimum size and also the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) default size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) compress_extension=%s Support adding a specified extension, so that f2fs can enable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) compression on the corresponding files, e.g. if all files
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) with '.ext' have a high compression rate, we can put '.ext'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) on the compression extension list and enable compression on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) these files by default rather than enabling it via ioctl.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) For other files, we can still enable compression via ioctl.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) Note that there is one reserved special extension, '*', which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) can be set to enable compression for all files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) compress_chksum Support verifying chksum of raw data in compressed cluster.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) compress_mode=%s Control file compression mode. This supports "fs" and "user"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) modes. In "fs" mode (default), f2fs does automatic compression
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) on the compression enabled files. In "user" mode, f2fs disables
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) the automatic compression and gives the user discretion of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) choosing the target file and the timing. The user can do manual
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) compression/decompression on the compression enabled files using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) ioctls.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) compress_cache Support using the address space of a filesystem-managed inode to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) cache compressed blocks, in order to improve the cache hit ratio of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) random reads.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) inlinecrypt When possible, encrypt/decrypt the contents of encrypted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) files using the blk-crypto framework rather than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) filesystem-layer encryption. This allows the use of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) inline encryption hardware. The on-disk format is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) unaffected. For more details, see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) Documentation/block/inline-encryption.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) atgc Enable age-threshold garbage collection, which provides high
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) effectiveness and efficiency for background GC.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) ======================== ============================================================
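
Two of the options above take computed values: fault_type is a combination of
the fault type values in the table, and compress_log_size is an exponent rather
than a size. The sketch below shows the arithmetic. The helper names are
hypothetical, only a subset of the fault types is listed, the injection rate of
100 is an arbitrary example value, and no upper bound is checked for
compress_log_size because the table does not state one.

```python
# Fault type values copied from the fault_type table (subset).
FAULT_TYPES = {
    "FAULT_KMALLOC": 0x1,
    "FAULT_PAGE_ALLOC": 0x4,
    "FAULT_READ_IO": 0x800,
    "FAULT_WRITE_IO": 0x4000,
}


def fault_type_mask(*names):
    """OR the selected fault type values into one combined value."""
    mask = 0
    for name in names:
        mask |= FAULT_TYPES[name]
    return mask


def compress_cluster_bytes(log_size):
    """Cluster size in bytes for compress_log_size=log_size: 4KB * (1 << n)."""
    if log_size < 2:
        raise ValueError("values below 2 give less than the 16KB minimum")
    return 4096 * (1 << log_size)


def mount_options(**opts):
    """Join key=value pairs into a comma-separated option string."""
    return ",".join(f"{key}={value}" for key, value in opts.items())


mask = fault_type_mask("FAULT_KMALLOC", "FAULT_PAGE_ALLOC", "FAULT_READ_IO")
print(hex(mask), mask)            # 0x805 2053
print(compress_cluster_bytes(2))  # 16384, the 16KB minimum and default
print(mount_options(fault_injection=100, fault_type=mask))
```

The resulting string is what would follow ``-o`` on a mount command line.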
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) Debugfs Entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) /sys/kernel/debug/f2fs/ contains information about all the partitions mounted as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) f2fs. Each file shows the whole f2fs information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) /sys/kernel/debug/f2fs/status includes:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) - major file system information currently managed by f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) - average SIT information about whole segments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) - current memory footprint consumed by f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) Sysfs Entries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) =============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) Information about mounted f2fs file systems can be found in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) /sys/fs/f2fs. Each mounted filesystem will have a directory in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) /sys/fs/f2fs based on its device name (e.g., /sys/fs/f2fs/sda).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) The files in each per-device directory are listed in the ABI document below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) Files in /sys/fs/f2fs/<devname>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) (see also Documentation/ABI/testing/sysfs-fs-f2fs)
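
As a small illustration, the per-device files can be read like any other text
files. The helper below is a hypothetical convenience function, not part of
f2fs-tools. It assumes each entry holds a short text value, and it skips
entries that cannot be read (for example, because of permissions).

```python
from pathlib import Path


def read_f2fs_sysfs(devname, root="/sys/fs/f2fs"):
    """Return {file name: contents} for the files in <root>/<devname>."""
    dev_dir = Path(root) / devname
    values = {}
    for entry in sorted(dev_dir.iterdir()):
        if not entry.is_file():
            continue
        try:
            values[entry.name] = entry.read_text().strip()
        except (OSError, UnicodeDecodeError):
            continue  # unreadable entry: skip it
    return values


# Only runs on a machine where f2fs is loaded. Every subdirectory is read,
# whether it corresponds to a mounted device or not.
sysfs_root = Path("/sys/fs/f2fs")
if sysfs_root.is_dir():
    for dev in sorted(p.name for p in sysfs_root.iterdir() if p.is_dir()):
        print(dev, read_f2fs_sysfs(dev))
```

The ``root`` parameter exists only so the helper can be pointed at a test
directory instead of the real sysfs tree.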
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) Usage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) =====
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) 1. Download userland tools and compile them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) 2. Skip this step if f2fs was compiled statically into the kernel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) Otherwise, insert the f2fs.ko module::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) # insmod f2fs.ko
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) 3. Create a directory to use when mounting::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) # mkdir /mnt/f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) 4. Format the block device, and then mount as f2fs::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) # mkfs.f2fs -l label /dev/block_device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) # mount -t f2fs /dev/block_device /mnt/f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) mkfs.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) mkfs.f2fs formats a partition as an f2fs filesystem and builds the basic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) on-disk layout.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) The quick options consist of:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) ===============    ===========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) ``-l [label]``     Give a volume label of up to 512 Unicode characters.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) ``-a [0 or 1]``    Split start location of each area for heap-based allocation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358)                    1 is set by default, which performs this.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) ``-o [int]``       Set overprovision ratio in percent over volume size.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361)                    5 is set by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) ``-s [int]``       Set the number of segments per section.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364)                    1 is set by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) ``-z [int]``       Set the number of sections per zone.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367)                    1 is set by default.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) ``-e [str]``       Set basic extension list. e.g. "mp3,gif,mov"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) ``-t [0 or 1]``    Enable or disable the discard command.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371)                    1 is set by default, which issues discard.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) ===============    ===========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) Note: please refer to the manpage of mkfs.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) fsck.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) fsck.f2fs checks the consistency of an f2fs-formatted partition. It examines
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) whether the filesystem metadata and user-made data are cross-referenced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) correctly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) Note that the initial version of the tool does not fix any inconsistency.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) The quick options consist of::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385)   -d debug level [default:0]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) Note: please refer to the manpage of fsck.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) dump.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) dump.f2fs shows the information of a specific inode and dumps the SSA and SIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) to files named dump_ssa and dump_sit, respectively.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) dump.f2fs is used to debug on-disk data structures of the f2fs filesystem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) It shows the on-disk information of the inode identified by a given inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) number, and can dump all the SSA and SIT entries into the predefined files
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) ./dump_ssa and ./dump_sit, respectively.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) The options consist of::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401)   -d debug level [default:0]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402)   -i inode no (hex)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403)   -s [SIT dump segno from #1~#2 (decimal), for all 0~-1]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404)   -a [SSA dump segno from #1~#2 (decimal), for all 0~-1]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) Examples::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408)     # dump.f2fs -i [ino] /dev/sdx
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409)     # dump.f2fs -s 0~-1 /dev/sdx (SIT dump)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410)     # dump.f2fs -a 0~-1 /dev/sdx (SSA dump)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) Note: please refer to the manpage of dump.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) sload.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) ----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) sload.f2fs inserts files and directories into an existing disk image.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) This tool is useful when building f2fs images from compiled files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) Note: please refer to the manpage of sload.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) resize.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) resize.f2fs lets a user resize an f2fs-formatted disk image while preserving
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) all the files and directories stored in the image.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) Note: please refer to the manpage of resize.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) defrag.f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) defrag.f2fs can be used to defragment scattered data as well as filesystem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) metadata across the disk. This can improve the write speed by providing more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) consecutive free space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) Note: please refer to the manpage of defrag.f2fs(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) f2fs_io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) -------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) f2fs_io is a simple tool for issuing various filesystem API calls, including
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) f2fs-specific ones, which is very useful for QA tests.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) Note: please refer to the manpage of f2fs_io(8) for the full option list.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) Design
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) ======
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) On-disk Layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) F2FS divides the whole volume into a number of segments, each of which is fixed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450) to 2MB in size. A section is composed of consecutive segments, and a zone
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) consists of a set of sections. By default, the section and zone sizes both equal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452) one segment size, but users can easily modify the sizes with mkfs.
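
As a rough illustration of how the section and zone sizes follow from the
segment size, the sketch below multiplies them out. The function and parameter
names are hypothetical; the two parameters are assumed to correspond to the
segments-per-section and sections-per-zone values chosen at mkfs time.

```python
SEGMENT_SIZE_MB = 2  # fixed segment size stated above


def layout_units(segs_per_sec=1, secs_per_zone=1):
    """Return (section size, zone size) in MB for the given layout parameters."""
    section_mb = SEGMENT_SIZE_MB * segs_per_sec
    zone_mb = section_mb * secs_per_zone
    return section_mb, zone_mb


print(layout_units())      # defaults: section and zone both equal one segment
print(layout_units(4, 2))  # 4 segments per section, 2 sections per zone
```

With the defaults, both values come out as one segment, which matches the
default layout described above.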
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454) F2FS splits the entire volume into six areas, and all the areas except superblock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) consist of multiple segments as described below::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457)                                             align with the zone size <-|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458)                  |-> align with the segment size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459)      _________________________________________________________________________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460)     |            |            |   Segment   |    Node     |   Segment  |      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461)     | Superblock | Checkpoint |    Info.    |   Address   |   Summary  | Main |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462)     |    (SB)    |   (CP)     | Table (SIT) | Table (NAT) | Area (SSA) |      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463)     |____________|_____2______|______N______|______N______|______N_____|__N___|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464)                                                                        .      .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465)                                                              .                .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466)                                                  .                            .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467)                                     ._________________________________________.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468)                                     |_Segment_|_..._|_Segment_|_..._|_Segment_|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469)                                     .         .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470)                                     ._________._________
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471)                                     |_section_|__...__|_
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472)                                     .        .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473)                                     .________.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474)                                     |__zone__|
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) - Superblock (SB)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477)    It is located at the beginning of the partition, and two copies exist to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478)    guard against a file system crash. It contains basic partition information
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479)    and some default parameters of f2fs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481) - Checkpoint (CP)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482)    It contains file system information, bitmaps for valid NAT/SIT sets, orphan
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483)    inode lists, and summary entries of the currently active segments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485) - Segment Information Table (SIT)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486)    It contains segment information such as the valid block count and a bitmap
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487)    for the validity of all the blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) - Node Address Table (NAT)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490)    It is composed of a block address table for all the node blocks stored in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491)    the Main area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493) - Segment Summary Area (SSA)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494)    It contains summary entries which contain the owner information of all the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495)    data and node blocks stored in the Main area.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) - Main Area
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498)    It contains file and directory data including their indices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500) In order to avoid misalignment between the file system and flash-based storage,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) F2FS aligns the start block address of CP with the segment size. Also, it aligns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502) the start block address of the Main area with the zone size by reserving some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) segments in the SSA area.
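
The alignment rule can be pictured as rounding a start block address up to a
multiple of the unit size. This is only a sketch: it assumes 4KB blocks (so a
2MB segment holds 512 blocks), and the helper names are hypothetical rather
than taken from mkfs.f2fs.

```python
BLOCK_SIZE = 4 * 1024                 # assumed block size
SEGMENT_SIZE = 2 * 1024 * 1024        # 2MB segment, as stated above
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE


def align_up(block_addr, unit_blocks):
    """Round block_addr up to the next multiple of unit_blocks."""
    return (block_addr + unit_blocks - 1) // unit_blocks * unit_blocks


def main_area_padding(unaligned_main_start, segs_per_sec=1, secs_per_zone=1):
    """Blocks to reserve before Main so that it starts on a zone boundary."""
    zone_blocks = BLOCKS_PER_SEGMENT * segs_per_sec * secs_per_zone
    return align_up(unaligned_main_start, zone_blocks) - unaligned_main_start


print(align_up(1, BLOCKS_PER_SEGMENT))
print(main_area_padding(3 * BLOCKS_PER_SEGMENT, segs_per_sec=2, secs_per_zone=2))
```

In this model the padding is always a whole number of segments whenever the
unaligned start is itself segment-aligned.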
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) Refer to the following survey for additional technical details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) https://wiki.linaro.org/WorkingGroups/Kernel/Projects/FlashCardSurvey
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) File System Metadata Structure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509) ------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) F2FS adopts the checkpointing scheme to maintain file system consistency. At
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512) mount time, F2FS first tries to find the last valid checkpoint data by scanning
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) the CP area. In order to reduce the scanning time, F2FS uses only two copies of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) CP. One of them always indicates the last valid data, which is called the shadow
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) copy mechanism. In addition to CP, NAT and SIT also adopt the shadow copy mechanism.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) For file system consistency, each CP indicates which NAT and SIT copies are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518) valid, as shown below::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520)   +--------+----------+---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521)   |   CP   |    SIT   |   NAT   |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522)   +--------+----------+---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523)   .         .          .          .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524)   .            .              .              .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525)   .               .                 .                 .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526)   +-------+-------+--------+--------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527)   | CP #0 | CP #1 | SIT #0 | SIT #1 | NAT #0 | NAT #1 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528)   +-------+-------+--------+--------+--------+--------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529)      |             ^                          ^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530)      |             |                          |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531)      `----------------------------------------'
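
The mount-time choice between the two CP copies can be modelled roughly as
follows. This is a simplified sketch, not the on-disk format: the ``version``
and ``valid`` fields are hypothetical stand-ins for whatever F2FS uses to
order and verify checkpoints, and a single index per table replaces the
bitmaps for valid NAT/SIT sets mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckpointPack:
    version: int    # hypothetical counter that grows with every checkpoint
    valid: bool     # hypothetical result of an integrity check
    sit_copy: int   # which SIT copy (0 or 1) this checkpoint marks as valid
    nat_copy: int   # which NAT copy (0 or 1) this checkpoint marks as valid


def pick_checkpoint(cp0: CheckpointPack, cp1: CheckpointPack) -> Optional[CheckpointPack]:
    """Return the newest valid checkpoint, or None if neither copy is usable."""
    candidates = [cp for cp in (cp0, cp1) if cp.valid]
    if not candidates:
        return None
    return max(candidates, key=lambda cp: cp.version)


older = CheckpointPack(version=7, valid=True, sit_copy=0, nat_copy=1)
torn = CheckpointPack(version=8, valid=False, sit_copy=1, nat_copy=0)
chosen = pick_checkpoint(older, torn)
print(chosen.sit_copy, chosen.nat_copy)
```

Because only two copies exist, an interrupted write of the newer copy simply
leaves the older one as the last valid checkpoint.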
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533) Index Structure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) The key data structure to manage the data locations is a "node". Similar to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537) traditional file structures, F2FS has three types of nodes: inode, direct node,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538) and indirect node. F2FS assigns 4KB to an inode block which contains 923 data
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539) block indices, two direct node pointers, two indirect node pointers, and one
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) double indirect node pointer as described below. One direct node block points to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541) 1018 data blocks, and one indirect node block likewise points to 1018 node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) blocks. Thus, one inode block (i.e., a file) covers::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544)   4KB * (923 + 2 * 1018 + 2 * 1018 * 1018 + 1018 * 1018 * 1018) := 3.94TB.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546)   Inode block (4KB)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547)     |- data (923)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548)     |- direct node (2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549)     |          `- data (1018)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550)     |- indirect node (2)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551)     |            `- direct node (1018)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552)     |                       `- data (1018)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553)     `- double indirect node (1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554)                         `- indirect node (1018)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555)                                      `- direct node (1018)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556)                                                 `- data (1018)
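
The coverage figure can be reproduced with a few lines of arithmetic. The
sketch below uses only the counts given in the text; the constant names are
illustrative, and the result is expressed in binary units (2^40 bytes), which
appears to be what "TB" means here.

```python
BLOCK_SIZE = 4 * 1024    # 4KB block
ADDRS_PER_INODE = 923    # data block indices held directly in the inode
ADDRS_PER_NODE = 1018    # entries per direct or indirect node block


def max_file_blocks():
    """Number of data blocks reachable from one inode block."""
    direct = 2 * ADDRS_PER_NODE
    indirect = 2 * ADDRS_PER_NODE ** 2
    double_indirect = ADDRS_PER_NODE ** 3
    return ADDRS_PER_INODE + direct + indirect + double_indirect


blocks = max_file_blocks()
size_tib = blocks * BLOCK_SIZE / 2 ** 40
print(blocks, round(size_tib, 2))
```

The double indirect term dominates: it contributes over 99% of the reachable
blocks.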
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) Note that all the node blocks are mapped by NAT, which means the location of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559) each node is translated by the NAT. This addresses the wandering tree
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) problem: F2FS is able to cut off the propagation of node updates caused by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561) leaf data writes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563) Directory Structure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 565)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 566) A directory entry occupies 11 bytes and consists of the following attributes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 567)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 568) - hash    hash value of the file name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 569) - ino     inode number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 570) - len     the length of the file name
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 571) - type    file type such as directory, symlink, etc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 572)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 573) A dentry block consists of 214 dentry slots and file names. It also holds a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 574) bitmap that records whether each dentry is valid or not. A dentry block occupies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 575) 4KB with the following composition.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 576)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 577) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 578)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 579)   Dentry Block(4 K) = bitmap (27 bytes) + reserved (3 bytes) +
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 580)                       dentries(11 * 214 bytes) + file name (8 * 214 bytes)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 581)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 582)                          [Bucket]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 583)              +--------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 584)              |dentry block 1 | dentry block 2 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 585)              +--------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 586)              .               .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 587)        .                             .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 588)   .       [Dentry Block Structure: 4KB]       .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 589)   +--------+----------+----------+------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 590)   | bitmap | reserved | dentries | file names |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 591)   +--------+----------+----------+------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 592)   [Dentry Block: 4KB] .   .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 593)                  .               .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 594)             .                          .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 595)             +------+------+-----+------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 596)             | hash | ino  | len | type |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 597)             +------+------+-----+------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 598)             [Dentry Structure: 11 bytes]
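
The composition above can be checked numerically. The sketch below adds up the
four parts using only the figures given in the text; the constant and function
names are illustrative. The ``slots_for_name`` helper assumes that a file name
longer than one 8-byte slot simply spills into the following slots.

```python
NR_DENTRY_IN_BLOCK = 214  # dentry slots per block
SIZE_OF_DIR_ENTRY = 11    # bytes per dentry
SLOT_LEN = 8              # bytes of file name storage per slot
BITMAP_BYTES = 27
RESERVED_BYTES = 3


def dentry_block_size():
    """Total bytes in one dentry block, summed from its four parts."""
    return (BITMAP_BYTES + RESERVED_BYTES
            + SIZE_OF_DIR_ENTRY * NR_DENTRY_IN_BLOCK
            + SLOT_LEN * NR_DENTRY_IN_BLOCK)


def slots_for_name(name_len):
    """Consecutive slots needed for a file name of name_len bytes (assumed rule)."""
    return (name_len + SLOT_LEN - 1) // SLOT_LEN


print(dentry_block_size())
print(BITMAP_BYTES * 8 >= NR_DENTRY_IN_BLOCK)
print(slots_for_name(20))
```

The sum is exactly 4096 bytes, and the 27-byte bitmap provides 216 bits, enough
for one validity bit per slot.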
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 599)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 600) F2FS implements multi-level hash tables for the directory structure. Each level
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 601) has a hash table with a dedicated number of hash buckets as shown below. Note
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 602) that "A(2B)" means a bucket includes 2 data blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 603)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 604) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 605)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 606)     ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 607)     A : bucket
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 608)     B : block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 609)     N : MAX_DIR_HASH_DEPTH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 610)     ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 611)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 612)     level #0   | A(2B)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 613)                |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 614)     level #1   | A(2B) - A(2B)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 615)                |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 616)     level #2   | A(2B) - A(2B) - A(2B) - A(2B)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 617)          .     |   .       .       .       .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 618)     level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 619)          .     |   .       .       .       .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 620)     level #N   | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 621)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 622) The numbers of blocks and buckets are determined by::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 623)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 624)                               ,- 2, if n < MAX_DIR_HASH_DEPTH / 2,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 625)     # of blocks in level #n = |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 626)                               `- 4, Otherwise
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 627)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 628)                                ,- 2^(n + dir_level),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 629)                                |        if n + dir_level < MAX_DIR_HASH_DEPTH / 2,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 630)     # of buckets in level #n = |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 631)                                `- 2^((MAX_DIR_HASH_DEPTH / 2) - 1),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 632)                                         Otherwise
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 633)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 634) When F2FS looks up a file name in a directory, it first calculates a hash value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 635) of the file name. Then, F2FS scans the hash table in level #0 to find the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 636) dentry consisting of the file name and its inode number. If not found, F2FS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 637) scans the next hash table in level #1. In this way, F2FS scans the hash table in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 638) each level incrementally from 1 to N. In each level F2FS needs to scan only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 639) one bucket determined by the following equation, which shows O(log(# of files))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 640) complexity::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 641)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 642)   bucket number to scan in level #n = (hash value) % (# of buckets in level #n)
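
The two rules and the bucket equation translate directly into code. The sketch
below is illustrative only: the value 63 for MAX_DIR_HASH_DEPTH is an
assumption (the text only names the constant), the function names are
hypothetical, and the block count is read as blocks per bucket, following the
"A(2B)" notation.

```python
MAX_DIR_HASH_DEPTH = 63  # assumed value; the text only names the constant


def blocks_per_bucket(level):
    """Blocks in each bucket of the given level."""
    return 2 if level < MAX_DIR_HASH_DEPTH // 2 else 4


def buckets_in_level(level, dir_level=0):
    """Number of hash buckets in the given level."""
    if level + dir_level < MAX_DIR_HASH_DEPTH // 2:
        return 1 << (level + dir_level)
    return 1 << (MAX_DIR_HASH_DEPTH // 2 - 1)


def bucket_to_scan(name_hash, level, dir_level=0):
    """Index of the single bucket that a lookup scans in the given level."""
    return name_hash % buckets_in_level(level, dir_level)


for level in range(4):
    print(level, buckets_in_level(level), bucket_to_scan(0x12345679, level))
```

Since each level contributes exactly one bucket to a lookup, the work grows
with the number of levels rather than with the number of entries.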
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 643)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 644) In the case of file creation, F2FS finds empty consecutive slots that cover the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 645) file name. F2FS searches the empty slots in the hash tables of all levels from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 646) 1 to N in the same way as the lookup operation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 647)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 648) The following figure shows an example of two cases holding children::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 649)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 650)        --------------> Dir <--------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 651)        |                                 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 652)     child                             child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 653)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 654)     child - child                     [hole] - child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 655)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 656)     child - child - child             [hole] - [hole] - child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 657)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 658)    Case 1:                           Case 2:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 659)    Number of children = 6,           Number of children = 3,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 660)    File size = 7                     File size = 7
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 661)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 662) Default Block Allocation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 663) ------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 664)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 665) At runtime, F2FS manages six active logs inside "Main" area: Hot/Warm/Cold node
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 666) and Hot/Warm/Cold data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 667)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 668) - Hot node contains direct node blocks of directories.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 669) - Warm node contains direct node blocks except hot node blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 670) - Cold node contains indirect node blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 671) - Hot data contains dentry blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 672) - Warm data contains data blocks except hot and cold data blocks.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 673) - Cold data contains multimedia data or migrated data blocks.
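
The six-way split above can be written as two small decision functions. This
is a sketch of the classification as listed, not the kernel's code; the
function names and boolean parameters are hypothetical.

```python
def classify_node(is_direct, belongs_to_dir):
    """Pick the node log for a node block, following the list above."""
    if not is_direct:
        return "COLD_NODE"  # indirect node blocks
    return "HOT_NODE" if belongs_to_dir else "WARM_NODE"


def classify_data(is_dentry, is_multimedia=False, is_migrated=False):
    """Pick the data log for a data block, following the list above."""
    if is_dentry:
        return "HOT_DATA"
    if is_multimedia or is_migrated:
        return "COLD_DATA"
    return "WARM_DATA"


print(classify_node(is_direct=True, belongs_to_dir=True))
print(classify_data(is_dentry=False, is_multimedia=True))
```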
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 674)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 675) LFS has two schemes for free space management: threaded log and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 676) copy-and-compaction. The copy-and-compaction scheme, which is known as cleaning,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 677) is well-suited for devices showing very good sequential write performance, since
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 678) free segments are always available for writing new data. However, it suffers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 679) from cleaning overhead under high utilization. In contrast, the threaded log
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 680) scheme suffers from random writes, but needs no cleaning process. F2FS adopts a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 681) hybrid scheme where the copy-and-compaction scheme is adopted by default, but
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 682) the policy is dynamically changed to the threaded log scheme according to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 683) file system status.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 684)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 685) In order to align F2FS with underlying flash-based storage, F2FS allocates
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 686) segments in units of sections. F2FS expects the section size to be the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 687) same as the unit size of garbage collection in FTL. Furthermore, with respect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 688) to the mapping granularity in FTL, F2FS allocates each section of the active
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 689) logs from different zones as much as possible, since FTL can write the data in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 690) the active logs into one allocation unit according to its mapping granularity.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 691)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 692) Cleaning process
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 693) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 694)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 695) F2FS does cleaning both on demand and in the background. On-demand cleaning is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 696) triggered when there are not enough free segments to serve VFS calls. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 697) background cleaner is run by a kernel thread, and triggers the cleaning job when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 698) the system is idle.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 699)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 700) F2FS supports two victim selection policies: greedy and cost-benefit algorithms.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 701) In the greedy algorithm, F2FS selects a victim segment having the smallest number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 702) of valid blocks. In the cost-benefit algorithm, F2FS selects a victim segment
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 703) according to the segment age and the number of valid blocks in order to address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 704) the log block thrashing problem in the greedy algorithm. F2FS adopts the greedy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 705) algorithm for the on-demand cleaner, while the background cleaner adopts the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 706) cost-benefit algorithm.
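
The difference between the two policies can be seen in a toy model. The greedy
function follows the description above directly. The cost-benefit score is an
assumption: it uses the classic LFS benefit/cost ratio, (1 - u) * age / (1 + u),
because the text does not give the formula F2FS applies. The segment records
and their keys are likewise hypothetical.

```python
def greedy_victim(segments):
    """Pick the segment with the fewest valid blocks."""
    return min(segments, key=lambda seg: seg["valid_blocks"])


def cost_benefit_victim(segments, blocks_per_segment=512):
    """Pick the segment with the best (assumed) benefit/cost score."""
    def score(seg):
        u = seg["valid_blocks"] / blocks_per_segment  # utilization in [0, 1]
        return (1 - u) * seg["age"] / (1 + u)
    return max(segments, key=score)


segments = [
    {"id": 0, "valid_blocks": 100, "age": 1},   # nearly empty but just written
    {"id": 1, "valid_blocks": 150, "age": 50},  # slightly fuller but much older
]
print(greedy_victim(segments)["id"], cost_benefit_victim(segments)["id"])
```

In this example the greedy policy picks the young segment, whose remaining
blocks may soon be invalidated anyway, while the age-aware score prefers the
older one.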
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 707)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 708) In order to identify whether the data in the victim segment are valid or not,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 709) F2FS manages a bitmap. Each bit represents the validity of a block, and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 710) bitmap is composed of a bit stream covering all the blocks in the Main area.
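
A validity bitmap of this kind can be sketched as one bit per block. The class
below is a minimal illustration with hypothetical names; the bit order within
each byte is an arbitrary choice here and may not match the on-disk format.

```python
class ValidityBitmap:
    """One bit per block; a set bit means the block holds valid data."""

    def __init__(self, nblocks):
        self.bits = bytearray((nblocks + 7) // 8)

    def set_valid(self, blk, valid=True):
        byte, bit = divmod(blk, 8)
        if valid:
            self.bits[byte] |= 1 << bit
        else:
            self.bits[byte] &= ~(1 << bit) & 0xFF

    def is_valid(self, blk):
        byte, bit = divmod(blk, 8)
        return bool(self.bits[byte] & (1 << bit))

    def valid_count(self, start, count):
        """Number of valid blocks in [start, start + count)."""
        return sum(self.is_valid(b) for b in range(start, start + count))


bitmap = ValidityBitmap(1024)
for blk in (3, 511, 512):
    bitmap.set_valid(blk)
print(bitmap.valid_count(0, 512))
```

Counting the set bits over one segment's range gives the valid block count
that victim selection relies on.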
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 711)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 712) Write-hint Policy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 713) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 714)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 715) 1) whint_mode=off. F2FS only passes down WRITE_LIFE_NOT_SET.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 716)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 717) 2) whint_mode=user-based. F2FS tries to pass down hints given by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 718) users.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 719)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 720) ===================== ======================== ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 721) User                  F2FS                     Block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 722) ===================== ======================== ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 723) N/A                   META                     WRITE_LIFE_NOT_SET
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 724) N/A                   HOT_NODE                 "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 725) N/A                   WARM_NODE                "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 726) N/A                   COLD_NODE                "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 727) ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 728) extension list        "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 729)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 730) -- buffered io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 731) WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 732) WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 733) WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 734) WRITE_LIFE_NONE       "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 735) WRITE_LIFE_MEDIUM     "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 736) WRITE_LIFE_LONG       "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 737)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 738) -- direct io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 739) WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 740) WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 741) WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 742) WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 743) WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 744) WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 745) ===================== ======================== ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 746)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 747) 3) whint_mode=fs-based. F2FS passes down hints with its policy.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 748)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 749) ===================== ======================== ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 750) User                  F2FS                     Block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 751) ===================== ======================== ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 752) N/A                   META                     WRITE_LIFE_MEDIUM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 753) N/A                   HOT_NODE                 WRITE_LIFE_NOT_SET
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 754) N/A                   WARM_NODE                "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 755) N/A                   COLD_NODE                WRITE_LIFE_NONE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 756) ioctl(COLD)           COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 757) extension list        "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 758)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 759) -- buffered io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 760) WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 761) WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 762) WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_LONG
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 763) WRITE_LIFE_NONE       "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 764) WRITE_LIFE_MEDIUM     "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 765) WRITE_LIFE_LONG       "                        "
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 766)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 767) -- direct io
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 768) WRITE_LIFE_EXTREME    COLD_DATA                WRITE_LIFE_EXTREME
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 769) WRITE_LIFE_SHORT      HOT_DATA                 WRITE_LIFE_SHORT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 770) WRITE_LIFE_NOT_SET    WARM_DATA                WRITE_LIFE_NOT_SET
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 771) WRITE_LIFE_NONE       "                        WRITE_LIFE_NONE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 772) WRITE_LIFE_MEDIUM     "                        WRITE_LIFE_MEDIUM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 773) WRITE_LIFE_LONG       "                        WRITE_LIFE_LONG
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 774) ===================== ======================== ===================
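
The fs-based table above can be read as a two-step lookup: the user hint
selects a data temperature, and the temperature selects the hint passed to the
block layer. The following is a minimal sketch of that reading, not the kernel
implementation; the enum and function names are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; the kernel uses its own types. */
enum f2fs_temp { T_META, T_HOT_NODE, T_WARM_NODE, T_COLD_NODE,
                 T_HOT_DATA, T_WARM_DATA, T_COLD_DATA };
enum life_hint { LIFE_NOT_SET, LIFE_NONE, LIFE_SHORT,
                 LIFE_MEDIUM, LIFE_LONG, LIFE_EXTREME };

/* User hint -> data temperature ("User" and "F2FS" columns). */
static enum f2fs_temp data_temp_from_hint(enum life_hint user)
{
    if (user == LIFE_EXTREME)
        return T_COLD_DATA;
    if (user == LIFE_SHORT)
        return T_HOT_DATA;
    return T_WARM_DATA;   /* NOT_SET, NONE, MEDIUM and LONG */
}

/* Temperature -> block-layer hint for whint_mode=fs-based ("Block" column). */
static enum life_hint fs_based_block_hint(enum f2fs_temp temp,
                                          enum life_hint user, bool direct_io)
{
    switch (temp) {
    case T_META:      return LIFE_MEDIUM;
    case T_HOT_NODE:
    case T_WARM_NODE: return LIFE_NOT_SET;
    case T_COLD_NODE: return LIFE_NONE;
    case T_COLD_DATA: return LIFE_EXTREME;
    case T_HOT_DATA:  return LIFE_SHORT;
    case T_WARM_DATA:
        /* Direct IO passes the user hint through; buffered IO uses LONG. */
        return direct_io ? user : LIFE_LONG;
    }
    assert(!"unknown temperature");
    return LIFE_NOT_SET;
}
```

For example, a buffered write with WRITE_LIFE_NOT_SET lands in WARM_DATA and is
passed down as WRITE_LIFE_LONG, while the same write issued as direct IO keeps
WRITE_LIFE_NOT_SET.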
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 775)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 776) Fallocate(2) Policy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 777) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 778)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 779) The default policy follows the POSIX rule below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 780)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 781) Allocating disk space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 782) The default operation (i.e., mode is zero) of fallocate() allocates
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 783) the disk space within the range specified by offset and len. The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 784) file size (as reported by stat(2)) will be changed if offset+len is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 785) greater than the file size. Any subregion within the range specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 786) by offset and len that did not contain data before the call will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 787) initialized to zero. This default behavior closely resembles the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 788) behavior of the posix_fallocate(3) library function, and is intended
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 789) as a method of optimally implementing that function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 790)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 791) However, once F2FS receives ioctl(fd, F2FS_IOC_SET_PIN_FILE) prior to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 792) fallocate(fd, DEFAULT_MODE), it allocates on-disk block addresses having
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 793) zero or random data, which is useful in the scenario below:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 794)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 795) 1. create(fd)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 796) 2. ioctl(fd, F2FS_IOC_SET_PIN_FILE)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 797) 3. fallocate(fd, 0, 0, size)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 798) 4. address = fibmap(fd, offset)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 799) 5. open(blkdev)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 800) 6. write(blkdev, address)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 801)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 802) Compression implementation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 803) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 804)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 805) - A new term, cluster, is defined as the basic unit of compression; a file can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 806) be divided logically into multiple clusters. One cluster includes 4 << n
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 807) (n >= 0) logical pages, the compression size is also the cluster size, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 808) each cluster can be either compressed or not.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 809)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 810) - In the cluster metadata layout, one special block address indicates whether
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 811) a cluster is compressed or normal. For a compressed cluster, the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 812) metadata maps the cluster to [1, (4 << n) - 1] physical blocks, where f2fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 813) stores data including the compress header and the compressed data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 814)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 815) - In order to eliminate write amplification during overwrite, F2FS only
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 816) supports compression on write-once files. Data can be compressed only when
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 817) all logical blocks in the cluster contain valid data and the compress ratio
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 818) of the cluster data is lower than the specified threshold.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 819)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 820) - To enable compression on a regular inode, there are four ways:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 821)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 822) * chattr +c file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 823) * chattr +c dir; touch dir/file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 824) * mount w/ -o compress_extension=ext; touch file.ext
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 825) * mount w/ -o compress_extension=*; touch any_file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 826)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 827) - At this point, the compression feature doesn't expose compressed space to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 828) user directly, in order to guarantee room for later data updates to that space.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 829) Instead, the main goal is to reduce data writes to the flash disk as much as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 830) possible, which extends disk lifetime and relaxes IO congestion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 831) Alternatively, an ioctl interface has been added to reclaim compressed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 832) space and show it to the user after setting the immutable bit.
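
The cluster arithmetic described in the list above can be sketched as follows.
This is an illustrative sketch only: the function names are hypothetical, and
the header size and block size are left as parameters rather than taken from
the on-disk format. Note that in C, ``4 << n - 1`` parses as ``4 << (n - 1)``,
so the upper bound has to be written as ``(4 << n) - 1``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical pages per cluster: 4 << n, with n >= 0. */
static uint32_t cluster_pages(unsigned int n)
{
    assert(n < 29);               /* keep the shift within 32 bits */
    return 4u << n;
}

/* Largest number of physical blocks a compressed cluster may occupy. */
static uint32_t max_compressed_blocks(unsigned int n)
{
    return cluster_pages(n) - 1;  /* (4 << n) - 1 */
}

/* Blocks needed to store the compress header plus the compressed data. */
static uint32_t blocks_needed(uint32_t compressed_bytes,
                              uint32_t header_bytes, uint32_t block_size)
{
    assert(block_size > 0);
    uint64_t total = (uint64_t)compressed_bytes + header_bytes;
    return (uint32_t)((total + block_size - 1) / block_size);
}

/*
 * Assumed check for illustration: a cluster is worth storing compressed
 * only if it fits in [1, (4 << n) - 1] blocks, i.e. saves at least one.
 */
static bool fits_compressed(unsigned int n, uint32_t compressed_bytes,
                            uint32_t header_bytes, uint32_t block_size)
{
    uint32_t need = blocks_needed(compressed_bytes, header_bytes, block_size);
    return need >= 1 && need <= max_compressed_blocks(n);
}
```

With n = 0 and 4096-byte blocks, a cluster holds four pages, so compressed
output plus header must fit in at most three blocks to be kept compressed.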
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 833)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 834) Compress metadata layout::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 835)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 836)                                 [Dnode Structure]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 837)                 +-----------------------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 838)                 | cluster 1 | cluster 2 | ......... | cluster N |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 839)                 +-----------------------------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 840)                 .           .                       .           .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 841)           .                         .             .                         .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 842)     .           Compressed Cluster           .  .            Normal Cluster             .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 843)     +----------+---------+---------+---------+  +---------+---------+---------+---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 844)     |compr flag| block 1 | block 2 | block 3 |  | block 1 | block 2 | block 3 | block 4 |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 845)     +----------+---------+---------+---------+  +---------+---------+---------+---------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 846)                .                             .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 847)             .                                               .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 848)         .                                                                   .
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 849)         +-------------+-------------+----------+----------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 850)         | data length | data chksum | reserved |      compressed data       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 851)         +-------------+-------------+----------+----------------------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 852)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 853) Compression mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 854) ----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 855)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 856) f2fs supports "fs" and "user" compression modes with the "compress_mode" mount option.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 857) This option selects how compression-enabled files are compressed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 858) (refer to the "Compression implementation" section for how to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 859) enable compression on a regular inode).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 860)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 861) 1) compress_mode=fs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 862) This is the default option. f2fs automatically compresses compression-enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 863) files during writeback.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 864)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 865) 2) compress_mode=user
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 866) This disables automatic compression and lets the user choose the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 867) target file and the timing. The user can manually compress or decompress
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 868) compression-enabled files using the F2FS_IOC_DECOMPRESS_FILE and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 869) F2FS_IOC_COMPRESS_FILE ioctls, as shown below.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 870)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 871) To decompress a file,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 872)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 873) fd = open(filename, O_WRONLY, 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 874) ret = ioctl(fd, F2FS_IOC_DECOMPRESS_FILE);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 875)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 876) To compress a file,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 877)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 878) fd = open(filename, O_WRONLY, 0);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 879) ret = ioctl(fd, F2FS_IOC_COMPRESS_FILE);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 880)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 881) NVMe Zoned Namespace devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 882) ----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 883)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 884) - ZNS defines a per-zone capacity which can be equal to or less than the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 885) zone-size. Zone-capacity is the number of usable blocks in the zone.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 886) F2FS checks whether zone-capacity is less than zone-size; if it is, any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 887) segment which starts after the zone-capacity is marked as not-free in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 888) the free segment bitmap at initial mount time. These segments are marked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 889) as permanently used so they are not allocated for writes and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 890) consequently do not need to be garbage collected. If the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 891) zone-capacity is not aligned to the default segment size (2MB), a segment
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 892) can start before the zone-capacity and span across the zone-capacity boundary.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 893) Such spanning segments are also considered usable segments. All blocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 894) past the zone-capacity are considered unusable in these segments.
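
The zone-capacity rule above comes down to simple arithmetic on segment
offsets within a zone. The sketch below illustrates it under stated
assumptions: the helper names are hypothetical, sizes are counted in blocks,
and the example assumes 4KB blocks so that a 2MB segment holds 512 blocks.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Usable blocks in the segment with index seg_idx inside a zone.
 * A segment starting at or after zone-capacity is unusable; a segment
 * spanning the boundary is usable only up to zone-capacity.
 */
static uint64_t usable_blocks_in_segment(uint64_t seg_idx,
                                         uint64_t zone_cap_blks,
                                         uint64_t blks_per_seg)
{
    assert(blks_per_seg > 0);
    uint64_t start = seg_idx * blks_per_seg;

    if (start >= zone_cap_blks)
        return 0;                          /* marked not-free at mount */
    if (start + blks_per_seg <= zone_cap_blks)
        return blks_per_seg;               /* fully below the capacity */
    return zone_cap_blks - start;          /* spanning segment */
}

/* Number of segments in a zone that start before zone-capacity. */
static uint64_t usable_segments(uint64_t zone_cap_blks, uint64_t blks_per_seg)
{
    assert(blks_per_seg > 0);
    return (zone_cap_blks + blks_per_seg - 1) / blks_per_seg;
}
```

For a zone-capacity of 1300 blocks and 512 blocks per segment, segments 0 and
1 are fully usable, segment 2 spans the boundary with 276 usable blocks, and
segment 3 onward is treated as permanently used.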