^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Changes since 2.5.0:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) sb_set_blocksize() and sb_min_blocksize().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) Use them.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) (sb_find_get_block() replaces 2.4's get_hash_table())
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) New methods: ->alloc_inode() and ->destroy_inode().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) Remove inode->u.foo_inode_i
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) Declare::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) struct foo_inode_info {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) /* fs-private stuff */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) struct inode vfs_inode;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) };
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) static inline struct foo_inode_info *FOO_I(struct inode *inode)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) return container_of(inode, struct foo_inode_info, vfs_inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) Use FOO_I(inode) instead of &inode->u.foo_inode_i.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) Add foo_alloc_inode() and foo_destroy_inode() - the former should allocate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) foo_inode_info and return the address of ->vfs_inode, the latter should free
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) FOO_I(inode) (see in-tree filesystems for examples).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) Make them ->alloc_inode and ->destroy_inode in your super_operations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) Keep in mind that now you need explicit initialization of private data,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) typically between calling iget_locked() and unlocking the inode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) At some point that will become mandatory.
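
The embedding pattern described above can be tried out in userspace. What follows is only a sketch under stated assumptions: ``struct inode`` is a reduced stand-in, ``foo_flags`` is a hypothetical private field, and calloc()/free() take the place of whatever allocator the filesystem really uses (in-tree filesystems typically use a slab cache, and the real ->alloc_inode() takes a ``struct super_block *``). container_of() is the macro that list_entry() is built on.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's struct inode (hypothetical, reduced). */
struct inode {
	unsigned long i_ino;
};

/* Same idea as the kernel's container_of(): member pointer -> enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct foo_inode_info {
	int foo_flags;			/* fs-private stuff (hypothetical field) */
	struct inode vfs_inode;		/* embedded VFS inode */
};

static inline struct foo_inode_info *FOO_I(struct inode *inode)
{
	return container_of(inode, struct foo_inode_info, vfs_inode);
}

/* Allocate the whole container, hand back the address of ->vfs_inode. */
static struct inode *foo_alloc_inode(void)
{
	struct foo_inode_info *fi = calloc(1, sizeof(*fi));

	return fi ? &fi->vfs_inode : NULL;
}

/* Free the container, not the embedded inode. */
static void foo_destroy_inode(struct inode *inode)
{
	free(FOO_I(inode));
}
```

The point of the sketch is that the VFS only ever sees ``&fi->vfs_inode``, while the filesystem recovers its private data with FOO_I() and never needs a pointer stored inside the inode.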
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) Change of file_system_type method (->read_super to ->get_sb)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) ->read_super() is no more. Ditto for DECLARE_FSTYPE and DECLARE_FSTYPE_DEV.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) Turn your foo_read_super() into a function that would return 0 in case of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) success and negative number in case of error (-EINVAL unless you have more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) informative error value to report). Call it foo_fill_super(). Now declare::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) int foo_get_sb(struct file_system_type *fs_type,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) int flags, const char *dev_name, void *data, struct vfsmount *mnt)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) return get_sb_bdev(fs_type, flags, dev_name, data, foo_fill_super,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) mnt);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) (or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) filesystem).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) Replace DECLARE_FSTYPE... with explicit initializer and have ->get_sb set as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) foo_get_sb.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) Locking change: ->s_vfs_rename_sem is taken only by cross-directory renames.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) Most likely there is no need to change anything, but if you relied on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) global exclusion between renames for some internal purpose, you need to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) change your internal locking. Otherwise exclusion guarantees remain the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) same (i.e. parents and victim are locked, etc.).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) **informational**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) Now we have exclusion between ->lookup() and directory removal (by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) ->rmdir() and ->rename()). If you used to need that exclusion and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) provided it by internal locking (most filesystems couldn't care less),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) you can relax your locking.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) ->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) ->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) and ->readdir() are called without BKL now. Grab it on entry, drop upon return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) - that will guarantee the same locking you used to have. If your method or its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) parts do not need BKL - better yet, now you can shift lock_kernel() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) unlock_kernel() so that they would protect exactly what needs to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) protected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) BKL is also moved from around sb operations. BKL should have been shifted into
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) individual fs sb_op functions. If you don't need it, remove it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) **informational**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) The check for ->link() target not being a directory is done by callers. Feel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) free to drop it...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) **informational**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) ->link() callers hold ->i_mutex on the object we are linking to. Some of your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) problems might be over...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) New file_system_type method - kill_sb(superblock). If you are converting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) an existing filesystem, set it according to ->fs_flags::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) FS_REQUIRES_DEV - kill_block_super
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) FS_LITTER - kill_litter_super
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) neither - kill_anon_super
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) FS_LITTER is gone - just remove it from fs_flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) FS_SINGLE is gone (actually, that had happened back when ->get_sb()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) went in - and hadn't been documented ;-/). Just remove it from fs_flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) (and see ->get_sb() entry for other actions).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) ->setattr() is called without BKL now. Caller _always_ holds ->i_mutex, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) watch for ->i_mutex-grabbing code that might be used by your ->setattr().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) Callers of notify_change() need ->i_mutex now.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) New super_block field ``struct export_operations *s_export_op`` for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) explicit support for exporting, e.g. via NFS. The structure is fully
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) documented at its declaration in include/linux/fs.h, and in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) Documentation/filesystems/nfs/exporting.rst.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) Briefly it allows for the definition of decode_fh and encode_fh operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) to encode and decode filehandles, and allows the filesystem to use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) a standard helper function for decode_fh, and provide file-system specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) support for this helper, particularly get_parent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) It is planned that this will be required for exporting once the code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) settles down a bit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) s_export_op is now required for exporting a filesystem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) isofs, ext2, ext3, reiserfs, fat
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) can be used as examples of very different filesystems.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) iget4() and the read_inode2 callback have been superseded by iget5_locked()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) which has the following prototype::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) struct inode *iget5_locked(struct super_block *sb, unsigned long ino,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) int (*test)(struct inode *, void *),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) int (*set)(struct inode *, void *),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) void *data);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) 'test' is an additional function that can be used when the inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) number is not sufficient to identify the actual file object. 'set'
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) should be a non-blocking function that initializes those parts of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) newly created inode to allow the test function to succeed. 'data' is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) passed as an opaque value to both test and set functions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) When the inode has been created by iget5_locked(), it will be returned with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) I_NEW flag set and will still be locked. The filesystem then needs to finalize
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) the initialization. Once the inode is initialized it must be unlocked by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) calling unlock_new_inode().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) The filesystem is responsible for setting (and possibly testing) i_ino
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) when appropriate. There is also a simpler iget_locked function that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) just takes the superblock and inode number as arguments and does the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) test and set for you.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) e.g.::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) inode = iget_locked(sb, ino);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) if (inode->i_state & I_NEW) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) err = read_inode_from_disk(inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) if (err < 0) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) iget_failed(inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) return err;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) unlock_new_inode(inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) Note that if the process of setting up a new inode fails, then iget_failed()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) should be called on the inode to render it dead, and an appropriate error
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) should be passed back to the caller.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) ->getattr() is finally getting used. See instances in nfs, minix, etc.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) ->revalidate() is gone. If your filesystem had it - provide ->getattr()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) and let it call whatever you had as ->revalidate() + (for symlinks that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) had ->revalidate()) add calls in ->follow_link()/->readlink().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) ->d_parent changes are not protected by BKL anymore. Read access is safe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) if at least one of the following is true:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) * filesystem has no cross-directory rename()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) * we know that parent had been locked (e.g. we are looking at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) ->d_parent of ->lookup() argument).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) * we are called from ->rename().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) * the child's ->d_lock is held
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) Audit your code and add locking if needed. Notice that any place that is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) not protected by the conditions above is risky even in the old tree - you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) had been relying on BKL and that's prone to screwups. Old tree had quite
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) a few holes of that kind - unprotected access to ->d_parent leading to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) anything from oops to silent memory corruption.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) FS_NOMOUNT is gone. If you use it - just set SB_NOUSER in flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) (see rootfs for one kind of solution and bdev/socket/pipe for another).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) Use bdev_read_only(bdev) instead of is_read_only(kdev). The latter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) is still alive, but only because of the mess in drivers/s390/block/dasd.c.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) As soon as it gets fixed, is_read_only() will die.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) ->permission() is called without BKL now. Grab it on entry, drop upon
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) return - that will guarantee the same locking you used to have. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) your method or its parts do not need BKL - better yet, now you can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) shift lock_kernel() and unlock_kernel() so that they would protect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) exactly what needs to be protected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) ->statfs() is now called without BKL held. BKL should have been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) shifted into individual fs sb_op functions where it's not clear that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) it's safe to remove it. If you don't need it, remove it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) is_read_only() is gone; use bdev_read_only() instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) destroy_buffers() is gone; use invalidate_bdev().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) fsync_dev() is gone; use fsync_bdev(). NOTE: lvm breakage is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) deliberate; as soon as struct block_device * is propagated in a reasonable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) way by that code fixing will become trivial; until then nothing can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) done.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) Block truncation on error exit from ->write_begin, and ->direct_IO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) moved from generic methods (block_write_begin, cont_write_begin,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) nobh_write_begin, blockdev_direct_IO*) to callers. Take a look at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) ext2_write_failed and callers for an example.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) ->truncate is gone. The whole truncate sequence needs to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) implemented in ->setattr, which is now mandatory for filesystems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) implementing on-disk size changes. Start with a copy of the old inode_setattr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) and vmtruncate, and then reorder the vmtruncate + foofs_vmtruncate sequence to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) be in order of zeroing blocks using block_truncate_page or similar helpers,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) size update and, finally, on-disk truncation, which should not fail.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) setattr_prepare (which used to be inode_change_ok) now includes the size checks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) for ATTR_SIZE and must be called in the beginning of ->setattr unconditionally.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) ->clear_inode() and ->delete_inode() are gone; ->evict_inode() should
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) be used instead. It gets called whenever the inode is evicted, whether it has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) remaining links or not. Caller does *not* evict the pagecache or inode-associated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) metadata buffers; the method has to use truncate_inode_pages_final() to get rid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) of those. Caller makes sure async writeback cannot be running for the inode while
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) (or after) ->evict_inode() is called.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) ->drop_inode() returns int now; it's called on final iput() with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) inode->i_lock held and it returns true if the filesystem wants the inode to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) dropped. As before, generic_drop_inode() is still the default and it's been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) updated appropriately. generic_delete_inode() is also alive and it consists
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) simply of return 1. Note that all actual eviction work is done by caller after
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) ->drop_inode() returns.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) As before, clear_inode() must be called exactly once on each call of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) ->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) before, if you are using inode-associated metadata buffers (i.e.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) mark_buffer_dirty_inode()), it's your responsibility to call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) invalidate_inode_buffers() before clear_inode().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) free the on-disk inode, you may end up doing that while ->write_inode() is writing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) to it.
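
As a rough userspace illustration of the ->drop_inode() return convention, the two generic helpers can be modelled as below. This is a sketch, not the kernel source: ``struct inode`` is a reduced stand-in, the ``unhashed`` field replaces the kernel's inode_unhashed() test, and the body of the first helper reflects our understanding of the default policy (drop when the inode is unlinked or unhashed) rather than anything stated above. The ``sketch_`` names are hypothetical.

```c
#include <stdbool.h>

/* Reduced userspace stand-in for struct inode (hypothetical). */
struct inode {
	unsigned int i_nlink;	/* remaining hard links */
	bool unhashed;		/* stand-in for inode_unhashed(inode) */
};

/* Assumed default policy: drop when unlinked or no longer hashed. */
static int sketch_generic_drop_inode(struct inode *inode)
{
	return !inode->i_nlink || inode->unhashed;
}

/* As the text says, this one consists simply of "return 1". */
static int sketch_generic_delete_inode(struct inode *inode)
{
	(void)inode;
	return 1;
}
```

Neither helper evicts anything itself; each only answers "drop or keep?", and the caller does the actual eviction work after ->drop_inode() returns.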
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) .d_delete() now only advises the dcache as to whether or not to cache
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) unreferenced dentries, and is now only called when the dentry refcount goes to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 0. Even on 0 refcount transition, it must be able to tolerate being called 0,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) 1, or more times (e.g. constant, idempotent).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) .d_compare() calling convention and locking rules are significantly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) look at examples of other filesystems) for guidance.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) .d_hash() calling convention and locking rules are significantly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376) changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) look at examples of other filesystems) for guidance.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) dcache_lock is gone, replaced by fine-grained locks. See fs/dcache.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) for details of what locks to replace dcache_lock with in order to protect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) particular things. Most of the time, a filesystem only needs ->d_lock, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386) protects *all* the dcache state of a given dentry.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) Filesystems must RCU-free their inodes, if they can have been accessed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) via rcu-walk path walk (basically, if the file can have had a path name in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) vfs namespace).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) Even though i_dentry and i_rcu share storage in a union, we will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) initialize the former in inode_init_always(), so just leave it alone in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) the callback. It used to be necessary to clean it there, but not anymore
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) (starting at 3.2).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) vfs now tries to do path walking in "rcu-walk mode", which avoids
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) atomic operations and scalability hazards on dentries and inodes (see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408) (above) are examples of the changes required to support this. For more complex
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) no changes are required to the filesystem. However, this is costly and loses
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) the benefits of rcu-walk mode. We will begin to add filesystem callbacks that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) are rcu-walk aware, shown below. Filesystems should take advantage of this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) where possible.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) d_revalidate is a callback that is made on every path element (if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) the filesystem provides it), which requires dropping out of rcu-walk mode. This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422) returned if the filesystem cannot handle rcu-walk. See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) Documentation/filesystems/vfs.rst for more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425) permission is an inode permission check that is called on many or all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) directory inodes on the way down a path walk (to check for exec permission). It
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427) must now be rcu-walk aware (mask & MAY_NOT_BLOCK). See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) Documentation/filesystems/vfs.rst for more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) In ->fallocate() you must check the mode option passed in. If your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435) filesystem does not support hole punching (deallocating space in the middle of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) so the i_size should not change when hole punching, even when punching the end of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) a file off.
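
A minimal userspace sketch of the mode check described above, for a
filesystem that does not support hole punching. The flag values mirror
FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE; the DEMO_ prefix and the helper
name foofs_check_fallocate_mode() are hypothetical and exist only for
illustration. A real ->fallocate() would make the same decision on its mode
argument before doing any work.

```c
#include <errno.h>

/* Same numeric values as FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE;
 * redefined under a DEMO_ prefix so the sketch builds outside the kernel. */
#define DEMO_FALLOC_FL_KEEP_SIZE  0x01
#define DEMO_FALLOC_FL_PUNCH_HOLE 0x02

/* Hypothetical helper: 0 if the mode is acceptable, negative errno if not. */
static int foofs_check_fallocate_mode(int mode)
{
	/* No hole punching support here, so refuse the request explicitly. */
	if (mode & DEMO_FALLOC_FL_PUNCH_HOLE)
		return -EOPNOTSUPP;
	/* Assumption of this sketch: any other unknown bit is refused too. */
	if (mode & ~DEMO_FALLOC_FL_KEEP_SIZE)
		return -EOPNOTSUPP;
	return 0;
}
```

The point of the check is that silently ignoring FALLOC_FL_PUNCH_HOLE would
make the caller believe space had been deallocated when it had not.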
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) ->get_sb() is gone. Switch to use of ->mount(). Typically it's just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) the function type. If you were doing it manually, just switch from setting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448) ->mnt_root to some pointer to returning that pointer. On errors return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) ERR_PTR(...).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) ->permission() and generic_permission() have lost flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) generic_permission() has also lost the check_acl argument; ACL checking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459) has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460) to read an ACL from disk.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466) If you implement your own ->llseek() you must handle SEEK_HOLE and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467) SEEK_DATA. You can handle this by returning -EINVAL, but it would be nicer to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468) support it in some way. The generic handler assumes that the entire file is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) data and there is a virtual hole at the end of the file. So if the provided
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470) offset is less than i_size and SEEK_DATA is specified, return the same offset.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471) If the above is true for the offset and you are given SEEK_HOLE, return the end
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) of the file. If the offset is i_size or greater return -ENXIO in either case.
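
The "entire file is data, virtual hole at the end" behaviour described above
can be modelled in a few lines of userspace C. This is only a sketch of the
rule as stated in the text, not the in-kernel implementation: the function
name demo_seek_hole_data() and the DEMO_ constants are hypothetical (the
values match SEEK_DATA and SEEK_HOLE on Linux), and treating a negative
offset as out of range is an assumption of the sketch.

```c
#include <errno.h>

/* Same numeric values as SEEK_DATA and SEEK_HOLE on Linux. */
#define DEMO_SEEK_DATA 3
#define DEMO_SEEK_HOLE 4

/* Hypothetical model: returns the new offset or a negative errno. */
static long long demo_seek_hole_data(long long offset, long long i_size,
				     int whence)
{
	/* Offset at or past i_size: -ENXIO for both SEEK_DATA and SEEK_HOLE.
	 * Negative offsets are rejected the same way (sketch assumption). */
	if (offset < 0 || offset >= i_size)
		return -ENXIO;

	switch (whence) {
	case DEMO_SEEK_DATA:
		return offset;	/* everything below i_size is data */
	case DEMO_SEEK_HOLE:
		return i_size;	/* the only hole is the virtual one at EOF */
	default:
		return -EINVAL;	/* not a hole/data request */
	}
}
```

For a 100-byte file, offset 10 yields 10 for the data case and 100 for the
hole case, while offset 100 yields -ENXIO in both.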
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476) If you have your own ->fsync() you must make sure to call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477) filemap_write_and_wait_range() so that all dirty pages are synced out properly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) You must also keep in mind that ->fsync() is not called with i_mutex held
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) anymore, so if you require i_mutex locking you must make sure to take it and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) release it yourself.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) d_alloc_root() is gone, along with a lot of bugs caused by code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) misusing it. Replacement: d_make_root(inode). On success d_make_root(inode)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) allocates and returns a new dentry instantiated with the passed in inode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) On failure NULL is returned and the passed in inode is dropped so the reference
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490) to inode is consumed in all cases and failure handling need not do any cleanup
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491) for the inode. If d_make_root(inode) is passed a NULL inode it returns NULL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) and also requires no further error handling. Typical usage is::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494) inode = foofs_new_inode(....);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) s->s_root = d_make_root(inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) if (!s->s_root)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) /* Nothing needed for the inode cleanup */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498) return -ENOMEM;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499) ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) ->lookup() do *not* take struct nameidata anymore; just the flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512) ->create() doesn't take ``struct nameidata *``; unlike the previous
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) local filesystems can ignore that argument - they are guaranteed that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) object doesn't exist. It's remote/distributed ones that might care...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522) in your dentry operations instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) vfs_readdir() is gone; switch to iterate_dir() instead
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) ->readdir() is gone now; switch to ->iterate()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538) vfs_follow_link has been removed. Filesystems must use nd_set_link
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539) from ->follow_link for normal symlinks, or nd_jump_link for magic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) /proc/<pid> style links.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546) iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547) called with both ->i_lock and inode_hash_lock held; the former is *not*
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548) taken anymore, so verify that your callbacks do not rely on it (none
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549) of the in-tree instances did). inode_hash_lock is still held,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550) of course, so they are still serialized wrt removal from inode hash,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551) as well as wrt set() callback of iget5_locked().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557) d_materialise_unique() is gone; d_splice_alias() does everything you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) need now. Remember that they have opposite orders of arguments ;-/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564) f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 565) it entirely.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 566)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 567) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 568)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 569) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 570)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 571) never call ->read() and ->write() directly; use __vfs_{read,write} or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 572) wrappers; instead of checking for ->write or ->read being NULL, look for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 573) FMODE_CAN_{WRITE,READ} in file->f_mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 574)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 575) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 576)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 577) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 578)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 579) do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 580) instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 581)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 582) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 583)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 584) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 585) ->aio_read/->aio_write are gone. Use ->read_iter/->write_iter.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 586)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 587) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 588)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 589) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 590)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 591) for embedded ("fast") symlinks just set inode->i_link to wherever the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 592) symlink body is and use simple_follow_link() as ->follow_link().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 593)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 594) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 595)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 596) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 597)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 598) calling conventions for ->follow_link() have changed. Instead of returning
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 599) cookie and using nd_set_link() to store the body to traverse, we return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 600) the body to traverse and store the cookie using explicit void ** argument.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 601) nameidata isn't passed at all - nd_jump_link() doesn't need it and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 602) nd_[gs]et_link() is gone.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 603)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 604) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 605)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 606) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 607)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 608) calling conventions for ->put_link() have changed. It gets inode instead of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 609) dentry, it does not get nameidata at all and it gets called only when cookie
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 610) is non-NULL. Note that link body isn't available anymore, so if you need it,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 611) store it as cookie.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 612)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 613) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 614)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 615) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 616)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 617) any symlink that might use page_follow_link_light/page_put_link() must
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 618) have inode_nohighmem(inode) called before anything might start playing with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 619) its pagecache. No highmem pages should end up in the pagecache of such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 620) symlinks. That includes any preseeding that might be done during symlink
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 621) creation. __page_symlink() will honour the mapping gfp flags, so once
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 622) you've done inode_nohighmem() it's safe to use, but if you allocate and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 623) insert the page manually, make sure to use the right gfp flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 624)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 625) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 626)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 627) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 628)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 629) ->follow_link() is replaced with ->get_link(); same API, except that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 630)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 631) * ->get_link() gets inode as a separate argument
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 632) * ->get_link() may be called in RCU mode - in that case NULL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 633) dentry is passed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 634)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 635) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 636)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 637) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 638)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 639) ->get_link() gets struct delayed_call ``*done`` now, and should do
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 640) set_delayed_call() where it used to set ``*cookie``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 641)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 642) ->put_link() is gone - just give the destructor to set_delayed_call()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 643) in ->get_link().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 644)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 645) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 646)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 647) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 648)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 649) ->getxattr() and xattr_handler.get() get dentry and inode passed separately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 650) dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 651) in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 652) called before we attach dentry to inode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 653)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 654) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 655)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 656) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 657)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 658) symlinks are no longer the only inodes that do *not* have i_bdev/i_cdev/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 659) i_pipe/i_link union zeroed out at inode eviction. As a result, you can't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 660) assume that non-NULL value in ->i_link at ->destroy_inode() implies that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 661) it's a symlink. Checking ->i_mode is really needed now. In-tree we had
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 662) to fix shmem_destroy_callback() that used to take that kind of shortcut;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 663) watch out, since that shortcut is no longer valid.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 664)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 665) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 666)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 667) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 668)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 669) ->i_mutex is replaced with ->i_rwsem now. inode_lock() et al. work as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 670) they used to - they just take it exclusive. However, ->lookup() may be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 671) called with parent locked shared. Its instances must not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 672)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 673) * use d_instantiate() and d_rehash() separately - use d_add() or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 674) d_splice_alias() instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 675) * use d_rehash() alone - call d_add(new_dentry, NULL) instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 676) * in the unlikely case when (read-only) access to filesystem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 677) data structures needs exclusion for some reason, arrange it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 678) yourself. None of the in-tree filesystems needed that.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 679) * rely on ->d_parent and ->d_name not changing after dentry has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 680) been fed to d_add() or d_splice_alias(). Again, none of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 681) in-tree instances relied upon that.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 682)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 683) We are guaranteed that lookups of the same name in the same directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 684) will not happen in parallel ("same" in the sense of your ->d_compare()).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 685) Lookups on different names in the same directory can and do happen in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 686) parallel now.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 687)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 688) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 689)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 690) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 691)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 692) ->iterate_shared() is added; it's a parallel variant of ->iterate().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 693) Exclusion on struct file level is still provided (as well as that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 694) between it and lseek on the same struct file), but if your directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 695) has been opened several times, you can get these called in parallel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 696) Exclusion between that method and all directory-modifying ones is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 697) still provided, of course.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 698)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 699) Often enough ->iterate() can serve as ->iterate_shared() without any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 700) changes - it is a read-only operation, after all. If you have any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 701) per-inode or per-dentry in-core data structures modified by ->iterate(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 702) you might need something to serialize the access to them. If you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 703) do dcache pre-seeding, you'll need to switch to d_alloc_parallel() for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 704) that; look for in-tree examples.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 705)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 706) Old method is only used if the new one is absent; eventually it will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 707) be removed. Switch while you still can; the old one won't stay.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 708)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 709) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 710)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 711) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 712)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 713) ->atomic_open() calls without O_CREAT may happen in parallel.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 714)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 715) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 716)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 717) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 718)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 719) ->setxattr() and xattr_handler.set() get dentry and inode passed separately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 720) dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 721) in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 722) called before we attach dentry to inode and !@#!@##!@$!$#!@#$!@$!@$ smack
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 723) ->d_instantiate() uses not just ->getxattr() but ->setxattr() as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 724)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 725) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 726)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 727) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 728)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 729) ->d_compare() doesn't get parent as a separate argument anymore. If you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 730) used it for finding the struct super_block involved, dentry->d_sb will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 731) work just as well; if it's something more complicated, use dentry->d_parent.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 732) Just be careful not to assume that fetching it more than once will yield
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 733) the same value - in RCU mode it could change under you.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 734)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 735) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 736)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 737) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 738)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 739) ->rename() has an added flags argument. Any flags not handled by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 740) filesystem should result in -EINVAL being returned.
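
As a rough illustration of the flags rule, here is a userspace sketch of the
check a ->rename() instance might start with, assuming a filesystem that
handles RENAME_NOREPLACE and nothing else. The DEMO_ constants carry the same
values as the RENAME_* flags; the helper name foofs_check_rename_flags() is
hypothetical.

```c
#include <errno.h>

/* Same numeric values as RENAME_NOREPLACE, RENAME_EXCHANGE, RENAME_WHITEOUT. */
#define DEMO_RENAME_NOREPLACE (1u << 0)
#define DEMO_RENAME_EXCHANGE  (1u << 1)
#define DEMO_RENAME_WHITEOUT  (1u << 2)

/* Hypothetical helper: 0 if every flag is handled, -EINVAL otherwise. */
static int foofs_check_rename_flags(unsigned int flags)
{
	/* Only RENAME_NOREPLACE is handled in this sketch; reject the rest. */
	if (flags & ~DEMO_RENAME_NOREPLACE)
		return -EINVAL;
	return 0;
}
```

Masking out the handled bits and rejecting whatever remains also covers
flags that may be added later, so the check does not need updating when new
ones appear.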
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 741)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 742) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 743)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 744)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 745) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 746)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 747) ->readlink is optional for symlinks. Don't set, unless filesystem needs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 748) to fake something for readlink(2).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 749)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 750) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 751)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 752) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 753)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 754) ->getattr() is now passed a struct path rather than a vfsmount and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 755) dentry separately, and it now has request_mask and query_flags arguments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 756) to specify the fields and sync type requested by statx. Filesystems not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 757) supporting any statx-specific features may ignore the new arguments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 758)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 759) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 760)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 761) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 762)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 763) ->atomic_open() calling conventions have changed. Gone is ``int *opened``,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 764) along with FILE_OPENED/FILE_CREATED. In place of those we have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 765) FMODE_OPENED/FMODE_CREATED, set in file->f_mode. Additionally, return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 766) value for 'called finish_no_open(), open it yourself' case has become
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 767) 0, not 1. Since finish_no_open() itself is returning 0 now, that part
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 768) does not need any changes in ->atomic_open() instances.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 769)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 770) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 771)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 772) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 773)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 774) alloc_file() has become static now; two wrappers are to be used instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 775) alloc_file_pseudo(inode, vfsmount, name, flags, ops) is for the cases
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 776) when dentry needs to be created; that's the majority of old alloc_file()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 777) users. Calling conventions: on success a reference to new struct file
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 778) is returned and the caller's reference to inode is subsumed by that. On
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 779) failure, ERR_PTR() is returned and no caller's references are affected,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 780) so the caller needs to drop the inode reference it held.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 781) alloc_file_clone(file, flags, ops) does not affect any caller's references.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 782) On success you get a new struct file sharing the mount/dentry with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 783) original, on failure - ERR_PTR().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 784)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 785) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 786)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 787) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 788)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 789) ->clone_file_range() and ->dedupe_file_range() have been replaced with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 790) ->remap_file_range(). See Documentation/filesystems/vfs.rst for more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 791) information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 792)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 793) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 794)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 795) **recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 796)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 797) ->lookup() instances doing an equivalent of::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 798)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 799) if (IS_ERR(inode))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 800) return ERR_CAST(inode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 801) return d_splice_alias(inode, dentry);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 802)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 803) don't need to bother with the check - d_splice_alias() will do the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 804) right thing when given ERR_PTR(...) as inode. Moreover, passing NULL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 805) inode to d_splice_alias() will also do the right thing (equivalent of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 806) d_add(dentry, NULL); return NULL;), so that kind of special case
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 807) also doesn't need separate treatment.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 808)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 809) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 810)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 811) **strongly recommended**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 812)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 813) take the RCU-delayed parts of ->destroy_inode() into a new method -
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 814) ->free_inode(). If ->destroy_inode() becomes empty - all the better,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 815) just get rid of it. Synchronous work (e.g. the stuff that can't
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 816) be done from an RCU callback, or any WARN_ON() where we want the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 817) stack trace) *might* be movable to ->evict_inode(); however,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 818) that goes only for the things that are not needed to balance something
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 819) done by ->alloc_inode(). IOW, if it's cleaning up the stuff that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 820) might have accumulated over the life of in-core inode, ->evict_inode()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 821) might be a fit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 822)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 823) Rules for inode destruction:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 824)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 825) * if ->destroy_inode() is non-NULL, it gets called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 826) * if ->free_inode() is non-NULL, it gets scheduled by call_rcu()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 827) * combination of NULL ->destroy_inode() and NULL ->free_inode() is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 828) treated as NULL/free_inode_nonrcu, to preserve compatibility.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 829)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 830) Note that the callback (be it via ->free_inode() or explicit call_rcu()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 831) in ->destroy_inode()) is *NOT* ordered wrt superblock destruction;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 832) as a matter of fact, the superblock and all associated structures
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 833) might be already gone. The filesystem driver is guaranteed to be still
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 834) there, but that's it. Freeing memory in the callback is fine; doing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 835) more than that is possible, but requires a lot of care and is best
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 836) avoided.
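
The destruction rules above can be illustrated with a small userspace model.
This is only a sketch, not kernel code: every ``model_*`` and ``foo_*`` name
is hypothetical, the pending list stands in for call_rcu(), and
model_grace_period() stands in for the end of an RCU grace period. It assumes
that a filesystem providing only ->destroy_inode() does its own freeing there,
which is what the compatibility rule implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct inode;

/* Model of the two destruction hooks; the real struct has many more. */
struct super_operations {
	void (*destroy_inode)(struct inode *);
	void (*free_inode)(struct inode *);
};

struct inode {
	const struct super_operations *ops;
	void (*free_inode)(struct inode *);	/* latched for the callback */
	struct inode *next_cb;			/* stand-in for an rcu_head */
};

/* Stand-in for call_rcu(): queue now, run after the "grace period". */
static struct inode *pending;

static void model_call_rcu(struct inode *inode)
{
	inode->next_cb = pending;
	pending = inode;
}

/* Default free, used when the filesystem has no ->free_inode(). */
static int nonrcu_frees;

static void model_free_inode_nonrcu(struct inode *inode)
{
	nonrcu_frees++;
	free(inode);
}

/* Run every queued callback; returns how many were run. */
static int model_grace_period(void)
{
	int n = 0;

	while (pending) {
		struct inode *inode = pending;

		pending = inode->next_cb;
		if (inode->free_inode)
			inode->free_inode(inode);
		else
			model_free_inode_nonrcu(inode);
		n++;
	}
	return n;
}

static void model_destroy_inode(struct inode *inode)
{
	const struct super_operations *ops = inode->ops;

	if (ops->destroy_inode) {
		ops->destroy_inode(inode);	/* synchronous part */
		if (!ops->free_inode)
			return;	/* legacy: ->destroy_inode() freed it */
	}
	inode->free_inode = ops->free_inode;	/* may be NULL */
	model_call_rcu(inode);			/* delayed part */
}

/* Hypothetical filesystem hooks with counters, for observation only. */
static int foo_sync_calls, foo_rcu_frees, foo_legacy_frees;

static void foo_destroy_inode(struct inode *inode)
{
	(void)inode;
	foo_sync_calls++;
}

static void foo_free_inode(struct inode *inode)
{
	foo_rcu_frees++;
	free(inode);
}

static void foo_legacy_destroy_inode(struct inode *inode)
{
	foo_legacy_frees++;
	free(inode);
}

static struct inode *model_new_inode(const struct super_operations *ops)
{
	struct inode *inode = calloc(1, sizeof(*inode));

	assert(inode);
	inode->ops = ops;
	return inode;
}
```

With both hooks set, model_destroy_inode() runs foo_destroy_inode() at once,
while foo_free_inode() runs only when model_grace_period() is called. With
neither hook set, the default free still waits for the grace period. In the
model, as in the note above, the delayed callback touches nothing but the
inode's own memory.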
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 837)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 838) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 839)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 840) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 841)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 842) DCACHE_RCUACCESS is gone; having an RCU delay on dentry freeing is the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 843) default. DCACHE_NORCU opts out, and only d_alloc_pseudo() has any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 844) business doing so.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 845)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 846) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 847)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 848) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 849)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 850) d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 851) very suspect (and won't work in modules). Such uses are very likely to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 852) be misspelled d_alloc_anon().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 853)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 854) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 855)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 856) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 857)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 858) [should've been added in 2016] stale comment in finish_open() notwithstanding,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 859) failure exits in ->atomic_open() instances should *NOT* fput() the file,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 860) no matter what. Everything is handled by the caller.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 861)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 862) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 863)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 864) **mandatory**
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 865)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 866) clone_private_mount() returns a longterm mount now, so the proper destructor of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 867) its result is kern_unmount() or kern_unmount_array().