^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) =======
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Locking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) =======
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) The text below describes the locking rules for VFS-related methods.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) It is (believed to be) up-to-date. *Please*, if you change anything in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) prototypes or locking protocols - update this file. And update the relevant
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) instances in the tree, don't leave that to maintainers of filesystems/devices/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) etc. At the very least, put the list of dubious cases at the end of this file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) Don't turn it into a log - maintainers of out-of-the-tree code are supposed to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) be able to use diff(1).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) Thing currently missing here: socket operations. Alexey?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) dentry_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) =================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) int (*d_revalidate)(struct dentry *, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) int (*d_weak_revalidate)(struct dentry *, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) int (*d_hash)(const struct dentry *, struct qstr *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) int (*d_compare)(const struct dentry *,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) unsigned int, const char *, const struct qstr *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) int (*d_delete)(struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) int (*d_init)(struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) void (*d_release)(struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) void (*d_iput)(struct dentry *, struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) char *(*d_dname)(struct dentry *dentry, char *buffer, int buflen);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) struct vfsmount *(*d_automount)(struct path *path);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) int (*d_manage)(const struct path *, bool);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) struct dentry *(*d_real)(struct dentry *, const struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) ================== =========== ======== ============== ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) ops                rename_lock ->d_lock may block      rcu-walk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) ================== =========== ======== ============== ========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) d_revalidate:      no          no       yes (ref-walk) maybe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) d_weak_revalidate: no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) d_hash:            no          no       no             maybe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) d_compare:         yes         no       no             maybe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) d_delete:          no          yes      no             no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) d_init:            no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) d_release:         no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) d_prune:           no          yes      no             no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) d_iput:            no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) d_dname:           no          no       no             no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) d_automount:       no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) d_manage:          no          no       yes (ref-walk) maybe
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) d_real:            no          no       yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) ================== =========== ======== ============== ========
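
The "may block" and "rcu-walk" columns interact: a method marked
"yes (ref-walk)" may sleep only when it is not called in rcu-walk mode.
The following is a minimal userspace sketch of that rule, not kernel code.
It assumes the usual convention that rcu-walk callers pass a LOOKUP_RCU
bit in the flags argument and that returning -ECHILD asks the VFS to retry
in ref-walk mode. The struct, the flag value and every ``model_`` name are
hypothetical stand-ins introduced only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's LOOKUP_RCU flag. */
#define MODEL_LOOKUP_RCU 0x0040u

/* Hypothetical, heavily reduced dentry: just enough state for the model. */
struct model_dentry {
	bool cached_valid;	/* validity is known without doing I/O */
	bool needs_io;		/* revalidation would have to block */
};

/*
 * Models a ->d_revalidate() instance.
 * Returns 1 if the dentry is valid, 0 if it is stale, and -ECHILD if the
 * answer cannot be produced without blocking while in rcu-walk mode.
 */
int model_d_revalidate(const struct model_dentry *dentry, unsigned int flags)
{
	assert(dentry != NULL);

	/* Fast path: usable in both rcu-walk and ref-walk. */
	if (!dentry->needs_io)
		return dentry->cached_valid ? 1 : 0;

	/* Blocking is forbidden in rcu-walk: ask for a ref-walk retry. */
	if (flags & MODEL_LOOKUP_RCU)
		return -ECHILD;

	/* ref-walk: blocking is allowed; pretend the slow check succeeded. */
	return 1;
}
```

A caller in this model would first try with MODEL_LOOKUP_RCU set and, on
-ECHILD, call again with the bit cleared.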
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) inode_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) int (*create) (struct inode *,struct dentry *,umode_t, bool);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) int (*link) (struct dentry *,struct inode *,struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) int (*unlink) (struct inode *,struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) int (*symlink) (struct inode *,struct dentry *,const char *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) int (*mkdir) (struct inode *,struct dentry *,umode_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) int (*rmdir) (struct inode *,struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) int (*rename) (struct inode *, struct dentry *,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) struct inode *, struct dentry *, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) int (*readlink) (struct dentry *, char __user *,int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) void (*truncate) (struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) int (*permission) (struct inode *, int, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) int (*get_acl)(struct inode *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) int (*setattr) (struct dentry *, struct iattr *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) int (*getattr) (const struct path *, struct kstat *, u32, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) ssize_t (*listxattr) (struct dentry *, char *, size_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start, u64 len);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) void (*update_time)(struct inode *, struct timespec *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) int (*atomic_open)(struct inode *, struct dentry *,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) struct file *, unsigned open_flag,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) umode_t create_mode);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) int (*tmpfile) (struct inode *, struct dentry *, umode_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) all may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) ============ =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) ops i_rwsem(inode)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) ============ =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) lookup: shared
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) create: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92) link: exclusive (both)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93) mknod: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) symlink: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) mkdir: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) unlink: exclusive (both)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) rmdir: exclusive (both)(see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) rename: exclusive (all) (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) readlink: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) get_link: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) setattr: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) permission: no (may not block if called in rcu-walk mode)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) get_acl: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) getattr: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) listxattr: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) fiemap: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) update_time: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) atomic_open: shared (exclusive if O_CREAT is set in open flags)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) tmpfile: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) ============ =============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_rwsem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) exclusive on the victim.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) Cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) See Documentation/filesystems/directory-locking.rst for more detailed discussion
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) of the locking scheme for directory operations.
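
As an illustration of the "exclusive (both)" entry for ->unlink(), here is
a small userspace model, not kernel code. It assumes the caller takes the
parent directory's lock first and the victim's lock second, and releases
them in reverse order. A boolean stands in for ->i_rwsem held exclusive,
and all ``model_`` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced inode: a flag models i_rwsem held exclusive. */
struct model_inode {
	bool rwsem_exclusive;
	unsigned int nlink;
};

void model_lock(struct model_inode *inode)
{
	assert(!inode->rwsem_exclusive);	/* no recursive locking */
	inode->rwsem_exclusive = true;
}

void model_unlock(struct model_inode *inode)
{
	assert(inode->rwsem_exclusive);
	inode->rwsem_exclusive = false;
}

/* Models a filesystem's ->unlink(): it may rely on both locks being held. */
int model_fs_unlink(struct model_inode *dir, struct model_inode *victim)
{
	assert(dir->rwsem_exclusive);
	assert(victim->rwsem_exclusive);

	if (victim->nlink == 0)
		return -1;	/* nothing left to unlink in this model */
	victim->nlink--;
	return 0;
}

/* Models the VFS-side caller that establishes the locking rules. */
int model_vfs_unlink(struct model_inode *dir, struct model_inode *victim)
{
	int err;

	model_lock(dir);	/* parent directory: exclusive */
	model_lock(victim);	/* victim: exclusive */
	err = model_fs_unlink(dir, victim);
	model_unlock(victim);
	model_unlock(dir);
	return err;
}
```

The assertions inside model_fs_unlink() play the role of the table: the
method itself takes no locks and only checks what its caller guarantees.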
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) xattr_handler operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) bool (*list)(struct dentry *dentry);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) struct inode *inode, const char *name, void *buffer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) size_t size, int flags);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) int (*set)(const struct xattr_handler *handler, struct dentry *dentry,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) struct inode *inode, const char *name, const void *buffer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) size_t size, int flags);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) all may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) ===== ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) ops i_rwsem(inode)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) ===== ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) list: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) get: no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) set: exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) ===== ==============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) super_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) struct inode *(*alloc_inode)(struct super_block *sb);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) void (*free_inode)(struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) void (*destroy_inode)(struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) void (*dirty_inode) (struct inode *, int flags);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) int (*write_inode) (struct inode *, struct writeback_control *wbc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) int (*drop_inode) (struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) void (*evict_inode) (struct inode *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) void (*put_super) (struct super_block *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) int (*sync_fs)(struct super_block *sb, int wait);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) int (*freeze_fs) (struct super_block *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) int (*unfreeze_fs) (struct super_block *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) int (*statfs) (struct dentry *, struct kstatfs *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) int (*remount_fs) (struct super_block *, int *, char *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) void (*umount_begin) (struct super_block *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) int (*show_options)(struct seq_file *, struct dentry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) All may block [not true, see below]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) ====================== ============ ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) ops                    s_umount     note
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) ====================== ============ ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) alloc_inode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) free_inode:                         called from RCU callback
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) destroy_inode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) dirty_inode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) write_inode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) drop_inode:                         !!!inode->i_lock!!!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) evict_inode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) put_super:             write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) sync_fs:               read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) freeze_fs:             write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) unfreeze_fs:           write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) statfs:                maybe(read)  (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) remount_fs:            write
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) umount_begin:          no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) show_options:          no           (namespace_sem)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) quota_read:            no           (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) quota_write:           no           (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) bdev_try_to_free_page: no           (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) ====================== ============ ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) ->statfs() has s_umount (shared) when called by ustat(2) (native or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) compat), but that's an accident of bad API; s_umount is used to pin
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) the superblock down when we only have the dev_t given to us by userland to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) identify the superblock. Everything else (statfs(), fstatfs(), etc.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) doesn't hold it when calling ->statfs() - superblock is pinned down
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) by resolving the pathname passed to syscall.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) ->quota_read() and ->quota_write() functions are both guaranteed to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) be the only ones operating on the quota file by the quota code (via
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) dqio_sem) (unless an admin really wants to screw up something and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) writes to quota files with quotas on). For other details about locking
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) see also dquot_operations section.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) ->bdev_try_to_free_page is called from the ->releasepage handler of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) the block device inode. See there for more details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) file_system_type
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) struct dentry *(*mount) (struct file_system_type *, int,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) const char *, void *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) void (*kill_sb) (struct super_block *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) ======= =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) ops may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) ======= =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) mount yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) kill_sb yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) ======= =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) ->mount() returns ERR_PTR or the root dentry; its superblock should be locked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) on return.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) ->kill_sb() takes a write-locked superblock, does all shutdown work on it,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) unlocks and drops the reference.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) address_space_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) ========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) int (*writepage)(struct page *page, struct writeback_control *wbc);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) int (*readpage)(struct file *, struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) int (*writepages)(struct address_space *, struct writeback_control *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) int (*set_page_dirty)(struct page *page);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) void (*readahead)(struct readahead_control *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) int (*readpages)(struct file *filp, struct address_space *mapping,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) struct list_head *pages, unsigned nr_pages);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) int (*write_begin)(struct file *, struct address_space *mapping,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) loff_t pos, unsigned len, unsigned flags,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) struct page **pagep, void **fsdata);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) int (*write_end)(struct file *, struct address_space *mapping,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) loff_t pos, unsigned len, unsigned copied,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) struct page *page, void *fsdata);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) sector_t (*bmap)(struct address_space *, sector_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) void (*invalidatepage) (struct page *, unsigned int, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) int (*releasepage) (struct page *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) void (*freepage)(struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) bool (*isolate_page) (struct page *, isolate_mode_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) int (*migratepage)(struct address_space *, struct page *, struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) void (*putback_page) (struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) int (*launder_page)(struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) int (*error_remove_page)(struct address_space *, struct page *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) int (*swap_activate)(struct file *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) int (*swap_deactivate)(struct file *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) All except set_page_dirty and freepage may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) ====================== ======================== =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) ops                    PageLocked(page)         i_rwsem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) ====================== ======================== =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) writepage:             yes, unlocks (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) readpage:              yes, unlocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) writepages:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) set_page_dirty:        no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) readahead:             yes, unlocks
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) readpages:             no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) write_begin:           locks the page           exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) write_end:             yes, unlocks             exclusive
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) bmap:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) invalidatepage:        yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) releasepage:           yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) freepage:              yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) direct_IO:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) isolate_page:          yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) migratepage:           yes (both)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) putback_page:          yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) launder_page:          yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) is_partially_uptodate: yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) error_remove_page:     yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) swap_activate:         no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) swap_deactivate:       no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) ====================== ======================== =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) ->write_begin(), ->write_end() and ->readpage() may be called from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) the request handler (/dev/loop).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) ->readpage() unlocks the page, either synchronously or via I/O
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) completion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) ->readahead() unlocks the pages that I/O is attempted on like ->readpage().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) ->readpages() populates the pagecache with the passed pages and starts
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) I/O against them. They come unlocked upon I/O completion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) ->writepage() is used for two purposes: for "memory cleansing" and for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) "sync". These are quite different operations and the behaviour may differ
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) depending upon the mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) If writepage is called for sync (wbc->sync_mode != WBC_SYNC_NONE) then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) it *must* start I/O against the page, even if that would involve
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) blocking on in-progress I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) If writepage is called for memory cleansing (sync_mode ==
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) WBC_SYNC_NONE) then its role is to get as much writeout underway as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) possible. So writepage should try to avoid blocking against
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) currently-in-progress I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) If the filesystem is not called for "sync" and it determines that it
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) would need to block against in-progress I/O to be able to start new I/O
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) against the page, the filesystem should redirty the page with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) redirty_page_for_writepage(), then unlock the page and return zero.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) This may also be done to avoid internal deadlocks, but rarely.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) If the filesystem is called for sync then it must wait on any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) in-progress I/O and then start new I/O.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) The filesystem should unlock the page synchronously, before returning to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) caller, unless ->writepage() returns the special WRITEPAGE_ACTIVATE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) value. WRITEPAGE_ACTIVATE means that the page cannot really be written out
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) currently, and the VM should stop calling ->writepage() on this page for some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) time. The VM does this by moving the page to the head of the active list, hence
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) the name.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) and return zero, writepage *must* run set_page_writeback() against the page,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) followed by unlocking it. Once set_page_writeback() has been run against the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) page, write I/O can be submitted and the write I/O completion handler must run
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) end_page_writeback() once the I/O is complete. If no I/O is submitted, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) filesystem must run end_page_writeback() against the page before returning from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) writepage.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) That is: after 2.5.12, pages which are under writeout are *not* locked. Note
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) that if the filesystem needs the page to be locked during writeout, that is ok
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) too: the page is allowed to be unlocked at any point in time between the calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) to set_page_writeback() and end_page_writeback().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) Note, failure to run either redirty_page_for_writepage() or the combination of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) set_page_writeback()/end_page_writeback() on a page submitted to writepage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) will leave the page itself marked clean but it will be tagged as dirty in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) radix tree. This incoherency can lead to all sorts of hard-to-debug problems
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) in the filesystem like having dirty inodes at umount and losing written data.
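
The ->writepage() rules above can be condensed into a small state model.
This is a userspace sketch under stated assumptions, not a filesystem
implementation: booleans stand in for the page flags, "waiting" for
in-progress I/O is modelled by clearing a flag, and all ``model_`` names
are hypothetical. The comments name the kernel helpers each step is meant
to represent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two writeback modes described above. */
enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

/* Hypothetical reduced page: booleans model the relevant page state. */
struct model_page {
	bool locked;
	bool dirty;
	bool writeback;
	bool io_in_progress;	/* earlier I/O that new I/O would block on */
};

/* Models ->writepage(); the page must arrive locked. Returns zero. */
int model_writepage(struct model_page *page, enum model_sync_mode mode)
{
	assert(page->locked);

	if (page->io_in_progress) {
		if (mode == MODEL_SYNC_NONE) {
			/* Memory cleansing: do not block on old I/O. */
			page->dirty = true;	/* redirty_page_for_writepage() */
			page->locked = false;	/* unlock the page */
			return 0;
		}
		/* Sync: must wait for the in-progress I/O to finish. */
		page->io_in_progress = false;
	}

	page->writeback = true;		/* set_page_writeback() */
	page->locked = false;		/* unlock before returning */
	/* Write I/O would be submitted here. */
	return 0;
}

/* Models the write I/O completion handler. */
void model_write_end_io(struct model_page *page)
{
	assert(page->writeback);
	page->writeback = false;	/* end_page_writeback() */
}
```

In both branches the page leaves model_writepage() unlocked, and every
page that gets the writeback flag set is later passed to
model_write_end_io().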
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) ->writepages() is used for periodic writeback and for syscall-initiated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) sync operations. The address_space should start I/O against at least
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) ``*nr_to_write`` pages. ``*nr_to_write`` must be decremented for each page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) which is written. The address_space implementation may write more (or less)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) pages than ``*nr_to_write`` asks for, but it should try to be reasonably close.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) If nr_to_write is NULL, all dirty pages must be written.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) writepages should _only_ write pages which are present on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) mapping->io_pages.
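
The ``*nr_to_write`` accounting described above can be sketched as follows.
This is a userspace model only: an array of booleans stands in for the
dirty pages of one mapping, "writing" a page just clears its flag, and
model_writepages() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Models ->writepages() over npages pages.
 * Each written page decrements *nr_to_write; writing stops once the
 * budget is used up. A NULL nr_to_write means "write every dirty page".
 * Returns the number of pages written.
 */
size_t model_writepages(bool *dirty, size_t npages, long *nr_to_write)
{
	size_t written = 0;

	assert(dirty != NULL || npages == 0);

	for (size_t i = 0; i < npages; i++) {
		if (!dirty[i])
			continue;
		if (nr_to_write != NULL && *nr_to_write <= 0)
			break;		/* budget exhausted */
		dirty[i] = false;	/* stands in for starting I/O */
		written++;
		if (nr_to_write != NULL)
			(*nr_to_write)--;
	}
	return written;
}
```

This model stops exactly at the budget; as the text notes, a real
implementation may write somewhat more or fewer pages than requested.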
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) ->set_page_dirty() is called from various places in the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) when the target page is marked as needing writeback. It may be called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) under spinlock (it cannot block) and is sometimes called with the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) not locked.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) ->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) filesystems and by the swapper. The latter will eventually go away. Please,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) keep it that way and don't breed new callers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) ->invalidatepage() is called when the filesystem must attempt to drop
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) some or all of the buffers from the page when it is being truncated. As
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) the prototype above shows, it returns void. If ->invalidatepage is NULL,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) the kernel uses block_invalidatepage() instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) ->releasepage() is called when the kernel is about to try to drop the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378) buffers from the page in preparation for freeing it. It returns zero to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) indicate that the buffers are (or may be) freeable. If ->releasepage is zero,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) the kernel assumes that the fs has no private interest in the buffers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) ->freepage() is called when the kernel is done dropping the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383) from the page cache.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) ->launder_page() may be called prior to releasing a page if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386) it is still found to be dirty. It returns zero if the page was successfully
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) cleaned, or an error value if not. Note that in order to prevent the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) getting mapped back in and redirtied, it needs to be kept locked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) across the entire operation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) ->swap_activate() will be called with a non-zero argument on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392) files backing (non block device backed) swapfiles. A return value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) of zero indicates success, in which case this file can be used for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) backing swapspace. The swapspace operations will be proxied to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395) address space operations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) ->swap_deactivate() will be called in the sys_swapoff()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398) path after ->swap_activate() returned success.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) file_lock_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405) void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) void (*fl_release_private)(struct file_lock *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411) ===================  =============  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) ops                  inode->i_lock  may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) ===================  =============  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414) fl_copy_lock:        yes            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) fl_release_private:  maybe          maybe[1]_
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) ===================  =============  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) .. [1]:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419) ->fl_release_private for flock or POSIX locks is currently allowed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) to block. Leases however can still be freed while the i_lock is held and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) so fl_release_private called on a lease should not block.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) lock_manager_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) void (*lm_notify)(struct file_lock *); /* unblock callback */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429) int (*lm_grant)(struct file_lock *, struct file_lock *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) void (*lm_break)(struct file_lock *); /* break_lease callback */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431) int (*lm_change)(struct file_lock **, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) bool (*lm_breaker_owns_lease)(struct file_lock *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) ======================  =============  =================  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) ops                     inode->i_lock  blocked_lock_lock  may block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438) ======================  =============  =================  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) lm_notify:              yes            yes                no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) lm_grant:               no             no                 no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441) lm_break:               yes            no                 no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) lm_change:              yes            no                 no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) lm_breaker_owns_lease:  no             no                 no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444) ======================  =============  =================  =========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) buffer_head
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451) void (*b_end_io)(struct buffer_head *bh, int uptodate);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455) Called from interrupts. In other words, extreme care is needed here.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456) bh is locked, but that is the only guarantee we have here. Currently only RAID1,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457) highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these. Block devices
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458) call this method upon I/O completion.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460) block_device_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464) int (*open) (struct block_device *, fmode_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465) int (*release) (struct gendisk *, fmode_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466) int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467) int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468) int (*direct_access) (struct block_device *, sector_t, void **,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469) unsigned long *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470) void (*unlock_native_capacity) (struct gendisk *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471) int (*revalidate_disk) (struct gendisk *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) int (*getgeo)(struct block_device *, struct hd_geometry *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) void (*swap_slot_free_notify) (struct block_device *, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477) =======================  ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) ops                      bd_mutex
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479) =======================  ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) open:                    yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481) release:                 yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482) ioctl:                   no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483) compat_ioctl:            no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484) direct_access:           no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485) unlock_native_capacity:  no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486) revalidate_disk:         no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) getgeo:                  no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) swap_slot_free_notify:   no (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489) =======================  ===================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491) swap_slot_free_notify is called with swap_lock and sometimes the page lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) held.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) file_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500) loff_t (*llseek) (struct file *, loff_t, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502) ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503) ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504) ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505) int (*iterate) (struct file *, struct dir_context *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506) int (*iterate_shared) (struct file *, struct dir_context *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507) __poll_t (*poll) (struct file *, struct poll_table_struct *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508) long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509) long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510) int (*mmap) (struct file *, struct vm_area_struct *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511) int (*open) (struct inode *, struct file *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512) int (*flush) (struct file *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513) int (*release) (struct inode *, struct file *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514) int (*fsync) (struct file *, loff_t start, loff_t end, int datasync);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515) int (*fasync) (int, struct file *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516) int (*lock) (struct file *, int, struct file_lock *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517) ssize_t (*readv) (struct file *, const struct iovec *, unsigned long,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518) loff_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519) ssize_t (*writev) (struct file *, const struct iovec *, unsigned long,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520) loff_t *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521) ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522) void __user *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523) ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524) loff_t *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525) unsigned long (*get_unmapped_area)(struct file *, unsigned long,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526) unsigned long, unsigned long, unsigned long);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) int (*check_flags)(int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) int (*flock) (struct file *, int, struct file_lock *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529) ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530) size_t, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531) ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532) size_t, unsigned int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533) int (*setlease)(struct file *, long, struct file_lock **, void **);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) long (*fallocate)(struct file *, int, loff_t, loff_t);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537) All may block.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539) ->llseek() locking has moved from llseek to the individual llseek
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) implementations. If your fs is not using generic_file_llseek, you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541) need to acquire and release the appropriate locks in your ->llseek().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) For many filesystems, it is probably safe to acquire the inode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543) mutex or just to use i_size_read() instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544) Note: this does not protect file->f_pos against concurrent modifications,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545) since that is something userspace has to take care of.
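
The ->llseek() rule above can be pictured with a small userspace model. This is
only a sketch, not kernel code: struct toy_inode, struct toy_file and
toy_llseek() are hypothetical names, and a pthread mutex stands in for the
inode lock. The point it shows is that the method itself takes the lock it
needs to get a stable file size, because the caller holds nothing.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-ins for struct inode and struct file. */
struct toy_inode {
	pthread_mutex_t lock;	/* plays the role of the inode lock */
	long long size;		/* plays the role of i_size */
};

struct toy_file {
	struct toy_inode *inode;
	long long pos;		/* plays the role of f_pos */
};

/*
 * Toy ->llseek(): the caller holds no locks, so the method takes the
 * inode lock itself while it looks at the size. Returns the new
 * position, or -EINVAL for a bad whence or a negative result.
 */
static long long toy_llseek(struct toy_file *file, long long offset, int whence)
{
	struct toy_inode *inode;
	long long base, ret;

	assert(file && file->inode);
	inode = file->inode;

	pthread_mutex_lock(&inode->lock);
	switch (whence) {
	case SEEK_SET:
		base = 0;
		break;
	case SEEK_CUR:
		base = file->pos;
		break;
	case SEEK_END:
		base = inode->size;	/* stable only while the lock is held */
		break;
	default:
		ret = -EINVAL;
		goto out;
	}

	ret = base + offset;	/* overflow checks omitted in this sketch */
	if (ret < 0)
		ret = -EINVAL;
	else
		file->pos = ret;
out:
	pthread_mutex_unlock(&inode->lock);
	return ret;
}
```

As in the note above, the lock in this model is there for the size; it is not
meant as a guarantee about the position when several threads seek on the same
open file.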
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 546)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 547) ->iterate() is called with i_rwsem exclusive.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 548)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 549) ->iterate_shared() is called with i_rwsem at least shared.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 550)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 551) ->fasync() is responsible for maintaining the FASYNC bit in filp->f_flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 552) Most instances call fasync_helper(), which does that maintenance, so it's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 553) not normally something one needs to worry about. Return values > 0 will be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 554) mapped to zero in the VFS layer.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 555)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 556) ->readdir() and ->ioctl() on directories must be changed. Ideally we would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 557) move ->readdir() to inode_operations and use a separate method for directory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 558) ->ioctl() or kill the latter completely. One of the problems is that for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 559) anything that resembles union-mount we won't have a struct file for all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 560) components. And there are other reasons why the current interface is a mess...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 561)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 562) ->read on directories probably must go away - we should just enforce -EISDIR
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 563) in sys_read() and friends.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 564)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 565) ->setlease operations should call generic_setlease() before or after setting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 566) the lease within the individual filesystem to record the result of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 567) operation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 568)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 569) dquot_operations
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 570) ================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 571)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 572) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 573)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 574) int (*write_dquot) (struct dquot *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 575) int (*acquire_dquot) (struct dquot *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 576) int (*release_dquot) (struct dquot *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 577) int (*mark_dirty) (struct dquot *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 578) int (*write_info) (struct super_block *, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 579)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 580) These operations are intended to be more or less wrapping functions that ensure
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 581) proper locking with respect to the filesystem and call the generic quota operations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 582)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 583) What a filesystem should expect from the generic quota functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 584)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 585) ==============  ============  =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 586) ops             FS recursion  Held locks when called
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 587) ==============  ============  =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 588) write_dquot:    yes           dqonoff_sem or dqptr_sem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 589) acquire_dquot:  yes           dqonoff_sem or dqptr_sem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 590) release_dquot:  yes           dqonoff_sem or dqptr_sem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 591) mark_dirty:     no            -
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 592) write_info:     yes           dqonoff_sem
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 593) ==============  ============  =========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 594)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 595) FS recursion means calling ->quota_read() and ->quota_write() from superblock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 596) operations.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 597)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 598) More details about quota locking can be found in fs/quota/dquot.c.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 599)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 600) vm_operations_struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 601) ====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 602)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 603) prototypes::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 604)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 605) void (*open)(struct vm_area_struct*);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 606) void (*close)(struct vm_area_struct*);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 607) vm_fault_t (*fault)(struct vm_area_struct*, struct vm_fault *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 608) vm_fault_t (*page_mkwrite)(struct vm_area_struct *, struct vm_fault *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 609) vm_fault_t (*pfn_mkwrite)(struct vm_area_struct *, struct vm_fault *);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 610) int (*access)(struct vm_area_struct *, unsigned long, void*, int, int);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 611)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 612) locking rules:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 613)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 614) =============  =========  ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 615) ops            mmap_lock  PageLocked(page)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 616) =============  =========  ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 617) open:          yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 618) close:         yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 619) fault:         yes        can return with page locked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 620) map_pages:     yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 621) page_mkwrite:  yes        can return with page locked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 622) pfn_mkwrite:   yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 623) access:        yes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 624) =============  =========  ===========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 625)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 626) ->fault() is called when a previously not present pte is about
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 627) to be faulted in. The filesystem must find and return the page associated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 628) with the passed in "pgoff" in the vm_fault structure. If it is possible that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 629) the page may be truncated and/or invalidated, then the filesystem must lock
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 630) the page, then ensure it is not already truncated (the page lock will block
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 631) subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 632) locked. The VM will unlock the page.
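
The lock-then-recheck order described for ->fault() can be modelled in
userspace roughly as follows. Everything here is a hypothetical stand-in
(struct toy_page, struct toy_mapping, toy_fault() and the TOY_FAULT_* values
are not kernel definitions): a page counts as "truncated" when it no longer
points back at the mapping it was looked up in, and that test is only
meaningful once the page lock is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy return codes; the real VM_FAULT_* values live in kernel headers. */
enum { TOY_FAULT_LOCKED = 1, TOY_FAULT_AGAIN = 2, TOY_FAULT_SIGBUS = 3 };

#define TOY_NR_SLOTS 4

struct toy_mapping;

struct toy_page {
	struct toy_mapping *mapping;	/* cleared when the page is truncated */
	bool locked;			/* toy page lock */
};

struct toy_mapping {
	struct toy_page *slots[TOY_NR_SLOTS];	/* page cache indexed by pgoff */
};

struct toy_vm_fault {
	unsigned long pgoff;	/* in: offset being faulted */
	struct toy_page *page;	/* out: page handed back, still locked */
};

static void toy_lock_page(struct toy_page *page)
{
	assert(!page->locked);	/* single-threaded model: never contended */
	page->locked = true;
}

static void toy_unlock_page(struct toy_page *page)
{
	assert(page->locked);
	page->locked = false;
}

static int toy_fault(struct toy_mapping *mapping, struct toy_vm_fault *vmf)
{
	struct toy_page *page;

	if (vmf->pgoff >= TOY_NR_SLOTS || !mapping->slots[vmf->pgoff])
		return TOY_FAULT_SIGBUS;
	page = mapping->slots[vmf->pgoff];

	/* Lock first: while we hold the lock, truncate cannot take the page. */
	toy_lock_page(page);

	/* Only now is the "was it already truncated?" check stable. */
	if (page->mapping != mapping) {
		toy_unlock_page(page);
		/* A real handler would look the page up again here. */
		return TOY_FAULT_AGAIN;
	}

	/* Hand the page back locked; the caller (the VM) unlocks it. */
	vmf->page = page;
	return TOY_FAULT_LOCKED;
}
```

In this model the caller is responsible for calling toy_unlock_page() on
vmf->page after a TOY_FAULT_LOCKED return, mirroring "The VM will unlock the
page".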
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 633)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 634) ->map_pages() is called when the VM asks to map easily accessible pages.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 635) The filesystem should find and map pages associated with offsets from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 636) "start_pgoff" till "end_pgoff". ->map_pages() is called with the page table
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 637) locked and must not block. If it's not possible to reach a page without
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 638) blocking, the filesystem should skip it. The filesystem should use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 639) do_set_pte() to set up the page table entry. A pointer to the entry associated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 640) with the page is passed in the "pte" field in the vm_fault structure. Pointers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 641) to entries for other offsets should be calculated relative to "pte".
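
A userspace sketch of the ->map_pages() rules, under stated assumptions: the
types and helpers (toy_pte_t, struct toy_page, toy_set_pte(), toy_map_pages())
are hypothetical, toy_set_pte() merely stands in for do_set_pte(), the range
is treated as inclusive of "end_pgoff", and "would block" is modelled as a
page that is not up to date or is locked by someone else. It illustrates the
two points from the text: entries for other offsets are found relative to the
passed "pte", and pages that cannot be reached without blocking are skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long toy_pte_t;	/* 0 means "not present" */

struct toy_page {
	bool uptodate;		/* false: reading it in would block */
	bool locked;		/* true: waiting for the lock would block */
	unsigned long pfn;
};

/* Hypothetical stand-in for the helper that installs a page table entry. */
static void toy_set_pte(toy_pte_t *pte, const struct toy_page *page)
{
	*pte = (page->pfn << 1) | 1;	/* low bit marks the entry present */
}

/*
 * "pte" points at the entry for "pgoff". cache[] is indexed by page
 * offset and must cover start_pgoff..end_pgoff. Returns how many
 * entries were installed. Never waits for anything.
 */
static unsigned int toy_map_pages(toy_pte_t *pte, unsigned long pgoff,
				  unsigned long start_pgoff,
				  unsigned long end_pgoff,
				  struct toy_page *const *cache)
{
	unsigned int mapped = 0;
	unsigned long off;

	assert(pte && cache);

	for (off = start_pgoff; off <= end_pgoff; off++) {
		const struct toy_page *page = cache[off];
		toy_pte_t *slot;

		/* Entry for "off", computed relative to the passed pte. */
		if (off >= pgoff)
			slot = pte + (off - pgoff);
		else
			slot = pte - (pgoff - off);

		if (!page || !page->uptodate || page->locked)
			continue;	/* would have to block: skip it */
		if (*slot)
			continue;	/* something is already mapped here */

		toy_set_pte(slot, page);
		mapped++;
	}
	return mapped;
}
```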
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 642)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 643) ->page_mkwrite() is called when a previously read-only pte is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 644) about to become writeable. The filesystem again must ensure that there are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 645) no truncate/invalidate races, and then return with the page locked. If
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 646) the page has been truncated, the filesystem should not look up a new page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 647) like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 648) will cause the VM to retry the fault.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 649)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 650) ->pfn_mkwrite() is the same as page_mkwrite but when the pte is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 651) VM_PFNMAP or VM_MIXEDMAP with a page-less entry. The expected return value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 652) is VM_FAULT_NOPAGE or one of the VM_FAULT_ERROR types. The default behavior
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 653) after this call is to make the pte read-write, unless pfn_mkwrite returns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 654) an error.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 655)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 656) ->access() is called when get_user_pages() fails in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 657) access_process_vm(), typically used to debug a process through
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 658) /proc/pid/mem or ptrace. This function is needed only for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 659) VM_IO | VM_PFNMAP VMAs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 660)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 661) --------------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 662)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 663) Dubious stuff
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 664)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 665) (if you break something or notice that it is broken and do not fix it yourself
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 666) - at least put it here)