Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. SPDX-License-Identifier: GPL-2.0

===============================================================
Inotify - A Powerful yet Simple File Change Notification System
===============================================================



Document started 15 Mar 2005 by Robert Love <rml@novell.com>

Document updated 4 Jan 2015 by Zhang Zhen <zhenzhang.zhang@huawei.com>

	- Deleted the obsolete interface; refer to the man pages for the user
	  interface.

(i) Rationale

Q:
   What is the design decision behind not tying the watch to the open fd of
   the watched object?

A:
   Watches are associated with an open inotify instance, not an open file.
   This solves the primary problem with dnotify: keeping the file open pins
   the file and thus, worse, pins the mount.  Dnotify is therefore infeasible
   for use on a desktop system with removable media as the media cannot be
   unmounted.  Watching a file should not require that it be open.
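
   As a minimal sketch of that point, the helper below starts watching a
   path without ever calling open(2) on it.  The function name
   watch_without_open() and the chosen event mask are illustrative
   assumptions, not part of the kernel documentation; only inotify_init1(2)
   and inotify_add_watch(2) are real interfaces.

```c
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: begin watching `path` for modification and
 * self-deletion.  The watched object is named by path only; the sole
 * descriptor this code holds is the inotify fd it returns (-1 on error).
 */
int watch_without_open(const char *path)
{
	int fd = inotify_init1(IN_CLOEXEC);

	if (fd < 0)
		return -1;

	/* The watch belongs to the inotify instance, not to an open file. */
	if (inotify_add_watch(fd, path, IN_MODIFY | IN_DELETE_SELF) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

   Because no descriptor for the watched object is held, the filesystem
   can still be unmounted; a watcher is then told so with an IN_UNMOUNT
   event.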

Q:
   What is the design decision behind using an fd-per-instance as opposed to
   an fd-per-watch?

A:
   An fd-per-watch quickly consumes more file descriptors than are allowed,
   more fd's than are feasible to manage, and more fd's than are optimally
   select()-able.  Yes, root can bump the per-process fd limit and yes, users
   can use epoll, but requiring both is a silly and extraneous requirement.
   A watch consumes less memory than an open file; separating the number
   spaces is thus sensible.  The current design is what user-space developers
   want: Users initialize inotify, once, and add n watches, requiring but one
   fd and no twiddling with fd limits.  Initializing an inotify instance two
   thousand times is silly.  If we can implement user-space's preferences
   cleanly--and we can, the idr layer makes stuff like this trivial--then we
   should.
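
   The "initialize once, add n watches" pattern could look roughly like the
   following.  The helper name add_watches() and its event mask are
   assumptions made for illustration; the point is only that every watch
   lands on the same fd and is identified by a watch descriptor, which is a
   plain integer from a separate number space rather than a file descriptor.

```c
#include <stddef.h>
#include <sys/inotify.h>

/*
 * Hypothetical helper: add one watch per path to an existing inotify
 * instance `fd`.  wds[i] receives the watch descriptor for paths[i], or
 * -1 if that watch could not be added.  Returns the number of watches
 * added.  No additional file descriptors are consumed.
 */
size_t add_watches(int fd, const char *const *paths, size_t n, int *wds)
{
	size_t added = 0;

	for (size_t i = 0; i < n; i++) {
		wds[i] = inotify_add_watch(fd, paths[i],
					   IN_CREATE | IN_DELETE | IN_MOVE);
		if (wds[i] >= 0)
			added++;
	}
	return added;
}
```

   A caller would obtain fd once from inotify_init1(2) and could then pass
   thousands of paths without touching the per-process fd limit.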

   There are other good arguments.  With a single fd, there is a single
   item to block on, which is mapped to a single queue of events.  The single
   fd returns all watch events and also any potential out-of-band data.  If
   every fd were a separate watch:

   - There would be no way to get event ordering.  Events on file foo and
     file bar would pop poll() on both fd's, but there would be no way to tell
     which happened first.  A single queue trivially gives you ordering.  Such
     ordering is crucial to existing applications such as Beagle.  Imagine
     "mv a b ; mv b a" events without ordering.

   - We'd have to maintain n fd's and n internal queues with state,
     versus just one.  It is a lot messier in the kernel.  A single, linear
     queue is the data structure that makes sense.

   - User-space developers prefer the current API.  The Beagle guys, for
     example, love it.  Trust me, I asked.  It is not a surprise: Who'd want
     to manage and block on 1000 fd's via select?

   - There would be no way to get out-of-band data.

   - 1024 is still too low.  ;-)

   When you talk about designing a file change notification system that
   scales to 1000s of directories, juggling 1000s of fd's just does not seem
   the right interface.  It is too heavy.
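
   To make the ordering argument concrete, here is a small sketch that
   performs "mv a b ; mv b a" inside a scratch directory and reads the
   resulting events back from the single queue.  The helper names
   read_masks_in_order() and rename_round_trip(), the buffer size, and the
   scratch directory are assumptions for illustration only.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/inotify.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: read whatever is queued on `fd` and copy the event
 * masks, in queue order, into `masks`.  Returns the count stored, or -1.
 */
int read_masks_in_order(int fd, uint32_t *masks, int max)
{
	char buf[4096]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	ssize_t len = read(fd, buf, sizeof(buf));
	int n = 0;

	if (len < 0)
		return -1;
	for (char *p = buf; p < buf + len && n < max; ) {
		const struct inotify_event *ev = (const struct inotify_event *)p;

		masks[n++] = ev->mask;
		p += sizeof(*ev) + ev->len;	/* records are variable length */
	}
	return n;
}

/*
 * Hypothetical demo: create `dir`, watch it, then create "a" and run
 * "mv a b ; mv b a".  Returns the number of event masks stored, or -1.
 */
int rename_round_trip(const char *dir, uint32_t *masks, int max)
{
	char a[256], b[256];
	int fd, file, n;

	snprintf(a, sizeof(a), "%s/a", dir);
	snprintf(b, sizeof(b), "%s/b", dir);
	if (mkdir(dir, 0700) < 0)
		return -1;
	fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC);
	if (fd < 0 || inotify_add_watch(fd, dir, IN_CREATE | IN_MOVE) < 0) {
		if (fd >= 0)
			close(fd);
		rmdir(dir);
		return -1;
	}
	file = open(a, O_CREAT | O_WRONLY, 0600);	/* IN_CREATE "a" */
	if (file >= 0)
		close(file);
	rename(a, b);	/* IN_MOVED_FROM "a", then IN_MOVED_TO "b" */
	rename(b, a);	/* IN_MOVED_FROM "b", then IN_MOVED_TO "a" */
	n = read_masks_in_order(fd, masks, max);
	close(fd);
	unlink(a);
	rmdir(dir);
	return n;
}
```

   With one queue the two renames come back in the order they happened,
   so the reader can tell that the file ended up as "a" again.  The same
   queue also carries out-of-band notices such as IN_Q_OVERFLOW, which is
   delivered with wd set to -1.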

   Additionally, it _is_ possible to initialize more than one instance and
   juggle more than one queue and thus more than one associated fd.  There
   need not be a one-fd-per-process mapping; it is one-fd-per-queue and a
   process can easily want more than one queue.

Q:
   Why the system call approach?

A:
   The poor user-space interface is the second biggest problem with dnotify.
   Signals are a terrible, terrible interface for file notification.  Or for
   anything, for that matter.  The ideal solution, from all perspectives, is a
   file descriptor-based one that allows basic file I/O and poll/select.
   Obtaining the fd and managing the watches could have been done either via a
   device file or a family of new system calls.  We decided to implement a
   family of system calls because that is the preferred approach for new kernel
   interfaces.  The only real difference was whether we wanted to use open(2)
   and ioctl(2) or a couple of new system calls.  System calls beat ioctls.
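
   A rough sketch of how the system call family combines with ordinary fd
   readiness is shown below.  The helper wait_for_change() and its return
   convention are illustrative assumptions; inotify_init1(2),
   inotify_add_watch(2), inotify_rm_watch(2) and poll(2) are the real calls.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Hypothetical helper: watch `path` for the events in `mask` and wait up
 * to timeout_ms for one to arrive.  Returns 1 if an event is readable,
 * 0 on timeout, -1 on error.
 */
int wait_for_change(const char *path, uint32_t mask, int timeout_ms)
{
	struct pollfd pfd = { .events = POLLIN };
	int wd, rc;

	pfd.fd = inotify_init1(IN_NONBLOCK | IN_CLOEXEC); /* obtain the fd */
	if (pfd.fd < 0)
		return -1;

	wd = inotify_add_watch(pfd.fd, path, mask);	  /* add a watch */
	if (wd < 0) {
		close(pfd.fd);
		return -1;
	}

	rc = poll(&pfd, 1, timeout_ms);	/* the fd works with poll/select */

	inotify_rm_watch(pfd.fd, wd);			  /* drop the watch */
	close(pfd.fd);

	if (rc < 0)
		return -1;
	return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

   No signal handler and no ioctl(2) request codes are involved; the
   descriptor could equally be handed to select(2) or epoll.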