Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

======================================
Immutable biovecs and biovec iterators
======================================

Kent Overstreet <kmo@daterainc.com>

As of 3.13, biovecs should never be modified after a bio has been submitted.
Instead, we have a new struct bvec_iter which represents a range of a biovec -
the iterator will be modified as the bio is completed, not the biovec.

More specifically, old code that needed to partially complete a bio would
update bi_sector and bi_size, and advance bi_idx to the next biovec. If it
ended up partway through a biovec, it would increment bv_offset and decrement
bv_len by the number of bytes completed in that biovec.

In the new scheme of things, everything that must be mutated in order to
partially complete a bio is segregated into struct bvec_iter: bi_sector,
bi_size and bi_idx have been moved there; and instead of modifying bv_offset
and bv_len, struct bvec_iter has bi_bvec_done, which represents the number of
bytes completed in the current bvec.
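
The split between an immutable array and a mutable iterator can be sketched
in plain userspace C. This is only a simplified model, not the kernel code:
the names model_bvec, model_iter and model_iter_advance() are hypothetical,
bv_buf stands in for bv_page, and bi_sector is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct bio_vec. */
struct model_bvec {
	const char *bv_buf;	/* assumption: a buffer replaces bv_page */
	size_t bv_len;
	size_t bv_offset;
};

/* Holds the fields the text says moved into struct bvec_iter. */
struct model_iter {
	size_t bi_size;		/* bytes remaining in the range */
	size_t bi_idx;		/* index of the current bvec */
	size_t bi_bvec_done;	/* bytes completed in the current bvec */
};

/*
 * Complete 'bytes' bytes. Only the iterator is written to; the array is
 * const, so bv_offset and bv_len are never touched.
 */
static void model_iter_advance(const struct model_bvec *bv,
			       struct model_iter *iter, size_t bytes)
{
	assert(bytes <= iter->bi_size);

	iter->bi_size -= bytes;
	bytes += iter->bi_bvec_done;

	/* Step over every bvec that is now fully completed. */
	while (bytes && bytes >= bv[iter->bi_idx].bv_len) {
		bytes -= bv[iter->bi_idx].bv_len;
		iter->bi_idx++;
	}
	iter->bi_bvec_done = bytes;
}
```

Completing 120 bytes of a two-entry array with lengths 100 and 50 leaves the
iterator at bi_idx 1 with bi_bvec_done 20 and bi_size 30, while both array
entries keep their original bv_offset and bv_len.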

There are a bunch of new helper macros for hiding the gory details - in
particular, presenting the illusion of partially completed biovecs so that
normal code doesn't have to deal with bi_bvec_done.

 * Driver code should no longer refer to biovecs directly; we now have
   bio_iovec() and bio_iter_iovec() macros that return literal struct biovecs,
   constructed from the raw biovecs but taking into account bi_bvec_done and
   bi_size.

   bio_for_each_segment() has been updated to take a bvec_iter argument
   instead of an integer (that corresponded to bi_idx); for a lot of code the
   conversion just required changing the types of the arguments to
   bio_for_each_segment().

 * Advancing a bvec_iter is done with bio_advance_iter(); bio_advance() is a
   wrapper around bio_advance_iter() that operates on bio->bi_iter, and also
   advances the bio integrity's iter if present.

   There is a lower level advance function - bvec_iter_advance() - which takes
   a pointer to a biovec, not a bio; this is used by the bio integrity code.
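
To make the "illusion of partially completed biovecs" concrete, here is a
userspace sketch of how a by-value segment could be built from the raw array
plus an iterator, and how a segment loop could walk it. It is written in the
spirit of bio_iter_iovec() and bio_for_each_segment() but is not their
implementation; model_bvec, model_iter, model_iter_bvec() and
model_count_segments() are hypothetical names, and the sketch assumes no
zero-length entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct bio_vec and struct bvec_iter. */
struct model_bvec {
	const char *bv_buf;	/* assumption: a buffer replaces bv_page */
	size_t bv_len;
	size_t bv_offset;
};

struct model_iter {
	size_t bi_size;
	size_t bi_idx;
	size_t bi_bvec_done;
};

/*
 * Return the current segment by value: the raw entry shifted by
 * bi_bvec_done and clamped to bi_size. The raw array is not modified.
 */
static struct model_bvec model_iter_bvec(const struct model_bvec *bv,
					 struct model_iter iter)
{
	const struct model_bvec *cur = &bv[iter.bi_idx];
	size_t left = cur->bv_len - iter.bi_bvec_done;
	struct model_bvec out = {
		.bv_buf = cur->bv_buf,
		.bv_len = iter.bi_size < left ? iter.bi_size : left,
		.bv_offset = cur->bv_offset + iter.bi_bvec_done,
	};

	return out;
}

/* Walk the remaining range on a private copy of the iterator. */
static size_t model_count_segments(const struct model_bvec *bv,
				   struct model_iter iter)
{
	size_t nr = 0;

	while (iter.bi_size) {
		struct model_bvec seg = model_iter_bvec(bv, iter);

		assert(seg.bv_len > 0);	/* assumption: no empty entries */
		iter.bi_size -= seg.bv_len;
		iter.bi_bvec_done += seg.bv_len;
		if (iter.bi_bvec_done == bv[iter.bi_idx].bv_len) {
			iter.bi_idx++;
			iter.bi_bvec_done = 0;
		}
		nr++;
	}
	return nr;
}
```

Because the iterator is passed by value, the caller's iterator is unchanged
after the walk, which matches the idea that a loop variable of type
bvec_iter replaces the old integer index.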

What's all this get us?
=======================

Having a real iterator, and making biovecs immutable, has a number of
advantages:

 * Before, iterating over bios was very awkward when you weren't processing
   exactly one bvec at a time - for example, bio_copy_data() in block/bio.c,
   which copies the contents of one bio into another. Because the biovecs
   wouldn't necessarily be the same size, the old code was tricky and
   convoluted - it had to walk two different bios at the same time, keeping
   both bi_idx and an offset into the current biovec for each.

   The new code is much more straightforward - have a look. This sort of
   pattern comes up in a lot of places; a lot of drivers were essentially open
   coding bvec iterators before, and having a common implementation
   considerably simplifies a lot of code.

 * Before, any code that might need to use the biovec after the bio had been
   completed (perhaps to copy the data somewhere else, or perhaps to resubmit
   it somewhere else if there was an error) had to save the entire bvec array
   - again, this was being done in a fair number of places.

 * Biovecs can be shared between multiple bios - a bvec iter can represent an
   arbitrary range of an existing biovec, both starting and ending midway
   through biovecs. This is what enables efficient splitting of arbitrary
   bios. Note that this means we _only_ use bi_size to determine when we've
   reached the end of a bio, not bi_vcnt - and the bio_iovec() macro takes
   bi_size into account when constructing biovecs.

 * Splitting bios is now much simpler. The old bio_split() didn't even work on
   bios with more than a single bvec! Now, we can efficiently split arbitrary
   size bios - because the new bio can share the old bio's biovec.

   Care must be taken, though, to ensure the biovec isn't freed while the
   split bio is still using it, in case the original bio completes first.
   Using bio_chain() when splitting bios helps with this.

 * Submitting partially completed bios is now perfectly fine - this comes up
   occasionally in stacking block drivers, and various code (e.g. md and
   bcache) had some ugly workarounds for this.

   It used to be the case that submitting a partially completed bio would work
   fine to _most_ devices, but since accessing the raw bvec array was the
   norm, not all drivers would respect bi_idx and those would break. Now,
   since all drivers _must_ go through the bvec iterator - and have been
   audited to make sure they are - submitting partially completed bios is
   perfectly fine.
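
The copy case from the first point can be sketched in userspace C. This is a
simplified model of walking two differently segmented vectors with two
iterators, not the code from block/bio.c; model_bvec, model_iter,
model_iter_advance() and model_copy() are hypothetical names, buffers stand
in for pages, and the sketch assumes no zero-length entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins for struct bio_vec and struct bvec_iter. */
struct model_bvec {
	char *bv_buf;		/* assumption: a buffer replaces bv_page */
	size_t bv_len;
	size_t bv_offset;
};

struct model_iter {
	size_t bi_size;
	size_t bi_idx;
	size_t bi_bvec_done;
};

static void model_iter_advance(const struct model_bvec *bv,
			       struct model_iter *iter, size_t bytes)
{
	assert(bytes <= iter->bi_size);
	iter->bi_size -= bytes;
	bytes += iter->bi_bvec_done;
	while (bytes && bytes >= bv[iter->bi_idx].bv_len) {
		bytes -= bv[iter->bi_idx].bv_len;
		iter->bi_idx++;
	}
	iter->bi_bvec_done = bytes;
}

/* Copy until either range is exhausted; each side keeps its own iterator. */
static void model_copy(const struct model_bvec *dst, struct model_iter dst_iter,
		       const struct model_bvec *src, struct model_iter src_iter)
{
	while (src_iter.bi_size && dst_iter.bi_size) {
		const struct model_bvec *s = &src[src_iter.bi_idx];
		const struct model_bvec *d = &dst[dst_iter.bi_idx];
		size_t bytes = s->bv_len - src_iter.bi_bvec_done;
		size_t d_left = d->bv_len - dst_iter.bi_bvec_done;

		/* Largest chunk that fits in both current segments. */
		if (bytes > d_left)
			bytes = d_left;
		if (bytes > src_iter.bi_size)
			bytes = src_iter.bi_size;
		if (bytes > dst_iter.bi_size)
			bytes = dst_iter.bi_size;
		assert(bytes > 0);	/* assumption: no empty entries */

		memcpy(d->bv_buf + d->bv_offset + dst_iter.bi_bvec_done,
		       s->bv_buf + s->bv_offset + src_iter.bi_bvec_done,
		       bytes);
		model_iter_advance(src, &src_iter, bytes);
		model_iter_advance(dst, &dst_iter, bytes);
	}
}
```

The loop body never needs to know how the two vectors are segmented; all the
bookkeeping that the old code open coded lives in the two iterators.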

Other implications:
===================

 * Almost all usage of bi_idx is now incorrect and has been removed; instead,
   where previously you would have used bi_idx you'd now use a bvec_iter,
   probably passing it to one of the helper macros.

   I.e. instead of using bio_iovec_idx() (or bio->bi_io_vec[bio->bi_idx]), you
   now use bio_iter_iovec(), which takes a bvec_iter and returns a
   literal struct bio_vec - constructed on the fly from the raw biovec but
   taking into account bi_bvec_done (and bi_size).

 * bi_vcnt can't be trusted or relied upon by driver code - i.e. anything that
   doesn't actually own the bio. The reason is twofold: firstly, it's not
   actually needed for iterating over the bio anymore - we only use bi_size.
   Secondly, when cloning a bio and reusing (a portion of) the original bio's
   biovec, in order to calculate bi_vcnt for the new bio we'd have to iterate
   over all the biovecs in the new bio - which is silly as it's not needed.

   So, don't use bi_vcnt anymore.

 * The current interface allows the block layer to split bios as needed, so we
   could eliminate a lot of complexity particularly in stacked drivers. Code
   that creates bios can then create whatever size bios are convenient, and
   more importantly stacked drivers don't have to deal with both their own bio
   size limitations and the limitations of the underlying devices. Thus
   there's no need to define ->merge_bvec_fn() callbacks for individual block
   drivers.
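
Why splitting is cheap, and why the end of a range is decided by bi_size
alone, can be illustrated with a small userspace model. It is a sketch under
stated assumptions rather than the kernel's bio_split(): model_bvec,
model_iter, model_iter_advance(), model_split() and model_cur_len() are
hypothetical names, and nothing here handles reference counting or chaining.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct bio_vec and struct bvec_iter. */
struct model_bvec {
	const char *bv_buf;	/* assumption: a buffer replaces bv_page */
	size_t bv_len;
	size_t bv_offset;
};

struct model_iter {
	size_t bi_size;
	size_t bi_idx;
	size_t bi_bvec_done;
};

static void model_iter_advance(const struct model_bvec *bv,
			       struct model_iter *iter, size_t bytes)
{
	assert(bytes <= iter->bi_size);
	iter->bi_size -= bytes;
	bytes += iter->bi_bvec_done;
	while (bytes && bytes >= bv[iter->bi_idx].bv_len) {
		bytes -= bv[iter->bi_idx].bv_len;
		iter->bi_idx++;
	}
	iter->bi_bvec_done = bytes;
}

/*
 * Split off the first 'bytes' bytes. The returned iterator covers the
 * front part and *iter is advanced past it. Both keep pointing into the
 * same, unmodified array; no entry count is recomputed for either half.
 */
static struct model_iter model_split(const struct model_bvec *bv,
				     struct model_iter *iter, size_t bytes)
{
	struct model_iter front = *iter;

	assert(bytes > 0 && bytes < iter->bi_size);
	front.bi_size = bytes;
	model_iter_advance(bv, iter, bytes);
	return front;
}

/* Length of the current segment, clamped by bi_size rather than a count. */
static size_t model_cur_len(const struct model_bvec *bv,
			    const struct model_iter *iter)
{
	size_t left = bv[iter->bi_idx].bv_len - iter->bi_bvec_done;

	return iter->bi_size < left ? iter->bi_size : left;
}
```

Splitting a 200-byte range over two 100-byte entries at byte 150 gives a
front half that ends midway through the second entry and a back half that
starts there, which is exactly the situation in which a per-bio entry count
would be misleading.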

Usage of helpers:
=================

* The following helpers, whose names have the suffix `_all`, can only be used
  on non-BIO_CLONED bios. They are usually used by filesystem code. Drivers
  shouldn't use them because the bio may have been split before it reached the
  driver.

::

	bio_for_each_segment_all()
	bio_for_each_bvec_all()
	bio_first_bvec_all()
	bio_first_page_all()
	bio_last_bvec_all()

* The following helpers iterate over single-page segments. The passed 'struct
  bio_vec' will contain a single-page IO vector during the iteration::

	bio_for_each_segment()
	bio_for_each_segment_all()

* The following helpers iterate over multi-page bvecs. The passed 'struct
  bio_vec' will contain a multi-page IO vector during the iteration::

	bio_for_each_bvec()
	bio_for_each_bvec_all()
	rq_for_each_bvec()
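
The difference between the single-page and multi-page views can be
illustrated by counting how many pieces each view would produce for the same
data. The following is a userspace sketch only: MODEL_PAGE_SIZE, model_bvec
and model_count_single_page() are hypothetical, a 4096-byte page is assumed,
and bv_offset is taken relative to the first page of each entry.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for a (possibly multi-page) struct bio_vec. */
struct model_bvec {
	size_t bv_offset;	/* offset from the start of the first page */
	size_t bv_len;		/* may cover several contiguous pages */
};

/*
 * Count the single-page pieces that 'nr' multi-page entries break into.
 * A multi-page walk over the same array would simply visit 'nr' entries.
 */
static size_t model_count_single_page(const struct model_bvec *bv, size_t nr)
{
	size_t count = 0;

	for (size_t i = 0; i < nr; i++) {
		size_t off = bv[i].bv_offset;
		size_t left = bv[i].bv_len;

		while (left) {
			/* Bytes left before the next page boundary. */
			size_t in_page = MODEL_PAGE_SIZE - off % MODEL_PAGE_SIZE;
			size_t step = left < in_page ? left : in_page;

			off += step;
			left -= step;
			count++;
		}
	}
	return count;
}
```

For two entries, one of 8192 bytes starting 512 bytes into a page and one
covering exactly one page, the multi-page view has two items while the
single-page view has four.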