Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. _ksm:

=======================
Kernel Samepage Merging
=======================

KSM is a memory-saving de-duplication feature, enabled by CONFIG_KSM=y,
added to the Linux kernel in 2.6.32.  See ``mm/ksm.c`` for its implementation,
and see also http://lwn.net/Articles/306704/ and https://lwn.net/Articles/330589/.

The userspace interface of KSM is described in :ref:`Documentation/admin-guide/mm/ksm.rst <admin_guide_ksm>`.

Design
======

Overview
--------

.. kernel-doc:: mm/ksm.c
   :DOC: Overview

Reverse mapping
---------------
KSM maintains reverse mapping information for KSM pages in the stable
tree.

If a KSM page is shared between fewer than ``max_page_sharing`` VMAs,
the node of the stable tree that represents such a KSM page points to a
list of struct rmap_item and the ``page->mapping`` of the
KSM page points to the stable tree node.

When the sharing passes this threshold, KSM adds a second dimension to
the stable tree. The tree node becomes a "chain" that links one or
more "dups". Each "dup" keeps reverse mapping information for a KSM
page with ``page->mapping`` pointing to that "dup".

Every "chain" and all "dups" linked into a "chain" enforce the
invariant that they represent the same write-protected memory content,
even if each "dup" is pointed to by a different KSM page copy of
that content.

This way the computational complexity of the stable tree lookup is
unaffected compared to an unlimited list of reverse mappings. It is still
enforced that there cannot be KSM page content duplicates in the
stable tree itself.
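The chain and dup layout described above can be pictured with a small toy
model. This is only an illustrative sketch in Python, not the kernel code
(which lives in ``mm/ksm.c``): the names ``StableNode``, ``Dup`` and
``add_mapping`` are hypothetical, and the model assumes a new dup is started
only when every existing dup already holds ``max_page_sharing`` mappings.

```python
from dataclasses import dataclass, field


@dataclass
class Dup:
    """One physical KSM page copy plus the mappings that share it."""
    rmap_items: list = field(default_factory=list)


@dataclass
class StableNode:
    """Toy stable-tree entry for one distinct page content (hypothetical)."""
    max_page_sharing: int
    dups: list = field(default_factory=list)

    def add_mapping(self, rmap_item):
        # Reuse the first dup that still has room; otherwise start a new
        # dup, i.e. another page copy holding the same content.
        for dup in self.dups:
            if len(dup.rmap_items) < self.max_page_sharing:
                dup.rmap_items.append(rmap_item)
                return dup
        dup = Dup([rmap_item])
        self.dups.append(dup)
        return dup

    def is_chain(self):
        # A second page copy is only needed once sharing passes the cap.
        return len(self.dups) > 1


node = StableNode(max_page_sharing=4)
for vma_id in range(10):
    node.add_mapping(f"rmap_item-{vma_id}")

sizes = [len(d.rmap_items) for d in node.dups]
print(node.is_chain(), sizes)  # True [4, 4, 2]
```

However many mappings are added, the tree still holds a single entry for the
content, and no dup ever carries more than ``max_page_sharing`` mappings.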

The deduplication limit enforced by ``max_page_sharing`` is required
to keep the virtual memory rmap lists from growing too large. The rmap
walk has O(N) complexity where N is the number of rmap_items
(i.e. virtual mappings) that are sharing the page, which is in turn
capped by ``max_page_sharing``. So this effectively spreads the linear
O(N) computational complexity from rmap walk context over different
KSM pages. The ksmd walk over the stable_node "chains" is also O(N),
but N is the number of stable_node "dups", not the number of
rmap_items, so it does not have a significant impact on ksmd performance. In
practice the best stable_node "dup" candidate will be kept and found
at the head of the "dups" list.
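A rough back-of-the-envelope calculation may help make the two O(N) costs
concrete. The sketch below is an assumption-laden illustration, not a
measurement: the helper ``sharing_costs`` is hypothetical, the cap values are
arbitrary examples, and it assumes every dup is filled up to
``max_page_sharing`` (i.e. no fragmentation).

```python
def sharing_costs(total_mappings, max_page_sharing):
    """Return (number of dups, worst-case rmap walk length) for one content."""
    # Ceiling division: how many page copies are needed to hold all mappings.
    dups = (total_mappings + max_page_sharing - 1) // max_page_sharing
    # An rmap walk only visits the mappings of a single page copy.
    worst_rmap_walk = min(total_mappings, max_page_sharing)
    return dups, worst_rmap_walk


# One million identical pages, with three example caps.
for cap in (2, 256, 4096):
    dups, walk = sharing_costs(1_000_000, cap)
    print(f"max_page_sharing={cap}: dups={dups}, worst rmap walk={walk}")
```

Under these assumptions, raising the cap shortens the list of dups that ksmd
has to walk and saves more memory, while lengthening the worst-case rmap walk
for each individual KSM page.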

High values of ``max_page_sharing`` result in faster memory merging
(because there will be fewer stable_node dups queued into the
stable_node chain->hlist to check for pruning) and a higher
deduplication factor, at the expense of a slower worst case for rmap
walks for any KSM page, which can happen during swapping, compaction,
NUMA balancing and page migration.

The ``stable_node_dups/stable_node_chains`` ratio is also affected by the
``max_page_sharing`` tunable, and a high ratio may indicate fragmentation
in the stable_node dups. This could be solved by introducing
defragmentation algorithms in ksmd which would refile rmap_items from
one stable_node dup to another stable_node dup, in order to free up
stable_node "dups" with few rmap_items in them, but that may increase
the ksmd CPU usage and possibly slow down the read-only computations on
the KSM pages of the applications.
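The ratio mentioned above can be computed from the KSM sysfs counters. The
following is a minimal sketch, assuming the counters are exposed as
``stable_node_dups`` and ``stable_node_chains`` under ``/sys/kernel/mm/ksm``
as described in the admin guide; the helper names ``read_counter`` and
``dups_per_chain`` are hypothetical, and what counts as a "high" ratio is
left to the reader.

```python
from pathlib import Path

# Assumed location of the KSM sysfs directory.
KSM_SYSFS = Path("/sys/kernel/mm/ksm")


def read_counter(name, base=KSM_SYSFS):
    """Read one integer counter file from the KSM sysfs directory."""
    return int((Path(base) / name).read_text().strip())


def dups_per_chain(base=KSM_SYSFS):
    """Return stable_node_dups / stable_node_chains, or None without chains."""
    dups = read_counter("stable_node_dups", base)
    chains = read_counter("stable_node_chains", base)
    if chains == 0:
        return None
    return dups / chains


if __name__ == "__main__":
    try:
        print("dups per chain:", dups_per_chain())
    except (OSError, ValueError) as exc:
        # KSM may be disabled or the counters may be missing on this kernel.
        print("could not read KSM counters:", exc)
```

Passing a different ``base`` directory makes it possible to try the helpers
against saved copies of the counter files instead of a live system.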

The whole list of stable_node "dups" linked in the stable_node
"chains" is scanned periodically in order to prune stale stable_nodes.
The frequency of such scans is defined by the
``stable_node_chains_prune_millisecs`` sysfs tunable.

Reference
---------
.. kernel-doc:: mm/ksm.c
   :functions: mm_slot ksm_scan stable_node rmap_item

--
Izik Eidus,
Hugh Dickins, 17 Nov 2009