Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

.. _zswap:

=====
zswap
=====

Overview
========

Zswap is a lightweight compressed cache for swap pages. It takes pages that are
in the process of being swapped out and attempts to compress them into a
dynamically allocated RAM-based memory pool.  Zswap trades CPU cycles for
potentially reduced swap I/O.  This trade-off can also result in a
significant performance improvement if reads from the compressed cache are
faster than reads from a swap device.

.. note::
   Zswap is a new feature as of v3.11 and interacts heavily with memory
   reclaim.  This interaction has not been fully explored on the large set of
   potential configurations and workloads that exist.  For this reason, zswap
   is a work in progress and should be considered experimental.

Some potential benefits:

* Desktop/laptop users with limited RAM capacities can mitigate the
  performance impact of swapping.
* Overcommitted guests that share a common I/O resource can
  dramatically reduce their swap I/O pressure, avoiding heavy-handed I/O
  throttling by the hypervisor. This allows more work to get done with less
  impact to the guest workload and guests sharing the I/O subsystem.
* Users with SSDs as swap devices can extend the life of the device by
  drastically reducing life-shortening writes.

Zswap evicts pages from compressed cache on an LRU basis to the backing swap
device when the compressed pool reaches its size limit.  This requirement had
been identified in prior community discussions.

Whether zswap is enabled at boot time depends on whether
the ``CONFIG_ZSWAP_DEFAULT_ON`` Kconfig option is enabled or not.
This setting can then be overridden by providing the kernel command line
``zswap.enabled=`` option, for example ``zswap.enabled=0``.
Zswap can also be enabled and disabled at runtime using the sysfs interface.
An example command to enable zswap at runtime, assuming sysfs is mounted
at ``/sys``, is::

	echo 1 > /sys/module/zswap/parameters/enabled

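As a minimal sketch, the runtime switch can be wrapped in a small shell helper
that writes the value and then reads it back.  The function name
``zswap_set_enabled`` and its optional directory argument are illustrative
assumptions, not part of zswap; the only real interface used is the
``enabled`` parameter file shown above, and writing it requires root
privileges::

	# Hypothetical helper: set the zswap "enabled" parameter and print
	# the value read back.  $1 is 0 or 1; $2 optionally overrides the
	# parameter directory (handy for trying it out without touching /sys).
	zswap_set_enabled() {
		dir="${2:-/sys/module/zswap/parameters}"
		echo "$1" > "$dir/enabled" || return 1
		cat "$dir/enabled"
	}

On a running kernel the value read back from a boolean module parameter is
reported as ``Y`` or ``N`` rather than as the digit that was written.
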
When zswap is disabled at runtime, it stops storing pages that are
being swapped out.  However, it does *not* immediately write out or fault
back into memory the pages already stored in the compressed pool.  Those
pages remain in the compressed pool until they are either invalidated or
faulted back into memory.  To force all pages out of the compressed pool,
run swapoff on the swap device(s); this faults back into memory all
swapped-out pages, including those in the compressed pool.

Design
======

Zswap receives pages for compression through the Frontswap API and is able to
evict pages from its own compressed pool on an LRU basis and write them back to
the backing swap device in the case that the compressed pool is full.

Zswap makes use of zpool for managing the compressed memory pool.  Each
allocation in zpool is not directly accessible by address.  Rather, a handle is
returned by the allocation routine and that handle must be mapped before being
accessed.  The compressed memory pool grows on demand and shrinks as compressed
pages are freed.  The pool is not preallocated.  By default, a zpool of the
type selected by the ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created,
but this can be overridden at boot time by setting the ``zpool`` attribute,
e.g. ``zswap.zpool=zbud``. It can also be changed at runtime using the sysfs
``zpool`` attribute, e.g.::

	echo zbud > /sys/module/zswap/parameters/zpool

The zbud type zpool allocates exactly 1 page to store 2 compressed pages, which
means the compression ratio will always be 2:1 or worse (because of half-full
zbud pages).  The zsmalloc type zpool has a more complex compressed page
storage method, and it can achieve greater storage densities.  However,
zsmalloc does not implement compressed page eviction, so once zswap fills up
it cannot evict the oldest page; it can only reject new pages.

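The 2:1 bound for zbud follows from simple arithmetic: because one zbud page
holds at most two compressed pages, storing N compressed pages takes at least
N/2 pool pages, rounded up.  A sketch in shell arithmetic, where
``zbud_min_pages`` is an illustrative name and not a kernel interface::

	# Minimum number of zbud pool pages needed to hold $1 compressed
	# pages, assuming every zbud page is fully paired (the best case).
	zbud_min_pages() {
		echo $(( ($1 + 1) / 2 ))
	}

For example, ``zbud_min_pages 1001`` prints ``501``.  Every zbud page that
holds only one compressed page pushes the real ratio further below 2:1.
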
When a swap page is passed from frontswap to zswap, zswap maintains a mapping
of the swap entry, a combination of the swap type and swap offset, to the zpool
handle that references that compressed swap page.  This mapping is achieved
with a red-black tree per swap type.  The swap offset is the search key for the
tree nodes.

During a page fault on a PTE that is a swap entry, frontswap calls the zswap
load function to decompress the page into the page allocated by the page fault
handler.

Once there are no PTEs referencing a swap page stored in zswap (i.e. the count
in the swap_map goes to 0) the swap code calls the zswap invalidate function,
via frontswap, to free the compressed entry.

Zswap seeks to be simple in its policies.  Sysfs attributes allow for one
user-controlled policy:

* max_pool_percent - The maximum percentage of memory that the compressed
  pool can occupy.

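To illustrate what the percentage means in absolute terms, the following
sketch converts ``max_pool_percent`` into a size limit.  It assumes the limit
is a plain percentage of total RAM and that ``MemTotal`` from
``/proc/meminfo`` is an acceptable approximation of that total; the helper
name ``zswap_pool_limit_kb`` is hypothetical::

	# Print the pool size limit in kB for a total RAM of $1 kB and a
	# max_pool_percent of $2 (integer arithmetic, rounds down).
	zswap_pool_limit_kb() {
		echo $(( $1 * $2 / 100 ))
	}

For 8388608 kB of RAM and a value of 20, ``zswap_pool_limit_kb 8388608 20``
prints ``1677721``.  On a live system the first argument could be taken from
the ``MemTotal`` line of ``/proc/meminfo`` and the second from
``/sys/module/zswap/parameters/max_pool_percent``.
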
The default compressor is selected by the ``CONFIG_ZSWAP_COMPRESSOR_DEFAULT``
Kconfig option, but it can be overridden at boot time by setting the
``compressor`` attribute, e.g. ``zswap.compressor=lzo``.
It can also be changed at runtime using the sysfs ``compressor``
attribute, e.g.::

	echo lzo > /sys/module/zswap/parameters/compressor

When the zpool and/or compressor parameter is changed at runtime, any existing
compressed pages are not modified; they are left in their own zpool.  When a
request is made for a page in an old zpool, it is uncompressed using its
original compressor.  Once all pages are removed from an old zpool, the zpool
and its compressor are freed.

Some of the pages in zswap are same-value filled pages (i.e. the contents of
the page are a single repeated value or pattern). These include zero-filled
pages, and they are handled differently. During a store operation, each page
is checked for being same-value filled before it is compressed. If it is, the
compressed length of the page is set to zero and only the pattern or
same-filled value is stored.

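The idea behind the check can be approximated from user space on a page-sized
file.  This is only a sketch of the concept, not the kernel code: it assumes
the comparison unit is an 8-byte word, that the file size is a multiple of
8 bytes, and that ``od`` supports ``-t x8`` (GNU coreutils does).  The
function name ``is_same_filled`` is illustrative::

	# Succeed if every 8-byte word in file $1 has the same value.
	# "od -v" prints all words without collapsing repeated lines, so
	# counting the distinct words is enough.
	is_same_filled() {
		n=$(od -An -v -tx8 "$1" | tr -s ' ' '\n' | grep . | sort -u | grep -c .)
		[ "$n" -eq 1 ]
	}

A 4096-byte file of zeros passes the check, while the same file with eight
different bytes at the end does not.
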
The same-value filled page identification feature is enabled by default and
can be disabled at boot time by setting the ``same_filled_pages_enabled``
attribute to 0, e.g. ``zswap.same_filled_pages_enabled=0``. It can also be
enabled and disabled at runtime using the sysfs ``same_filled_pages_enabled``
attribute, e.g.::

	echo 1 > /sys/module/zswap/parameters/same_filled_pages_enabled

When same-filled page identification is disabled at runtime, zswap stops
checking for same-value filled pages during store operations. However, the
existing pages which are marked as same-value filled pages remain stored
unchanged in zswap until they are either loaded or invalidated.

When zswap is full and there is high pressure on swap, shrinking the pool
would only flip pages in and out of the zswap pool, with no real benefit and
a performance drop for the system. To prevent this, a special parameter
implements a form of hysteresis: once the limit has been hit, zswap refuses
to take pages into the pool until it has sufficient space again. To set the
threshold at which zswap starts accepting pages again after it became full,
use the sysfs ``accept_threshold_percent`` attribute, e.g.::

	echo 80 > /sys/module/zswap/parameters/accept_threshold_percent

Setting this parameter to 100 will disable the hysteresis.

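The hysteresis can be pictured as a decision with one bit of memory.  The
simplified model below assumes that, once the pool has hit its limit, pages
are accepted again only after the pool size drops below
``accept_threshold_percent`` of that limit.  The function and its arguments
are illustrative, and sizes are in arbitrary units such as pages::

	# Decide whether a store would be accepted.
	#   $1 current pool size    $2 pool size limit
	#   $3 accept_threshold_percent
	#   $4 1 if the limit was hit earlier and not yet cleared, else 0
	# Prints "accept" or "reject".
	zswap_would_accept() {
		if [ "$1" -ge "$2" ]; then
			echo reject	# pool has reached its limit
		elif [ "$4" -eq 1 ] && [ "$1" -ge $(( $2 * $3 / 100 )) ]; then
			echo reject	# limit was hit; still above threshold
		else
			echo accept
		fi
	}

With a limit of 1000 and a threshold of 80, a pool of 900 that previously hit
the limit is rejected (``zswap_would_accept 900 1000 80 1``), while a pool of
700 is accepted.  With a threshold of 100 the second test can never trigger
below the limit, which matches the statement that 100 disables the
hysteresis.
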
A debugfs interface provides various statistics about pool size, the number
of pages stored, same-value filled pages, and counters for the reasons
pages are rejected.
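
A quick way to inspect these counters is to print every file in the zswap
debugfs directory.  The sketch assumes debugfs is mounted at
``/sys/kernel/debug`` and that the statistics live in a ``zswap``
subdirectory there; ``zswap_stats`` is a hypothetical helper, and reading
debugfs normally requires root::

	# Print "name value" for each statistics file in the given
	# directory (default: /sys/kernel/debug/zswap).
	zswap_stats() {
		dir="${1:-/sys/kernel/debug/zswap}"
		for f in "$dir"/*; do
			if [ -f "$f" ]; then
				echo "${f##*/} $(cat "$f")"
			fi
		done
	}

Passing a different directory as the first argument makes it easy to try the
helper on a copy of the files.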