.. _admin_guide_transhuge:

============================
Transparent Hugepage Support
============================

Objective
=========

Performance critical computing applications dealing with large memory
working sets are already running on top of libhugetlbfs and in turn
hugetlbfs. Transparent HugePage Support (THP) is an alternative means of
backing virtual memory with huge pages that supports the automatic
promotion and demotion of page sizes, without the shortcomings of
hugetlbfs.

Currently THP only works for anonymous memory mappings and tmpfs/shmem.
But in the future it can expand to other filesystems.

.. note::
   in the examples below we presume that the basic page size is 4K and
   the huge page size is 2M, although the actual numbers may vary
   depending on the CPU architecture.

Applications run faster because of two factors. The first factor is
almost completely irrelevant and not of significant interest because it
also has the downside of requiring larger clear-page and copy-page
operations in page faults, which is a potentially negative effect. The
first factor consists in taking a single page fault for each 2M virtual
region touched by userland (so reducing the enter/exit kernel frequency
by a factor of 512, since 2M/4K = 512). This only matters the first time
the memory is accessed for the lifetime of a memory mapping. The second,
long lasting and much more important factor will affect all subsequent
accesses to the memory for the whole runtime of the application. The
second factor consists of two components:

1) the TLB miss will run faster (especially with virtualization using
   nested pagetables but almost always also on bare metal without
   virtualization)

2) a single TLB entry will be mapping a much larger amount of virtual
   memory, in turn reducing the number of TLB misses. With
   virtualization and nested pagetables the TLB can only map the
   larger size if both KVM and the Linux guest are using hugepages,
   but a significant speedup already happens if only one of the two
   is using hugepages, just because the TLB miss is going to run
   faster.

THP can be enabled system wide or restricted to certain tasks or even
memory ranges inside a task's address space. Unless THP is completely
disabled, there is the ``khugepaged`` daemon that scans memory and
collapses sequences of basic pages into huge pages.

The THP behaviour is controlled via the :ref:`sysfs <thp_sysfs>`
interface and the madvise(2) and prctl(2) system calls.
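
For per-process control, prctl(2) can be used to opt a whole process out
of THP regardless of the global policy. A minimal sketch (not taken from
the kernel sources; the fallback defines mirror linux/prctl.h in case the
libc headers are old)::

	#include <stdio.h>
	#include <sys/prctl.h>

	#ifndef PR_SET_THP_DISABLE
	#define PR_SET_THP_DISABLE 41	/* values from linux/prctl.h */
	#endif
	#ifndef PR_GET_THP_DISABLE
	#define PR_GET_THP_DISABLE 42
	#endif

	int main(void)
	{
		/* Disable THP for this process, whatever the sysfs policy says. */
		if (prctl(PR_SET_THP_DISABLE, 1UL, 0UL, 0UL, 0UL))
			perror("PR_SET_THP_DISABLE");

		printf("THP disabled for this process: %d\n",
		       (int)prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL));
		return 0;
	}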

Transparent Hugepage Support maximizes the usefulness of free memory
compared to the reservation approach of hugetlbfs by allowing all
unused memory to be used as cache or other movable (or even unmovable)
entities. It doesn't require reservation to prevent hugepage
allocation failures from being noticeable from userland. It allows
paging and all other advanced VM features to be available on
hugepages. It requires no modifications for applications to take
advantage of it.

Applications however can be further optimized to take advantage of
this feature, like for example they've been optimized before to avoid
a flood of mmap system calls for every malloc(4k). Optimizing userland
is by far not mandatory and khugepaged can already take care of long
lived page allocations even for hugepage unaware applications that
deal with large amounts of memory.

In certain cases when hugepages are enabled system wide, applications
may end up allocating more memory resources. An application may mmap a
large region but only touch 1 byte of it; in that case a 2M page might
be allocated instead of a 4k page for no good reason. This is why it's
possible to disable hugepages system-wide and to only have them inside
MADV_HUGEPAGE madvise regions.

Embedded systems should enable hugepages only inside madvise regions
to eliminate any risk of wasting any precious byte of memory and to
only run faster.

Applications that get a lot of benefit from hugepages and that don't
risk losing memory by using hugepages should use
madvise(MADV_HUGEPAGE) on their critical mmapped regions.
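
A minimal sketch of such a hint on an anonymous mapping (the 64M region
size is just an example)::

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	#define LEN (64UL << 20)	/* 64M of anonymous memory */

	int main(void)
	{
		void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* Hint that this region benefits from huge pages, even when
		   the global policy is "madvise" rather than "always". */
		if (madvise(p, LEN, MADV_HUGEPAGE))
			perror("madvise(MADV_HUGEPAGE)");

		memset(p, 0, LEN);	/* touching the memory triggers the page faults */
		return 0;
	}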

.. _thp_sysfs:

sysfs
=====

Global THP controls
-------------------

Transparent Hugepage Support for anonymous memory can be entirely disabled
(mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
regions (to avoid the risk of consuming more memory resources) or enabled
system wide. This can be achieved with one of::

	echo always >/sys/kernel/mm/transparent_hugepage/enabled
	echo madvise >/sys/kernel/mm/transparent_hugepage/enabled
	echo never >/sys/kernel/mm/transparent_hugepage/enabled
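
Reading the same file back shows all possible values with the currently
selected one in brackets, e.g. ``always [madvise] never``. A small sketch
of checking it programmatically::

	#include <stdio.h>

	int main(void)
	{
		char buf[128];
		FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");

		if (!f || !fgets(buf, sizeof(buf), f)) {
			perror("transparent_hugepage/enabled");
			return 1;
		}
		printf("%s", buf);	/* e.g. "always [madvise] never" */
		fclose(f);
		return 0;
	}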

It's also possible to limit the VM's defrag efforts to generate
anonymous hugepages, when they're not immediately free, to madvise
regions only, or to never try to defrag memory and simply fall back to
regular pages unless hugepages are immediately available. Clearly if we
spend CPU time to defrag memory, we would expect to gain even more by
the fact we use hugepages later instead of regular pages. This isn't
always guaranteed, but it may be more likely in case the allocation is
for a MADV_HUGEPAGE region.

::

	echo always >/sys/kernel/mm/transparent_hugepage/defrag
	echo defer >/sys/kernel/mm/transparent_hugepage/defrag
	echo defer+madvise >/sys/kernel/mm/transparent_hugepage/defrag
	echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
	echo never >/sys/kernel/mm/transparent_hugepage/defrag

always
	means that an application requesting THP will stall on
	allocation failure and directly reclaim pages and compact
	memory in an effort to allocate a THP immediately. This may be
	desirable for virtual machines that benefit heavily from THP
	use and are willing to delay the VM start to utilise them.

defer
	means that an application will wake kswapd in the background
	to reclaim pages and wake kcompactd to compact memory so that
	THP is available in the near future. It's the responsibility
	of khugepaged to then install the THP pages later.

defer+madvise
	will enter direct reclaim and compaction like ``always``, but
	only for regions that have used madvise(MADV_HUGEPAGE); all
	other regions will wake kswapd in the background to reclaim
	pages and wake kcompactd to compact memory so that THP is
	available in the near future.

madvise
	will enter direct reclaim like ``always`` but only for regions
	that have used madvise(MADV_HUGEPAGE). This is the default
	behaviour.

never
	should be self-explanatory.

By default the kernel tries to use the huge zero page on read page
faults to anonymous mappings. It's possible to disable the huge zero
page by writing 0 or enable it back by writing 1::

	echo 0 >/sys/kernel/mm/transparent_hugepage/use_zero_page
	echo 1 >/sys/kernel/mm/transparent_hugepage/use_zero_page

Some userspace (such as a test program, or an optimized memory allocation
library) may want to know the size (in bytes) of a transparent hugepage::

	cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
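
For instance, an allocation library could read it once at start-up; a
minimal sketch (returns 0 if the file is unavailable)::

	#include <stdio.h>

	/* THP size in bytes as reported by sysfs, or 0 on failure. */
	static unsigned long thp_size(void)
	{
		unsigned long sz = 0;
		FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "r");

		if (f) {
			if (fscanf(f, "%lu", &sz) != 1)
				sz = 0;
			fclose(f);
		}
		return sz;
	}

	int main(void)
	{
		printf("THP size: %lu bytes\n", thp_size());
		return 0;
	}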

khugepaged will be automatically started when
transparent_hugepage/enabled is set to "always" or "madvise", and it'll
be automatically shut down if it's set to "never".

Khugepaged controls
-------------------

khugepaged usually runs at low frequency, so while one may not want to
invoke defrag algorithms synchronously during page faults, it should be
worth invoking defrag at least in khugepaged. However, it's also
possible to disable defrag in khugepaged by writing 0 or enable defrag
in khugepaged by writing 1::

	echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag
	echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/defrag

You can also control how many pages khugepaged should scan at each
pass::

	/sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan

and how many milliseconds to wait in khugepaged between each pass (you
can set this to 0 to run khugepaged at 100% utilization of one core)::

	/sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs

and how many milliseconds to wait in khugepaged if there's a hugepage
allocation failure to throttle the next allocation attempt::

	/sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs

The khugepaged progress can be seen in the number of pages collapsed::

	/sys/kernel/mm/transparent_hugepage/khugepaged/pages_collapsed

for each pass::

	/sys/kernel/mm/transparent_hugepage/khugepaged/full_scans

``max_ptes_none`` specifies how many extra small pages (that are
not already mapped) can be allocated when collapsing a group
of small pages into one large page::

	/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none

A higher value leads to programs using additional memory. A lower value
reduces the THP performance gain. The effect of ``max_ptes_none`` on CPU
time is very small and can be ignored.

``max_ptes_swap`` specifies how many pages can be brought in from
swap when collapsing a group of pages into a transparent huge page::

	/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_swap

A higher value can cause excessive swap IO and waste
memory. A lower value can prevent THPs from being
collapsed, resulting in fewer pages being collapsed into
THPs, and lower memory access performance.

``max_ptes_shared`` specifies how many pages can be shared across multiple
processes. Exceeding the number would block the collapse::

	/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared

A higher value may increase memory footprint for some workloads.

Boot parameter
==============

You can change the sysfs boot time defaults of Transparent Hugepage
Support by passing the parameter ``transparent_hugepage=always`` or
``transparent_hugepage=madvise`` or ``transparent_hugepage=never``
to the kernel command line.

Hugepages in tmpfs/shmem
========================

You can control hugepage allocation policy in tmpfs with the mount
option ``huge=``. It can have the following values:

always
    Attempt to allocate huge pages every time we need a new page;

never
    Do not allocate huge pages;

within_size
    Only allocate huge page if it will be fully within i_size.
    Also respect fadvise()/madvise() hints;

advise
    Only allocate huge pages if requested with fadvise()/madvise();

The default policy is ``never``.

``mount -o remount,huge= /mountpoint`` works fine after mount: remounting
``huge=never`` will not attempt to break up huge pages at all, just stop more
from being allocated.
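
From a program, the same ``huge=`` option goes into the data argument of
mount(2). A hedged sketch (the mountpoint ``/mnt/thp-tmp`` and the sizes
are only examples; this needs CAP_SYS_ADMIN and an existing directory)::

	#include <stdio.h>
	#include <sys/mount.h>

	int main(void)
	{
		/* Mount a tmpfs that uses huge pages within i_size. */
		if (mount("tmpfs", "/mnt/thp-tmp", "tmpfs", 0,
			  "huge=within_size,size=1G"))
			perror("mount");

		/* Later, stop further huge page allocations on this mount. */
		if (mount(NULL, "/mnt/thp-tmp", NULL, MS_REMOUNT,
			  "huge=never,size=1G"))
			perror("remount");
		return 0;
	}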

There's also a sysfs knob to control hugepage allocation policy for the
internal shmem mount: /sys/kernel/mm/transparent_hugepage/shmem_enabled.
The mount is used for SysV SHM, memfds, shared anonymous mmaps (of
/dev/zero or MAP_ANONYMOUS), GPU drivers' DRM objects, and Ashmem.

In addition to the policies listed above, shmem_enabled allows two
further values:

deny
    For use in emergencies, to force the huge option off from
    all mounts;
force
    Force the huge option on for all - very useful for testing;

Need of application restart
===========================

The transparent_hugepage/enabled values and tmpfs mount option only affect
future behavior. So to make them effective you need to restart any
application that could have been using hugepages. This also applies to the
regions registered in khugepaged.

Monitoring usage
================

The number of anonymous transparent huge pages currently used by the
system is available by reading the AnonHugePages field in ``/proc/meminfo``.
To identify what applications are using anonymous transparent huge pages,
it is necessary to read ``/proc/PID/smaps`` and count the AnonHugePages fields
for each mapping.
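
A minimal sketch of that accounting for a single process (pass a PID, or
it defaults to ``self``; the values are reported in kB)::

	#include <stdio.h>

	int main(int argc, char **argv)
	{
		char path[64], line[256];
		unsigned long kb, total = 0;
		FILE *f;

		snprintf(path, sizeof(path), "/proc/%s/smaps",
			 argc > 1 ? argv[1] : "self");
		f = fopen(path, "r");
		if (!f) {
			perror(path);
			return 1;
		}
		while (fgets(line, sizeof(line), f))
			if (sscanf(line, "AnonHugePages: %lu kB", &kb) == 1)
				total += kb;	/* sum across all mappings */
		fclose(f);
		printf("AnonHugePages total: %lu kB\n", total);
		return 0;
	}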

The number of file transparent huge pages mapped to userspace is available
by reading ShmemPmdMapped and ShmemHugePages fields in ``/proc/meminfo``.
To identify what applications are mapping file transparent huge pages, it
is necessary to read ``/proc/PID/smaps`` and count the FileHugeMapped fields
for each mapping.

Note that reading the smaps file is expensive and reading it
frequently will incur overhead.

There are a number of counters in ``/proc/vmstat`` that may be used to
monitor how successfully the system is providing huge pages for use.

thp_fault_alloc
	is incremented every time a huge page is successfully
	allocated to handle a page fault.

thp_collapse_alloc
	is incremented by khugepaged when it has found
	a range of pages to collapse into one huge page and has
	successfully allocated a new huge page to store the data.

thp_fault_fallback
	is incremented if a page fault fails to allocate
	a huge page and instead falls back to using small pages.

thp_fault_fallback_charge
	is incremented if a page fault fails to charge a huge page and
	instead falls back to using small pages even though the
	allocation was successful.

thp_collapse_alloc_failed
	is incremented if khugepaged found a range
	of pages that should be collapsed into one huge page but failed
	the allocation.

thp_file_alloc
	is incremented every time a file huge page is successfully
	allocated.

thp_file_fallback
	is incremented if a file huge page is attempted to be allocated
	but fails and instead falls back to using small pages.

thp_file_fallback_charge
	is incremented if a file huge page cannot be charged and instead
	falls back to using small pages even though the allocation was
	successful.

thp_file_mapped
	is incremented every time a file huge page is mapped into
	user address space.

thp_split_page
	is incremented every time a huge page is split into base
	pages. This can happen for a variety of reasons but a common
	reason is that a huge page is old and is being reclaimed.
	This action implies splitting all PMDs the page is mapped with.

thp_split_page_failed
	is incremented if the kernel fails to split a huge
	page. This can happen if the page was pinned by somebody.

thp_deferred_split_page
	is incremented when a huge page is put onto the split
	queue. This happens when a huge page is partially unmapped and
	splitting it would free up some memory. Pages on split queue are
	going to be split under memory pressure.

thp_split_pmd
	is incremented every time a PMD is split into a table of PTEs.
	This can happen, for instance, when an application calls mprotect()
	or munmap() on part of a huge page. It doesn't split the huge page,
	only the page table entry.

thp_zero_page_alloc
	is incremented every time a huge zero page is
	successfully allocated. It includes allocations which were
	dropped due to a race with another allocation. Note, it doesn't
	count every map of the huge zero page, only its allocation.

thp_zero_page_alloc_failed
	is incremented if the kernel fails to allocate a
	huge zero page and falls back to using small pages.

thp_swpout
	is incremented every time a huge page is swapped out in one
	piece without splitting.

thp_swpout_fallback
	is incremented if a huge page has to be split before swapout,
	usually because the kernel failed to allocate some contiguous
	swap space for the huge page.
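
A small sketch that dumps just the ``thp_*`` counters above; running it
periodically and diffing the output gives the rates::

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[128];
		FILE *f = fopen("/proc/vmstat", "r");

		if (!f) {
			perror("/proc/vmstat");
			return 1;
		}
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "thp_", 4))
				fputs(line, stdout);	/* e.g. "thp_fault_alloc 1234" */
		fclose(f);
		return 0;
	}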

As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a
huge page for use. There are some counters in ``/proc/vmstat`` to help
monitor this overhead.

compact_stall
	is incremented every time a process stalls to run
	memory compaction so that a huge page is free for use.

compact_success
	is incremented if the system compacted memory and
	freed a huge page for use.

compact_fail
	is incremented if the system tries to compact memory
	but failed.

compact_pages_moved
	is incremented each time a page is moved. If
	this value is increasing rapidly, it implies that the system
	is copying a lot of data to satisfy the huge page allocation.
	It is possible that the cost of copying exceeds any savings
	from reduced TLB misses.

compact_pagemigrate_failed
	is incremented when the underlying mechanism
	for moving a page failed.

compact_blocks_moved
	is incremented each time memory compaction examines
	a huge page aligned range of pages.

It is possible to establish how long the stalls were using the function
tracer to record how long was spent in __alloc_pages_nodemask and
using the mm_page_alloc tracepoint to identify which allocations were
for huge pages.

Optimizing the applications
===========================

To be guaranteed that the kernel will map a 2M page immediately in any
memory region, the mmap region has to be hugepage naturally
aligned. posix_memalign() can provide that guarantee.
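
A hedged sketch of such an allocation, assuming the 2M huge page size
from the note at the top (a robust allocator would read
``hpage_pmd_size`` instead of hard-coding it)::

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>

	#define HPAGE_SIZE (2UL << 20)	/* assumed 2M huge page size */

	int main(void)
	{
		void *p;
		size_t len = 16 * HPAGE_SIZE;
		/* Hugepage-aligned, so the kernel can back the region with
		   2M pages starting from the very first fault. */
		int err = posix_memalign(&p, HPAGE_SIZE, len);

		if (err) {
			fprintf(stderr, "posix_memalign: %s\n", strerror(err));
			return 1;
		}
		madvise(p, len, MADV_HUGEPAGE);	/* optional extra hint */

		memset(p, 0, len);
		free(p);
		return 0;
	}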

Hugetlbfs
=========

You can use hugetlbfs on a kernel that has transparent hugepage
support enabled just fine as always. No difference can be noted in
hugetlbfs other than there will be less overall fragmentation. All
usual features belonging to hugetlbfs are preserved and
unaffected. libhugetlbfs will also work fine as usual.