Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  1) .. _page_owner:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  2) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  3) ============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  4) page owner: Tracking who allocated each page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  5) ============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  7) Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  8) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  9) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) page owner tracks who allocated each page.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) It can be used to debug memory leaks or to find a memory hog.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) When an allocation happens, information about it, such as the call stack
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) and the order of the pages, is stored in dedicated storage for each page.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) When we need to know the status of all pages, we can retrieve and analyze
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) this information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) Although tracepoints for tracing page allocation and freeing already exist,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18) using them to analyze who allocated each page is rather complex. The trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) buffer must be enlarged so that it is not overwritten before the userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) program launches. Once launched, that program continually dumps the trace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) buffer for later analysis, which is more likely to change system behaviour
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22) than simply keeping the information in memory, so it is bad for debugging.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) page owner can also be used for other purposes. For example, accurate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25) fragmentation statistics can be obtained from the gfp flag information of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) each page. This is already implemented and is activated when page owner is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) enabled. Other uses are more than welcome.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) page owner is disabled by default. To use it, add "page_owner=on" to your
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30) boot cmdline. If the kernel is built with page owner but page owner is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) disabled at runtime because the boot option is absent, the runtime
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) overhead is marginal. When disabled at runtime, it requires no memory to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) store owner information, so there is no runtime memory overhead. page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) owner inserts just two unlikely branches into the page allocator hotpath,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) and if it is not enabled, allocation proceeds as in a kernel built
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) without page owner. These two unlikely branches should not affect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) allocation performance, especially if the static keys jump label patching
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38) functionality is available. The following shows the kernel's code size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) change due to this facility.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) - Without page owner::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43)    text    data     bss     dec     hex filename
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44)    48392   2333     644   51369    c8a9 mm/page_alloc.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) - With page owner::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48)    text    data     bss     dec     hex filename
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)    48800   2445     644   51889    cab1 mm/page_alloc.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)    6662     108      29    6799    1a8f mm/page_owner.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)    1025       8       8    1041     411 mm/page_ext.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) Although roughly 8 KB of code is added in total, page_alloc.o increases by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) only 520 bytes, and less than half of that is in the hotpath. Building the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) kernel with page owner and turning it on when needed is a good way to debug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) kernel memory problems.
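
The size figures above can be cross-checked with a little arithmetic. The
following Python sketch simply re-adds the ``dec`` columns from the two
listings. The dictionary and variable names are illustrative and are not part
of any kernel tool.

```python
# Sizes (the "dec" column, in bytes) copied from the listings above.
without_page_owner = {"mm/page_alloc.o": 51369}
with_page_owner = {
    "mm/page_alloc.o": 51889,
    "mm/page_owner.o": 6799,
    "mm/page_ext.o": 1041,
}

# Growth of page_alloc.o alone, which contains the allocator hotpath.
page_alloc_growth = (
    with_page_owner["mm/page_alloc.o"] - without_page_owner["mm/page_alloc.o"]
)

# Total growth across all three object files.
total_growth = sum(with_page_owner.values()) - sum(without_page_owner.values())

print(f"page_alloc.o growth: {page_alloc_growth} bytes")
print(f"total growth: {total_growth} bytes ({total_growth / 1024:.1f} KiB)")
```

The two new object files, page_owner.o and page_ext.o, account for most of the
total. Only the page_alloc.o growth touches the allocator itself.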
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) There is one caveat caused by an implementation detail. page owner
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) stores its information in memory from the struct page extension. On sparse
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) memory systems, this memory is initialized some time after the page
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) allocator starts, so many pages can be allocated before initialization and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) would have no owner information. To fix this, these early allocated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) pages are examined and marked as allocated during the initialization phase.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) Although this doesn't mean that they have the right owner information,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) we can at least tell more accurately whether a page is allocated or not.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) On a 2GB memory x86-64 VM box, 13343 early allocated pages
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) are caught and marked, although they are mostly allocated by the struct
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) page extension feature. After that, no page is left in an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) untracked state.
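
To put the 13343 early allocated pages in perspective, the sketch below
converts the count to a size and to a share of the 2GB box. It assumes the base
x86-64 page size of 4 KiB and reads 2GB as 2 GiB. Neither figure is stated in
the text above, so treat the result as an estimate.

```python
PAGE_SIZE = 4096            # assumption: base x86-64 page size in bytes
TOTAL_MEMORY = 2 * 1024**3  # assumption: "2GB" read as 2 GiB
EARLY_PAGES = 13343         # early allocated pages reported above

early_bytes = EARLY_PAGES * PAGE_SIZE
total_pages = TOTAL_MEMORY // PAGE_SIZE
share = EARLY_PAGES / total_pages

print(f"{early_bytes / 1024**2:.1f} MiB, {share:.2%} of all pages")
```

Under these assumptions the early allocated pages amount to roughly 52 MiB,
a small fraction of total memory.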
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) Usage
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) =====
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) 1) Build user-space helper::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) 	cd tools/vm
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) 	make page_owner_sort
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) 2) Enable page owner: add "page_owner=on" to boot cmdline.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81) 3) Do the job that you want to debug.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) 4) Analyze information from page owner::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) 	cat /sys/kernel/debug/page_owner > page_owner_full.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) 	./page_owner_sort page_owner_full.txt sorted_page_owner.txt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)    See the result, showing who allocated each page,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89)    in ``sorted_page_owner.txt``.
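
As a rough illustration of what this kind of post-processing involves, the
following Python sketch groups the records of a saved dump by identical
allocation record and counts them. It is not the ``page_owner_sort``
implementation. It assumes that records are separated by blank lines, that each
record contains ``order N`` on its first line, and that a line starting with
``PFN`` is page-specific and should be ignored when comparing records. Check
these assumptions against the actual output of your kernel. The function name
``summarize_page_owner`` and the sample text, including its masks and symbol
names, are hypothetical.

```python
import re
from collections import Counter

ORDER_RE = re.compile(r"order (\d+)")


def summarize_page_owner(text):
    """Return (count, pages, record) tuples, most frequent record first."""
    counts = Counter()
    for record in re.split(r"\n\s*\n", text.strip()):
        # Drop the per-page "PFN ..." line so that allocations from the same
        # call path compare equal (assumed record layout, see above).
        lines = [ln.strip() for ln in record.splitlines()
                 if ln.strip() and not ln.strip().startswith("PFN")]
        if lines:
            counts["\n".join(lines)] += 1

    summary = []
    for record, count in counts.most_common():
        match = ORDER_RE.search(record)
        order = int(match.group(1)) if match else 0
        # An order-N allocation covers 2**N contiguous pages.
        summary.append((count, count * (1 << order), record))
    return summary


# Hypothetical sample standing in for page_owner_full.txt.
SAMPLE = """\
Page allocated via order 0, mask 0x100cca
PFN 1000 type Movable
 alloc_pages+0x10/0x20
 do_fault+0x30/0x40

Page allocated via order 0, mask 0x100cca
PFN 1001 type Movable
 alloc_pages+0x10/0x20
 do_fault+0x30/0x40

Page allocated via order 2, mask 0xcc0
PFN 2000 type Unmovable
 alloc_pages+0x10/0x20
 kmalloc_order+0x50/0x60
"""

for count, pages, record in summarize_page_owner(SAMPLE):
    print(f"{count} times, {pages} pages:\n{record}\n")
```

To try it on a real dump, pass the contents of ``page_owner_full.txt`` to
``summarize_page_owner`` instead of ``SAMPLE``.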