Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

==============================
Memory Layout on AArch64 Linux
==============================

Author: Catalin Marinas <catalin.marinas@arm.com>

This document describes the virtual memory layout used by the AArch64
Linux kernel. The architecture allows up to 4 levels of translation
tables with a 4KB page size and up to 3 levels with a 64KB page size.

AArch64 Linux uses either 3 levels or 4 levels of translation tables
with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
(256TB) virtual addresses, respectively, for both user and kernel. With
64KB pages, only 2 levels of translation tables are used, allowing
42-bit (4TB) virtual addresses, but the memory layout is the same.
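
The relationship between page size, VA width and number of levels
quoted above can be reproduced with a little arithmetic. The following
is a minimal sketch, not kernel code: the function names are
illustrative, and it assumes 8-byte descriptors, so that one table
page resolves ``page_shift - 3`` bits of the address.

```c
#include <stdint.h>

/* Bits resolved per table level: a table fills one page with 8-byte
 * descriptors, so it holds 2^(page_shift - 3) entries. */
static unsigned int bits_per_level(unsigned int page_shift)
{
	return page_shift - 3;
}

/* Levels needed to translate va_bits with the given page size. The
 * division rounds up because the top-level table may be only
 * partially used. */
static unsigned int translation_levels(unsigned int va_bits,
				       unsigned int page_shift)
{
	unsigned int bpl = bits_per_level(page_shift);

	return (va_bits - page_shift + bpl - 1) / bpl;
}

/* Size in bytes of a va_bits-wide address space (va_bits < 64). */
static uint64_t va_space_size(unsigned int va_bits)
{
	return UINT64_C(1) << va_bits;
}
```

With ``page_shift`` 12 (4KB) this gives 3 levels for 39 bits and 4
levels for 48 bits; with ``page_shift`` 16 (64KB) it gives 2 levels
for 42 bits and 3 levels for 48 or 52 bits.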

ARMv8.2 adds optional support for Large Virtual Address space. This is
only available when running with a 64KB page size and expands the
number of descriptors in the first level of translation.

User addresses have bits 63:48 set to 0 while the kernel addresses have
the same bits set to 1. TTBRx selection is given by bit 63 of the
virtual address. The swapper_pg_dir contains only kernel (global)
mappings while the user pgd contains only user (non-global) mappings.
The swapper_pg_dir address is written to TTBR1 and never written to
TTBR0.
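
The user/kernel split described above can be expressed as a check on
the top 16 bits of an address. This is a sketch for the 48-bit
configuration only; the enum and function name are made up for
illustration and do not exist in the kernel.

```c
#include <stdint.h>

enum va_kind { VA_USER, VA_KERNEL, VA_INVALID };

/* Classify a 64-bit virtual address for a 48-bit VA configuration:
 * bits 63:48 all zero -> user, all one -> kernel, anything else is
 * outside both ranges. */
static enum va_kind classify_va48(uint64_t va)
{
	uint64_t top = va >> 48;

	if (top == 0x0000)
		return VA_USER;
	if (top == 0xffff)
		return VA_KERNEL;
	return VA_INVALID;
}
```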


AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::

  Start			End			Size		Use
  -----------------------------------------------------------------------
  0000000000000000	0000ffffffffffff	 256TB		user
  ffff000000000000	ffff7fffffffffff	 128TB		kernel logical memory map
  ffff800000000000	ffff9fffffffffff	  32TB		kasan shadow region
  ffffa00000000000	ffffa00007ffffff	 128MB		bpf jit region
  ffffa00008000000	ffffa0000fffffff	 128MB		modules
  ffffa00010000000	fffffdffbffeffff	 ~93TB		vmalloc
  fffffdffbfff0000	fffffdfffe5f8fff	~998MB		[guard region]
  fffffdfffe5f9000	fffffdfffe9fffff	4124KB		fixed mappings
  fffffdfffea00000	fffffdfffebfffff	   2MB		[guard region]
  fffffdfffec00000	fffffdffffbfffff	  16MB		PCI I/O space
  fffffdffffc00000	fffffdffffdfffff	   2MB		[guard region]
  fffffdffffe00000	ffffffffffdfffff	   2TB		vmemmap
  ffffffffffe00000	ffffffffffffffff	   2MB		[guard region]


AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support)::

  Start			End			Size		Use
  -----------------------------------------------------------------------
  0000000000000000	000fffffffffffff	   4PB		user
  fff0000000000000	fff7ffffffffffff	   2PB		kernel logical memory map
  fff8000000000000	fffd9fffffffffff	1440TB		[gap]
  fffda00000000000	ffff9fffffffffff	 512TB		kasan shadow region
  ffffa00000000000	ffffa00007ffffff	 128MB		bpf jit region
  ffffa00008000000	ffffa0000fffffff	 128MB		modules
  ffffa00010000000	fffff81ffffeffff	 ~88TB		vmalloc
  fffff81fffff0000	fffffc1ffe58ffff	  ~3TB		[guard region]
  fffffc1ffe590000	fffffc1ffe9fffff	4544KB		fixed mappings
  fffffc1ffea00000	fffffc1ffebfffff	   2MB		[guard region]
  fffffc1ffec00000	fffffc1fffbfffff	  16MB		PCI I/O space
  fffffc1fffc00000	fffffc1fffdfffff	   2MB		[guard region]
  fffffc1fffe00000	ffffffffffdfffff	3968GB		vmemmap
  ffffffffffe00000	ffffffffffffffff	   2MB		[guard region]


Translation table lookup with 4KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |         |         |         |         |
   |                 |         |         |         |         v
   |                 |         |         |         |   [11:0]  in-page offset
   |                 |         |         |         +-> [20:12] L3 index
   |                 |         |         +-----------> [29:21] L2 index
   |                 |         +---------------------> [38:30] L1 index
   |                 +-------------------------------> [47:39] L0 index
   +-------------------------------------------------> [63] TTBR0/1
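
The bit fields in the 4KB diagram above can be extracted with shifts
and masks. This is a sketch for illustration only; the struct and
function names are hypothetical, and the hardware table walk is of
course not implemented in C.

```c
#include <stdint.h>

struct va_fields_4k {
	unsigned int l0, l1, l2, l3;	/* 9-bit table indices */
	unsigned int offset;		/* 12-bit in-page offset */
	unsigned int ttbr;		/* 0 -> TTBR0, 1 -> TTBR1 */
};

/* Split a virtual address into the fields shown in the 4KB diagram. */
static struct va_fields_4k split_va_4k(uint64_t va)
{
	struct va_fields_4k f;

	f.offset = va & 0xfff;		/* [11:0]  */
	f.l3 = (va >> 12) & 0x1ff;	/* [20:12] */
	f.l2 = (va >> 21) & 0x1ff;	/* [29:21] */
	f.l1 = (va >> 30) & 0x1ff;	/* [38:30] */
	f.l0 = (va >> 39) & 0x1ff;	/* [47:39] */
	f.ttbr = va >> 63;		/* [63]    */
	return f;
}
```

For example, an address 0x1234 bytes into the modules region of the
48-bit layout, ``0xffffa00008001234``, splits into L0 index 320, L1
index 0, L2 index 64, L3 index 1 and offset 0x234, and selects TTBR1.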


Translation table lookup with 64KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |    |               |              |
   |                 |    |               |              v
   |                 |    |               |            [15:0]  in-page offset
   |                 |    |               +----------> [28:16] L3 index
   |                 |    +--------------------------> [41:29] L2 index
   |                 +-------------------------------> [47:42] L1 index (48-bit)
   |                                                   [51:42] L1 index (52-bit)
   +-------------------------------------------------> [63] TTBR0/1
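
The 64KB case differs from the 4KB one in the field widths and in the
variable width of the L1 index. The sketch below takes the VA width as
a parameter to show how only the L1 mask changes between 48-bit and
52-bit; the names are hypothetical and ``va_bits`` is assumed to be
either 48 or 52.

```c
#include <stdint.h>

struct va_fields_64k {
	unsigned int l1, l2, l3;	/* table indices */
	unsigned int offset;		/* 16-bit in-page offset */
	unsigned int ttbr;		/* 0 -> TTBR0, 1 -> TTBR1 */
};

/* Split a virtual address into the fields shown in the 64KB diagram.
 * va_bits must be 48 or 52. */
static struct va_fields_64k split_va_64k(uint64_t va, unsigned int va_bits)
{
	struct va_fields_64k f;
	unsigned int l1_bits = va_bits - 42;	/* 6 or 10 bits */

	f.offset = va & 0xffff;			/* [15:0]  */
	f.l3 = (va >> 16) & 0x1fff;		/* [28:16] */
	f.l2 = (va >> 29) & 0x1fff;		/* [41:29] */
	f.l1 = (va >> 42) & ((1u << l1_bits) - 1);
	f.ttbr = va >> 63;			/* [63]    */
	return f;
}
```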


When using KVM without the Virtualization Host Extensions, the
hypervisor maps kernel pages in EL2 at a fixed (and potentially
random) offset from the linear mapping. See the kern_hyp_va macro and
kvm_update_va_mask function for more details. MMIO devices such as
GICv2 get mapped next to the HYP idmap page, as do vectors when
ARM64_SPECTRE_V3A is enabled for particular CPUs.

When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.

52-bit VA support in the kernel
-------------------------------
If the ARMv8.2-LVA optional feature is present, and we are running
with a 64KB page size, then it is possible to use 52 bits of address
space for both userspace and kernel addresses. However, any kernel
binary that supports 52-bit must also be able to fall back to 48-bit
at early boot time if the hardware feature is not present.

This fallback mechanism requires the kernel .text to be in the
higher addresses so that its addresses are invariant to 48/52-bit VAs.
Because the kasan shadow is a fraction of the entire kernel VA space,
the end of the kasan shadow must also be in the higher half of the
kernel VA space for both 48/52-bit. (When switching from 48-bit to
52-bit, the end of the kasan shadow is invariant and dependent on
~0UL, whilst the start address will "grow" towards the lower
addresses.)
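
The way the shadow start moves while its end stays put can be checked
against the two layout tables. The sketch below assumes the shadow
covers one eighth of the full VA width, which is what the tables imply
(32TB for 48-bit, 512TB for 52-bit); the macro and function names are
made up for this example.

```c
#include <stdint.h>

/* One past the last kasan shadow address; identical in both tables. */
#define EXAMPLE_SHADOW_END	UINT64_C(0xffffa00000000000)

/* Assumption inferred from the tables: 1 shadow byte per 8 bytes. */
#define EXAMPLE_SHADOW_SCALE_SHIFT	3

/* Shadow size in bytes for a given VA width. */
static uint64_t example_shadow_size(unsigned int va_bits)
{
	return UINT64_C(1) << (va_bits - EXAMPLE_SHADOW_SCALE_SHIFT);
}

/* Shadow start: fixed end minus a size that grows with the VA width. */
static uint64_t example_shadow_start(unsigned int va_bits)
{
	return EXAMPLE_SHADOW_END - example_shadow_size(va_bits);
}
```

For 48 bits this yields ``0xffff800000000000`` and for 52 bits
``0xfffda00000000000``, matching the start of the kasan shadow region
in the respective tables.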

In order to optimise phys_to_virt and virt_to_phys, the PAGE_OFFSET
is kept constant at 0xFFF0000000000000 (corresponding to 52-bit);
this obviates the need for an extra variable read. The physvirt
offset and vmemmap offsets are computed at early boot to enable
this logic.
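
To illustrate why a constant PAGE_OFFSET helps, the sketch below models
the linear-map conversion as a single addition of an offset computed
once. It is a simplified user-space model with hypothetical names, not
the kernel's implementation, and it ignores the adjustment needed when
a 52-bit kernel falls back to 48-bit. All arithmetic is modulo 2^64.

```c
#include <stdint.h>

#define EXAMPLE_VA_BITS		52
/* -(1 << 52) modulo 2^64, i.e. 0xfff0000000000000 */
#define EXAMPLE_PAGE_OFFSET	(-(UINT64_C(1) << EXAMPLE_VA_BITS))

/* Hypothetical stand-in for an offset computed once at early boot. */
static uint64_t example_physvirt_offset;

/* Record the physical address that the start of the linear map maps. */
static void example_set_phys_base(uint64_t phys_base)
{
	example_physvirt_offset = phys_base - EXAMPLE_PAGE_OFFSET;
}

/* Linear-map virtual address to physical address. */
static uint64_t example_virt_to_phys(uint64_t va)
{
	return va + example_physvirt_offset;
}

/* Physical address to linear-map virtual address. */
static uint64_t example_phys_to_virt(uint64_t pa)
{
	return pa - example_physvirt_offset;
}
```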

As a single binary will need to support both 48-bit and 52-bit VA
spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
also must be sized large enough to accommodate a fixed PAGE_OFFSET.

Most code in the kernel should not need to consider the VA_BITS. For
code that does need to know the VA size, the variables are
defined as follows:

VA_BITS		constant	the *maximum* VA space size

VA_BITS_MIN	constant	the *minimum* VA space size

vabits_actual	variable	the *actual* VA space size


Maximum and minimum sizes can be useful to ensure that buffers are
sized large enough or that addresses are positioned close enough for
the "worst" case.

52-bit userspace VAs
--------------------
To maintain compatibility with software that relies on the ARMv8.0
VA space maximum size of 48 bits, the kernel will, by default,
return virtual addresses to userspace from a 48-bit range.

Software can opt in to receiving VAs from a 52-bit space by
specifying an mmap hint parameter that is larger than 48 bits.

For example:

.. code-block:: c

   maybe_high_address = mmap(~0UL, size, prot, flags,...);
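
A fuller version of the hint-based opt-in might look like the sketch
below. The helper names are invented for this example, and the choice
of an anonymous private read/write mapping is an assumption made to
keep it self-contained. Whether the returned address actually lies
above 48 bits depends on the hardware and kernel configuration, so the
result has to be checked rather than assumed.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Request an anonymous private mapping with an all-ones address hint.
 * Without MAP_FIXED the address is only a hint, so the kernel remains
 * free to place the mapping anywhere. Returns MAP_FAILED on error. */
static void *map_with_high_hint(size_t size)
{
	return mmap((void *)~0UL, size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Nonzero if the address uses any bit at or above bit 48. */
static int is_above_48bit(const void *p)
{
	return ((uintptr_t)p >> 48) != 0;
}
```

A caller would compare the result against ``MAP_FAILED``, optionally
call ``is_above_48bit()`` on it, and release it with ``munmap()``.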

It is also possible to build a debug kernel that returns addresses
from a 52-bit space by enabling the following kernel config options:

.. code-block:: sh

   CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y

Note that this option is only intended for debugging applications
and should not be used in production.