^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) ===================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Scalable Vector Extension support for AArch64 Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) ===================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) Author: Dave Martin <Dave.Martin@arm.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) Date: 4 August 2017
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) This document outlines briefly the interface provided to userspace by Linux in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) order to support use of the ARM Scalable Vector Extension (SVE).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12) This is an outline of the most important features and issues only and not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13) intended to be exhaustive.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15) This document does not aim to describe the SVE architecture or programmer's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16) model. To aid understanding, a minimal description of relevant programmer's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) model features for SVE is included in Appendix A.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20) 1. General
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23) * The SVE registers Z0..Z31, P0..P15 and FFR, and the current vector length
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) VL, are tracked per-thread.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26) * The presence of SVE is reported to userspace via HWCAP_SVE in the aux vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) AT_HWCAP entry. Presence of this flag implies the presence of the SVE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28) instructions and registers, and the Linux-specific system interfaces
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) described in this document. SVE is reported in /proc/cpuinfo as "sve".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) * Support for the execution of SVE instructions in userspace can also be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32) detected by reading the CPU ID register ID_AA64PFR0_EL1 using an MRS
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33) instruction, and checking that the value of the SVE field is nonzero. [3]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35) It does not guarantee the presence of the system interfaces described in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) following sections: software that needs to verify that those interfaces are
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37) present must check for HWCAP_SVE instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) * On hardware that supports the SVE2 extensions, HWCAP2_SVE2 will also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) be reported in the AT_HWCAP2 aux vector entry. In addition to this,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) optional extensions to SVE2 may be reported by the presence of:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) HWCAP2_SVE2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) HWCAP2_SVEAES
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) HWCAP2_SVEPMULL
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) HWCAP2_SVEBITPERM
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47) HWCAP2_SVESHA3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) HWCAP2_SVESM4
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50) This list may be extended over time as the SVE architecture evolves.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52) These extensions are also reported via the CPU ID register ID_AA64ZFR0_EL1,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) which userspace can read using an MRS instruction. See elf_hwcaps.txt and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54) cpu-feature-registers.txt for details.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) * Debuggers should restrict themselves to interacting with the target via the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) NT_ARM_SVE regset. The recommended way of detecting support for this regset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) is to connect to a target process first and then attempt a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59) ptrace(PTRACE_GETREGSET, pid, NT_ARM_SVE, &iov).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) * Whenever SVE scalable register values (Zn, Pn, FFR) are exchanged in memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) between userspace and the kernel, the register value is encoded in memory in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) an endianness-invariant layout, with bits [(8 * i + 7) : (8 * i)] encoded at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) byte offset i from the start of the memory representation. This affects, for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) example, the signal frame (struct sve_context) and ptrace interface
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66) (struct user_sve_header) and associated data.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68) Beware that on big-endian systems this results in a different byte order than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) for the FPSIMD V-registers, which are stored as single host-endian 128-bit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70) values, with bits [(127 - 8 * i) : (120 - 8 * i)] of the register encoded at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71) byte offset i (struct fpsimd_context, struct user_fpsimd_state).
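
The HWCAP_SVE check recommended in this section might be sketched as follows.
This is an illustration only: the helper names are hypothetical, and the
fallback value for HWCAP_SVE is an assumption mirroring the arm64 definition
for toolchains whose headers do not provide it.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */

/* Assumption: fallback for headers that do not define HWCAP_SVE; the value
 * is meant to mirror the arm64 definition in <asm/hwcap.h>. */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)
#endif

/* Hypothetical helper: test the SVE bit in a raw AT_HWCAP value. */
static inline bool hwcap_has_sve(unsigned long hwcap)
{
    return (hwcap & HWCAP_SVE) != 0;
}

/* Hypothetical wrapper: true only on AArch64 Linux with HWCAP_SVE set. */
static inline bool sve_interfaces_available(void)
{
#if defined(__aarch64__)
    return hwcap_has_sve(getauxval(AT_HWCAP));
#else
    return false;   /* the same bit has a different meaning elsewhere */
#endif
}
```

A program would typically call sve_interfaces_available() once at startup
and fall back to a non-SVE code path when it returns false.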
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) 2. Vector length terminology
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) -----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) The size of an SVE vector (Z) register is referred to as the "vector length".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) To avoid confusion about the units used to express vector length, the kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) adopts the following conventions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) * Vector length (VL) = size of a Z-register in bytes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) * Vector quadwords (VQ) = size of a Z-register in units of 128 bits
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) (So, VL = 16 * VQ.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88) The VQ convention is used where the underlying granularity is important, such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) as in data structure definitions. In most other situations, the VL convention
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90) is used. This is consistent with the meaning of the "VL" pseudo-register in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) the SVE instruction set architecture.
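
As a small worked illustration of the VL = 16 * VQ relationship, the sketch
below converts between the two units and checks a candidate VL. The ex_*
names and the EX_* limits are local stand-ins chosen for this example (the
limits are assumptions); real code should use sve_vq_from_vl(),
sve_vl_valid() and related definitions from the kernel headers.

```c
#include <stdbool.h>

/* Local stand-ins (assumptions) for limits the kernel headers define. */
enum {
    EX_VQ_BYTES = 16,   /* one quadword = 128 bits = 16 bytes */
    EX_VQ_MIN   = 1,
    EX_VQ_MAX   = 512,
};

/* VL in bytes for a given number of quadwords. */
static inline unsigned int ex_vl_from_vq(unsigned int vq)
{
    return vq * EX_VQ_BYTES;
}

/* Number of quadwords for a given VL in bytes. */
static inline unsigned int ex_vq_from_vl(unsigned int vl)
{
    return vl / EX_VQ_BYTES;
}

/* A VL is well-formed if it is a whole number of quadwords within range. */
static inline bool ex_vl_valid(unsigned int vl)
{
    return vl % EX_VQ_BYTES == 0 &&
           vl >= ex_vl_from_vq(EX_VQ_MIN) &&
           vl <= ex_vl_from_vq(EX_VQ_MAX);
}
```

For example, a 512-bit vector has VQ = 4 and VL = 64.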
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) 3. System call behaviour
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95) -------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97) * On syscall, V0..V31 are preserved (as without SVE). Thus, bits [127:0] of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) Z0..Z31 are preserved. All other bits of Z0..Z31, and all of P0..P15 and FFR
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) become unspecified on return from a syscall.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) * The SVE registers are not used to pass arguments to or receive results from
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) any syscall.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) * In practice the affected registers/bits will be preserved or will be replaced
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) with zeros on return from a syscall, but userspace should not make
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) assumptions about this. The kernel behaviour may vary on a case-by-case
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) basis.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) * All other SVE state of a thread, including the currently configured vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) length, the state of the PR_SVE_VL_INHERIT flag, and the deferred vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) length (if any), is preserved across all syscalls, subject to the specific
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) exceptions for execve() described in section 6.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) In particular, on return from a fork() or clone(), the parent and new child
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) process or thread share identical SVE configuration, matching that of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) parent before the call.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) 4. Signal handling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) -------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) * A new signal frame record sve_context encodes the SVE registers on signal
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) delivery. [1]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) * This record is supplementary to fpsimd_context. The FPSR and FPCR registers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) are only present in fpsimd_context. For convenience, the content of V0..V31
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) is duplicated between sve_context and fpsimd_context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) * The signal frame record for SVE always contains basic metadata, in particular
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) the thread's vector length (in sve_context.vl).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) * The SVE registers may or may not be included in the record, depending on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) whether the registers are live for the thread. The registers are present if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) and only if:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) sve_context.head.size >= SVE_SIG_CONTEXT_SIZE(sve_vq_from_vl(sve_context.vl)).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) * If the registers are present, the remainder of the record has a vl-dependent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) size and layout. Macros SVE_SIG_* are defined [1] to facilitate access to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) the members.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) * Each scalable register (Zn, Pn, FFR) is stored in an endianness-invariant
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) layout, with bits [(8 * i + 7) : (8 * i)] stored at byte offset i from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) start of the register's representation in memory.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) * If the SVE context is too big to fit in sigcontext.__reserved[], then extra
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) space is allocated on the stack, and an extra_context record is written in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) __reserved[] referencing this space. sve_context is then written in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) extra space. Refer to [1] for further details about this mechanism.
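
The endianness-invariant layout used for Zn, Pn and FFR in this record can be
illustrated with the sketch below. It is not kernel code: the function names
are hypothetical, and the register value is assumed to be held as an array of
64-bit lanes with lane 0 containing bits [63:0].

```c
#include <stddef.h>
#include <stdint.h>

/* Store a register so that bits [(8 * i + 7) : (8 * i)] land at byte
 * offset i, whatever the endianness of the host. */
static inline void ex_store_invariant(uint8_t *dst, const uint64_t *lanes,
                                      size_t nlanes)
{
    for (size_t i = 0; i < nlanes * 8; i++)
        dst[i] = (uint8_t)(lanes[i / 8] >> (8 * (i % 8)));
}

/* Inverse operation: rebuild the 64-bit lanes from the byte image. */
static inline void ex_load_invariant(uint64_t *lanes, const uint8_t *src,
                                     size_t nlanes)
{
    for (size_t l = 0; l < nlanes; l++) {
        lanes[l] = 0;
        for (size_t b = 0; b < 8; b++)
            lanes[l] |= (uint64_t)src[8 * l + b] << (8 * b);
    }
}
```

Because the helpers use shifts rather than memcpy(), they produce the same
byte image on little-endian and big-endian hosts.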
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) 5. Signal return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) When returning from a signal handler:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) * If there is no sve_context record in the signal frame, or if the record is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) present but contains no register data as described in the previous section,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) then the SVE registers/bits become non-live and take unspecified values.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) * If sve_context is present in the signal frame and contains full register
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) data, the SVE registers become live and are populated with the specified
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) data. However, for backward compatibility reasons, bits [127:0] of Z0..Z31
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) are always restored from the corresponding members of fpsimd_context.vregs[]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) and not from sve_context. The remaining bits are restored from sve_context.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) * Inclusion of fpsimd_context in the signal frame remains mandatory,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) irrespective of whether sve_context is present or not.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) * The vector length cannot be changed via signal return. If sve_context.vl in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) the signal frame does not match the current vector length, the signal return
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) attempt is treated as illegal, resulting in a forced SIGSEGV.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) 6. prctl extensions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) Some new prctl() calls are added to allow programs to manage the SVE vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) length:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) prctl(PR_SVE_SET_VL, unsigned long arg)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) Sets the vector length of the calling thread and related flags, where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) arg == vl | flags. Other threads of the calling process are unaffected.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) vl is the desired vector length, where sve_vl_valid(vl) must be true.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) flags:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) PR_SVE_VL_INHERIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) Inherit the current vector length across execve(). Otherwise, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) vector length is reset to the system default at execve(). (See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) Section 9.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) PR_SVE_SET_VL_ONEXEC
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) Defer the requested vector length change until the next execve()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) performed by this thread.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) The effect is equivalent to implicit execution of the following
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) call immediately after the next execve() (if any) by the thread:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) prctl(PR_SVE_SET_VL, arg & ~PR_SVE_SET_VL_ONEXEC)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) This allows launching of a new program with a different vector
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) length, while avoiding runtime side effects in the caller.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) Without PR_SVE_SET_VL_ONEXEC, the requested change takes effect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) immediately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) Return value: a nonnegative value on success, or a negative value on error:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) EINVAL: SVE not supported, invalid vector length requested, or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) invalid flags.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) On success:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) * Either the calling thread's vector length or the deferred vector length
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) to be applied at the next execve() by the thread (dependent on whether
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) PR_SVE_SET_VL_ONEXEC is present in arg), is set to the largest value
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) supported by the system that is less than or equal to vl. If vl ==
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) SVE_VL_MAX, the value set will be the largest value supported by the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) * Any previously outstanding deferred vector length change in the calling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) thread is cancelled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) * The returned value describes the resulting configuration, encoded as for
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) PR_SVE_GET_VL. The vector length reported in this value is the new
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) current vector length for this thread if PR_SVE_SET_VL_ONEXEC was not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) present in arg; otherwise, the reported vector length is the deferred
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) vector length that will be applied at the next execve() by the calling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) thread.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) * Changing the vector length causes all of P0..P15, FFR and all bits of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) unspecified. Calling PR_SVE_SET_VL with vl equal to the thread's current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) vector length, or calling PR_SVE_SET_VL with the PR_SVE_SET_VL_ONEXEC
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) flag, does not constitute a change to the vector length for this purpose.
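
A minimal sketch of composing arg == vl | flags and issuing PR_SVE_SET_VL
follows. The ex_* helper names are hypothetical, and the #ifndef fallbacks
are assumptions intended to mirror <linux/prctl.h> for older headers.

```c
#include <sys/prctl.h>

/* Assumptions: fallback values for headers that predate SVE support. */
#ifndef PR_SVE_SET_VL
#define PR_SVE_SET_VL           50
#endif
#ifndef PR_SVE_SET_VL_ONEXEC
#define PR_SVE_SET_VL_ONEXEC    (1 << 18)
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT       (1 << 17)
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK      0xffff
#endif

/* Hypothetical helper: build the prctl argument from vl and two options. */
static inline unsigned long ex_sve_set_vl_arg(unsigned long vl,
                                              int inherit, int onexec)
{
    unsigned long arg = vl;

    if (inherit)
        arg |= PR_SVE_VL_INHERIT;
    if (onexec)
        arg |= PR_SVE_SET_VL_ONEXEC;
    return arg;
}

/* Hypothetical helper: request a VL. Returns the vector length reported by
 * the kernel (current or deferred, depending on onexec), or -1 with errno
 * set, for example to EINVAL when SVE is not supported. */
static inline long ex_sve_request_vl(unsigned long vl, int inherit, int onexec)
{
    long ret = prctl(PR_SVE_SET_VL, ex_sve_set_vl_arg(vl, inherit, onexec));

    if (ret < 0)
        return -1;
    return ret & PR_SVE_VL_LEN_MASK;
}
```

Since the kernel may select a smaller vector length than the one requested,
callers should use the returned value rather than the value they passed in.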
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) prctl(PR_SVE_GET_VL)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) Gets the vector length of the calling thread.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) The following flag may be OR-ed into the result:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) PR_SVE_VL_INHERIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) Vector length will be inherited across execve().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) There is no way to determine whether there is an outstanding deferred
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) vector length change (which would only normally be the case between a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) fork() or vfork() and the corresponding execve() in typical use).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) To extract the vector length from the result, bitwise-AND it with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) PR_SVE_VL_LEN_MASK.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) Return value: a nonnegative value on success, or a negative value on error:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) EINVAL: SVE not supported.
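
Decoding the PR_SVE_GET_VL result might look like the sketch below. The
struct and function names are hypothetical, and the #ifndef fallbacks are
assumptions intended to mirror <linux/prctl.h>.

```c
#include <stdbool.h>
#include <sys/prctl.h>

/* Assumptions: fallback values for headers that predate SVE support. */
#ifndef PR_SVE_GET_VL
#define PR_SVE_GET_VL       51
#endif
#ifndef PR_SVE_VL_INHERIT
#define PR_SVE_VL_INHERIT   (1 << 17)
#endif
#ifndef PR_SVE_VL_LEN_MASK
#define PR_SVE_VL_LEN_MASK  0xffff
#endif

/* Hypothetical container for the decoded configuration. */
struct ex_sve_vl_state {
    unsigned int vl;    /* vector length in bytes */
    bool inherit;       /* PR_SVE_VL_INHERIT is set */
};

/* Split a nonnegative prctl() result into its length and flag parts. */
static inline struct ex_sve_vl_state ex_sve_decode(long ret)
{
    struct ex_sve_vl_state s = {
        .vl = (unsigned int)(ret & PR_SVE_VL_LEN_MASK),
        .inherit = (ret & PR_SVE_VL_INHERIT) != 0,
    };
    return s;
}

/* Query the calling thread. Returns 0 on success, or -1 with errno set. */
static inline int ex_sve_query(struct ex_sve_vl_state *out)
{
    long ret = prctl(PR_SVE_GET_VL);

    if (ret < 0)
        return -1;
    *out = ex_sve_decode(ret);
    return 0;
}
```

The same decoding applies to the value returned by a successful
PR_SVE_SET_VL call, which is encoded in the same way.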
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) 7. ptrace extensions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266) ---------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) * A new regset NT_ARM_SVE is defined for use with PTRACE_GETREGSET and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) PTRACE_SETREGSET.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) Refer to [2] for definitions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) The regset data starts with struct user_sve_header, containing:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) Size of the complete regset, in bytes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) This depends on vl and possibly on other things in the future.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) If a call to PTRACE_GETREGSET requests less data than the value of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) size, the caller can allocate a larger buffer and retry in order to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) read the complete regset.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) max_size
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) Maximum size in bytes that the regset can grow to for the target
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) thread. The regset won't grow bigger than this even if the target
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) thread changes its vector length etc.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) vl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) Target thread's current vector length, in bytes.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) max_vl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) Maximum possible vector length for the target thread.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) either
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) SVE_PT_REGS_FPSIMD
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) SVE registers are not live (GETREGSET) or are to be made
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) non-live (SETREGSET).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) The payload is of type struct user_fpsimd_state, with the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) meaning as for NT_PRFPREG, starting at offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) SVE_PT_FPSIMD_OFFSET from the start of user_sve_header.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) Extra data might be appended in the future: the size of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) payload should be obtained using SVE_PT_FPSIMD_SIZE(vq, flags).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) vq should be obtained using sve_vq_from_vl(vl).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) SVE_PT_REGS_SVE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) SVE registers are live (GETREGSET) or are to be made live
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) (SETREGSET).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) The payload contains the SVE register data, starting at offset
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) SVE_PT_SVE_OFFSET from the start of user_sve_header, and with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) size SVE_PT_SVE_SIZE(vq, flags);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) ... OR-ed with zero or more of the following flags, which have the same
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) meaning and behaviour as the corresponding PR_SVE_* flags:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) SVE_PT_VL_INHERIT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) SVE_PT_VL_ONEXEC (SETREGSET only).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) * The effects of changing the vector length and/or flags are equivalent to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) those documented for PR_SVE_SET_VL.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) The caller must make a further GETREGSET call if it needs to know what VL is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) actually set by SETREGSET, unless it is known in advance that the requested
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) VL is supported.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) * In the SVE_PT_REGS_SVE case, the size and layout of the payload depends on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) the header fields. The SVE_PT_SVE_*() macros are provided to facilitate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) access to the members.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) * In either case, for SETREGSET it is permissible to omit the payload, in which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) case only the vector length and flags are changed (along with any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) consequences of those changes).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) * For SETREGSET, if an SVE_PT_REGS_SVE payload is present and the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) requested VL is not supported, the effect will be the same as if the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) payload were omitted, except that an EIO error is reported. No
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) attempt is made to translate the payload data to the correct layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) for the vector length actually set. The thread's FPSIMD state is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) preserved, but the remaining bits of the SVE registers become
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) unspecified. It is up to the caller to translate the payload layout
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) for the actual VL and retry.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) * The effect of writing a partial, incomplete payload is unspecified.
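
The read-then-retry pattern described for the size field could be sketched
as follows. This is an illustration under stated assumptions: struct
ex_sve_header and the EX_* constants are local stand-ins meant to mirror
struct user_sve_header and the SVE_PT_REGS_* flags from [2], the function
names are hypothetical, and the tracee is assumed to be attached and
stopped already.

```c
#include <elf.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef NT_ARM_SVE
#define NT_ARM_SVE 0x405            /* assumption: value from <elf.h> */
#endif

/* Local stand-in (assumption) for struct user_sve_header. */
struct ex_sve_header {
    uint32_t size;
    uint32_t max_size;
    uint16_t vl;
    uint16_t max_vl;
    uint16_t flags;
    uint16_t reserved;
};

#define EX_SVE_PT_REGS_MASK 0x1     /* assumption: SVE_PT_REGS_MASK */
#define EX_SVE_PT_REGS_SVE  0x1     /* assumption: SVE_PT_REGS_SVE  */

/* True if the payload holds SVE register data, false if FPSIMD data. */
static inline bool ex_payload_is_sve(const struct ex_sve_header *h)
{
    return (h->flags & EX_SVE_PT_REGS_MASK) == EX_SVE_PT_REGS_SVE;
}

/* Read the header first, then retry with a buffer of header.size bytes.
 * Returns a malloc()ed regset image (caller frees), or NULL on failure. */
static inline void *ex_read_sve_regset(pid_t pid, size_t *len_out)
{
    struct ex_sve_header hdr;
    struct iovec iov = { .iov_base = &hdr, .iov_len = sizeof(hdr) };
    void *note = (void *)(uintptr_t)NT_ARM_SVE;
    void *buf;

    if (ptrace(PTRACE_GETREGSET, pid, note, &iov) == -1)
        return NULL;
    if (iov.iov_len < sizeof(hdr) || hdr.size < sizeof(hdr))
        return NULL;

    buf = malloc(hdr.size);
    if (!buf)
        return NULL;
    iov.iov_base = buf;
    iov.iov_len = hdr.size;
    if (ptrace(PTRACE_GETREGSET, pid, note, &iov) == -1) {
        free(buf);
        return NULL;
    }
    *len_out = iov.iov_len;
    return buf;
}
```

A debugger would then inspect the header at the start of the returned buffer
and use the SVE_PT_* macros to locate individual registers in the payload.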
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) 8. ELF coredump extensions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) ---------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) * A NT_ARM_SVE note will be added to each coredump for each thread of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) dumped process. The contents will be equivalent to the data that would have
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) been read if a PTRACE_GETREGSET of NT_ARM_SVE were executed for each thread
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) when the coredump was generated.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) 9. System runtime configuration
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) --------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) * To mitigate the ABI impact of expansion of the signal frame, a policy
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374) mechanism is provided for administrators, distro maintainers and developers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375) to set the default vector length for userspace processes:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377) /proc/sys/abi/sve_default_vector_length
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 379) Writing the text representation of an integer to this file sets the system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 380) default vector length to the specified value, unless the value is greater
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 381) than the maximum vector length supported by the system in which case the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 382) default vector length is set to that maximum.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 383)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 384) The result can be determined by reopening the file and reading its
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 385) contents.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 386)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 387) At boot, the default vector length is initially set to 64 or the maximum
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 388) supported vector length, whichever is smaller. This determines the initial
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 389) vector length of the init process (PID 1).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 390)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 391) Reading this file returns the current system default vector length.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 392)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 393) * At every execve() call, the vector length of the new process is set to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 394) the system default vector length, unless:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 395)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 396) * PR_SVE_VL_INHERIT (or equivalently SVE_PT_VL_INHERIT) is set for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 397) calling thread, or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 398)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 399) * a deferred vector length change is pending, established via the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 400) PR_SVE_SET_VL_ONEXEC flag (or SVE_PT_VL_ONEXEC).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 401)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 402) * Modifying the system default vector length does not affect the vector length
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 403) of any existing process or thread that does not make an execve() call.
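
The rules in this section can be sketched in C as follows. This is only an
illustration of the text above, not kernel code: read_vl_file(),
clamp_default_vl(), struct exec_vl_state and vl_after_exec() are hypothetical
names, vector lengths are taken to be in bytes, and the order in which the two
"unless" conditions are checked is an assumption that the text does not spell
out. Any validation beyond the clamping rule described above is not modelled.

```c
#include <assert.h>
#include <stdio.h>

/* Path quoted in the text above. */
#define SVE_DEFAULT_VL_PATH "/proc/sys/abi/sve_default_vector_length"

/*
 * Read a vector length from a sysctl-style text file.
 * Returns the value read, or -1 if the file cannot be opened or parsed
 * (for example, on a system without SVE support).
 */
static long read_vl_file(const char *path)
{
    FILE *f = fopen(path, "r");
    long vl = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &vl) != 1)
        vl = -1;
    fclose(f);
    return vl;
}

/* Clamping rule from the text: a request above the maximum yields the maximum. */
static long clamp_default_vl(long requested, long max_vl)
{
    assert(max_vl > 0);
    return requested > max_vl ? max_vl : requested;
}

/* Hypothetical per-thread state, for illustration only. */
struct exec_vl_state {
    long current_vl;  /* vector length in use before execve() */
    int inherit;      /* nonzero if PR_SVE_VL_INHERIT is set */
    long onexec_vl;   /* deferred VL (PR_SVE_SET_VL_ONEXEC), 0 if none pending */
};

/* Vector length of the new process image after execve(). */
static long vl_after_exec(const struct exec_vl_state *s, long default_vl)
{
    if (s->onexec_vl)   /* assumption: a pending deferred change is checked first */
        return s->onexec_vl;
    if (s->inherit)
        return s->current_vl;
    return default_vl;
}
```

With these helpers, read_vl_file(SVE_DEFAULT_VL_PATH) returns the current
system default (or -1 if it is unavailable), and clamp_default_vl(64, max_vl)
mirrors the boot-time default described above.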
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 404)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 405)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 406) Appendix A. SVE programmer's model (informative)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 407) =================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 408)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 409) This section provides a minimal description of the additions made by SVE to the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 410) ARMv8-A programmer's model that are relevant to this document.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 411)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 412) Note: This section is for information only and not intended to be complete or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 413) to replace any architectural specification.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 414)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 415) A.1. Registers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 416) ---------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 417)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 418) In A64 state, SVE adds the following:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 419)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 420) * 32 8VL-bit vector registers Z0..Z31
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 421) For each Zn, Zn bits [127:0] alias the ARMv8-A vector register Vn.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 422)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 423) A register write using a Vn register name zeros all bits of the corresponding
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 424) Zn except for bits [127:0].
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 425)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 426) * 16 VL-bit predicate registers P0..P15
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 427)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 428) * 1 VL-bit special-purpose predicate register FFR (the "first-fault register")
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 429)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 430) * a VL "pseudo-register" that determines the size of each vector register
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 431)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 432) The SVE instruction set architecture provides no way to write VL directly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 433) Instead, it can be modified only by EL1 and above, by writing appropriate
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 434) system registers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 435)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 436) * The value of VL can be configured at runtime by EL1 and above:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 437) 16 <= VL <= VLmax, where VL must be a multiple of 16.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 438)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 439) * The maximum vector length is determined by the hardware:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 440) 16 <= VLmax <= 256.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 441)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 442) (The SVE architecture specifies 256, but permits future architecture
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 443) revisions to raise this limit.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 444)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 445) * FPSR and FPCR are retained from ARMv8-A, and interact with SVE floating-point
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 446) operations in much the same way as they interact with ARMv8 floating-point
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 447) operations::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 448)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 449)          8VL-1                      128               0  bit index
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 450)         +----          ////            -----------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 451)      Z0 |                               :       V0      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 452)       :                                          :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 453)      Z7 |                               :       V7      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 454)      Z8 |                               :     * V8      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 455)       :                                       :  :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 456)     Z15 |                               :     *V15      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 457)     Z16 |                               :      V16      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 458)       :                                          :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 459)     Z31 |                               :      V31      |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 460)         +----          ////            -----------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 461)                                                  31    0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 462)          VL-1                  0                +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 463)         +----       ////      --+          FPSR |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 464)      P0 |                       |               +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 465)       : |                       |         *FPCR |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 466)     P15 |                       |               +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 467)         +----       ////      --+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 468)     FFR |                       |               +-----+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 469)         +----       ////      --+            VL |     |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 470)                                                 +-----+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 471)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 472) (*) callee-save:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 473) This only applies to bits [63:0] of Z-/V-registers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 474) FPCR contains callee-save and caller-save bits. See [4] for details.
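
The size relationships listed in this section can be checked with a few lines
of C. This is a minimal sketch under stated assumptions: the macro and function
names are hypothetical (they are not the kernel's UAPI definitions), VL is in
bytes, and sve_state_bytes() counts only the raw Z, P and FFR register bits, not
any signal frame or ptrace layout.

```c
#include <assert.h>
#include <stdbool.h>

#define VL_MIN_BYTES   16   /* smallest vector length */
#define VL_GRANULE     16   /* VL must be a multiple of 16 */
#define VL_ARCH_MAX   256   /* current architectural limit on VLmax */

/* True if vl satisfies 16 <= vl <= vl_max and is a multiple of 16. */
static bool vl_is_valid(unsigned int vl, unsigned int vl_max)
{
    assert(vl_max >= VL_MIN_BYTES && vl_max <= VL_ARCH_MAX);
    return vl >= VL_MIN_BYTES && vl <= vl_max && vl % VL_GRANULE == 0;
}

/* Each Z register is 8*VL bits wide. */
static unsigned int z_reg_bits(unsigned int vl)
{
    return 8 * vl;
}

/* Each P register, and FFR, is VL bits wide. */
static unsigned int p_reg_bits(unsigned int vl)
{
    return vl;
}

/* Raw size in bytes of Z0..Z31, P0..P15 and FFR for a given VL. */
static unsigned int sve_state_bytes(unsigned int vl)
{
    unsigned int z_bytes = z_reg_bits(vl) / 8;
    unsigned int p_bytes = p_reg_bits(vl) / 8;

    return 32 * z_bytes + 16 * p_bytes + p_bytes;
}
```

For the minimum VL of 16 bytes, each Z register is 128 bits wide, which is
exactly the width of the V register it aliases.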
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 475)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 476)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 477) A.2. Procedure call standard
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 478) -----------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 479)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 480) The ARMv8-A base procedure call standard is extended as follows with respect to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 481) the additional SVE register state:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 482)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 483) * All SVE register bits that are not shared with FP/SIMD are caller-save.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 484)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 485) * Z8 bits [63:0] .. Z15 bits [63:0] are callee-save.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 486)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 487) This follows from the way these bits are mapped to V8..V15, whose bits [63:0]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 488) are callee-save in the base procedure call standard.
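
As a small illustration of the two bullets above, the following C predicate
reports whether a given bit of a Z register has to be preserved across a call.
The function name is hypothetical and the sketch covers Z registers only; the
predicate registers and FFR are not shared with FP/SIMD and are therefore
caller-save in full.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * True if bit `bit` of register Z<n> is callee-save, that is, if it lies
 * in bits [63:0] of Z8..Z15. Every other Z register bit is caller-save.
 */
static bool z_bit_is_callee_saved(unsigned int n, unsigned int bit)
{
    assert(n <= 31);    /* only Z0..Z31 exist */
    return n >= 8 && n <= 15 && bit <= 63;
}
```

A caller that needs the upper bits of Z8..Z15 (bit 64 and above) to survive a
call therefore has to save them itself.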
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 489)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 490)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 491) Appendix B. ARMv8-A FP/SIMD programmer's model
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 492) ===============================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 493)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 494) Note: This section is for information only and not intended to be complete or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 495) to replace any architectural specification.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 496)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 497) Refer to [4] for more information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 498)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 499) ARMv8-A defines the following floating-point / SIMD register state:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 500)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 501) * 32 128-bit vector registers V0..V31
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 502) * 2 32-bit status/control registers FPSR, FPCR
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 503)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 504) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 505)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 506)          127           0  bit index
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 507)         +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 508)      V0 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 509)       : :               :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 510)      V7 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 511)    * V8 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 512)    :  : :               :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 513)    *V15 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 514)     V16 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 515)       : :               :
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 516)     V31 |               |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 517)         +---------------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 518)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 519)          31    0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 520)         +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 521)    FPSR |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 522)         +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 523)   *FPCR |       |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 524)         +-------+
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 525)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 526) (*) callee-save:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 527) This only applies to bits [63:0] of V-registers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 528) FPCR contains a mixture of callee-save and caller-save bits.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 529)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 530)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 531) References
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 532) ==========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 533)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 534) [1] arch/arm64/include/uapi/asm/sigcontext.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 535) AArch64 Linux signal ABI definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 536)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 537) [2] arch/arm64/include/uapi/asm/ptrace.h
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 538) AArch64 Linux ptrace ABI definitions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 539)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 540) [3] Documentation/arm64/cpu-feature-registers.rst
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 541)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 542) [4] ARM IHI0055C
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 543) http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 544) http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 545) Procedure Call Standard for the ARM 64-bit Architecture (AArch64)