Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) ===================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) Light-weight System Calls for IA-64
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ===================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) 		        Started: 13-Jan-2003
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) 		    Last update: 27-Sep-2003
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 	              David Mosberger-Tang
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 		      <davidm@hpl.hp.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) Using the "epc" instruction effectively introduces a new mode of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) execution to the ia64 linux kernel.  We call this mode the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) "fsys-mode".  To recap, the normal states of execution are:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16)   - kernel mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) 	Both the register stack and the memory stack have been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) 	switched over to kernel memory.  The user-level state is saved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 	in a pt_regs structure at the top of the kernel memory stack.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21)   - user mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 	Both the register stack and the memory stack are in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) 	user memory.  The user-level state is contained in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 	CPU registers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26)   - bank 0 interruption-handling mode:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) 	This is the non-interruptible state which all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 	interruption-handlers start execution in.  The user-level
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 	state remains in the CPU registers and some kernel state may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 	be stored in bank 0 of registers r16-r31.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) In contrast, fsys-mode has the following special properties:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34)   - execution is at privilege level 0 (most-privileged)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36)   - CPU registers may contain a mixture of user-level and kernel-level
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37)     state (it is the responsibility of the kernel to ensure that no
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38)     security-sensitive kernel-level state is leaked back to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39)     user-level)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41)   - execution is interruptible and preemptible (an fsys-mode handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42)     can disable interrupts and avoid all other interruption-sources
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43)     to avoid preemption)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45)   - neither the memory-stack nor the register-stack can be trusted while
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46)     in fsys-mode (they point to the user-level stacks, which may
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47)     be invalid, or completely bogus addresses)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) In summary, fsys-mode is much more similar to running in user-mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) than it is to running in kernel-mode.  Of course, given that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) privilege level is at level 0, this means that fsys-mode requires some
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) care (see below).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) How to tell fsys-mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) =====================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) Linux operates in fsys-mode when (a) the privilege level is 0 (most
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) privileged) and (b) the stacks have NOT been switched to kernel memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) yet.  For convenience, the header file <asm-ia64/ptrace.h> provides
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) three macros::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) 	user_mode(regs)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) 	user_stack(task,regs)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 	fsys_mode(task,regs)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) The "regs" argument is a pointer to a pt_regs structure.  The "task"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) argument is a pointer to the task structure to which the "regs"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) pointer belongs.  user_mode() returns TRUE if the CPU state pointed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) to by "regs" was executing in user mode (privilege level 3).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) user_stack() returns TRUE if the state pointed to by "regs" was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) executing on the user-level stack(s).  Finally, fsys_mode() returns
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) TRUE if the CPU state pointed to by "regs" was executing in fsys-mode.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) The fsys_mode() macro is equivalent to the expression::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) 	!user_mode(regs) && user_stack(task,regs)
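
The relationship between the three predicates can be modelled in plain C.
This is only a minimal sketch, not the kernel's implementation: struct
model_regs and its fields "cpl" and "on_user_stack" are hypothetical
stand-ins for the information the real macros derive from the pt_regs and
task structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the saved CPU state. */
struct model_regs {
	unsigned int cpl;	/* privilege level: 0 = most privileged, 3 = user */
	bool on_user_stack;	/* stacks not yet switched to kernel memory */
};

/* User mode means privilege level 3. */
static bool model_user_mode(const struct model_regs *regs)
{
	return regs->cpl == 3;
}

/* True while execution is still on the user-level stack(s). */
static bool model_user_stack(const struct model_regs *regs)
{
	return regs->on_user_stack;
}

/* Mirrors the expression !user_mode(regs) && user_stack(task,regs). */
static bool model_fsys_mode(const struct model_regs *regs)
{
	return !model_user_mode(regs) && model_user_stack(regs);
}
```

In this model, only the combination "privilege level 0 on user stacks"
counts as fsys-mode; plain user mode and full kernel mode both yield false.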
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) How to write an fsyscall handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) ================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) The file arch/ia64/kernel/fsys.S contains a table of fsyscall-handlers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) (fsyscall_table).  This table contains one entry for each system call.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) By default, a system call is handled by fsys_fallback_syscall().  This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) routine takes care of entering (full) kernel mode and calling the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) normal Linux system call handler.  For performance-critical system
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) calls, it is possible to write a hand-tuned fsyscall_handler.  For
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) example, fsys.S contains fsys_getpid(), which is a hand-tuned version
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) of the getpid() system call.
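
The idea of a table that defaults to the fallback routine and overrides
selected entries can be sketched in C.  The real table lives in assembly;
every name below (the model_ prefix, the table size, the slot number, and
the marker return values) is hypothetical and chosen purely for
illustration.

```c
#include <stddef.h>

#define MODEL_NR_SYSCALLS 8	/* hypothetical table size */
#define MODEL_NR_GETPID   3	/* hypothetical slot for getpid */

typedef long (*model_handler_t)(long arg0);

static long model_current_pid = 42;	/* stand-in for the current task's pid */

/* Slow path: stands in for entering full kernel mode. */
static long model_fallback_syscall(long arg0)
{
	(void)arg0;
	return -1;	/* marker value: heavy-weight path was taken */
}

/* Hand-tuned fast path for one system call. */
static long model_fsys_getpid(long arg0)
{
	(void)arg0;
	return model_current_pid;
}

static model_handler_t model_fsyscall_table[MODEL_NR_SYSCALLS];

/* Every entry defaults to the fallback; tuned handlers override it. */
static void model_init_table(void)
{
	for (size_t i = 0; i < MODEL_NR_SYSCALLS; i++)
		model_fsyscall_table[i] = model_fallback_syscall;
	model_fsyscall_table[MODEL_NR_GETPID] = model_fsys_getpid;
}

static long model_dispatch(unsigned long nr, long arg0)
{
	if (nr >= MODEL_NR_SYSCALLS)
		return -2;	/* marker value: invalid system call number */
	return model_fsyscall_table[nr](arg0);
}
```

After model_init_table(), dispatching MODEL_NR_GETPID takes the fast path,
while any other valid number lands in the fallback routine.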
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) The entry and exit-state of an fsyscall handler is as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) Machine state on entry to fsyscall handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) ------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95)   ========= ===============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96)   r10	    0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97)   r11	    saved ar.pfs (a user-level value)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98)   r15	    system call number
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99)   r16	    "current" task pointer (in normal kernel-mode, this is in r13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100)   r32-r39   system call arguments
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101)   b6	    return address (a user-level value)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102)   ar.pfs    previous frame-state (a user-level value)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103)   PSR.be    cleared to zero (i.e., little-endian byte order is in effect)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104)   -         all other registers may contain values passed in from user-mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105)   ========= ===============================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) Required machine state on exit from fsyscall handler
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) ----------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110)   ========= ===========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)   r11	    saved ar.pfs (as passed into the fsyscall handler)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112)   r15	    system call number (as passed into the fsyscall handler)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)   r32-r39   system call arguments (as passed into the fsyscall handler)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)   b6	    return address (as passed into the fsyscall handler)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115)   ar.pfs    previous frame-state (as passed into the fsyscall handler)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)   ========= ===========================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) Fsyscall handlers can execute with very little overhead, but with that
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) speed comes a set of restrictions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121)  * Fsyscall-handlers MUST check for any pending work in the flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)    member of the thread-info structure and if any of the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123)    TIF_ALLWORK_MASK flags are set, the handler needs to fall back on
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124)    doing a full system call (by calling fsys_fallback_syscall).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)  * Fsyscall-handlers MUST preserve incoming arguments (r32-r39, r11,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)    r15, b6, and ar.pfs) because they will be needed in case of a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128)    system call restart.  Of course, all "preserved" registers also
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129)    must be preserved, in accordance with the normal calling conventions.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)  * Fsyscall-handlers MUST check whether argument registers contain a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)    NaT value before using them in any way that could trigger a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133)    NaT-consumption fault.  If a system call argument is found to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)    contain a NaT value, an fsyscall-handler may return immediately
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135)    with r8=EINVAL, r10=-1.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)  * Fsyscall-handlers MUST NOT use the "alloc" instruction or perform
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)    any other operation that would trigger mandatory RSE
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)    (register-stack engine) traffic.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)  * Fsyscall-handlers MUST NOT write to any stacked registers because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)    it is not safe to assume that user-level called a handler with the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)    proper number of arguments.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)  * Fsyscall-handlers need to be careful when accessing per-CPU variables:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)    unless proper safe-guards are taken (e.g., interruptions are avoided),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)    execution may be pre-empted and resumed on another CPU at any given
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)    time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)  * Fsyscall-handlers must be careful not to leak sensitive kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)    information back to user-level.  In particular, before returning to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152)    user-level, care needs to be taken to clear any scratch registers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)    that could contain sensitive information (note that the current
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)    task pointer is not considered sensitive: it's already exposed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)    through ar.k6).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)  * Fsyscall-handlers MUST NOT access user-memory without first
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158)    validating access-permission (this can be done typically via
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159)    probe.r.fault and/or probe.w.fault) and without guarding against
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)    memory access exceptions (this can be done with the EX() macros
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)    defined by asmmacro.h).
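
The control flow implied by the first three rules can be sketched in C.
Real handlers are written in assembly; the types, the mask value, and the
placeholder computation below are hypothetical and serve only to show the
order of the checks and the r8/r10 return convention.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TIF_ALLWORK_MASK 0x0fUL	/* hypothetical bit layout */

struct model_arg {
	long value;
	bool nat;	/* models the NaT bit of an argument register */
};

struct model_result {
	long r8;	/* return value or error code */
	long r10;	/* 0 on success, -1 on error */
	bool fell_back;	/* true if the full system call path was taken */
};

static struct model_result model_handler(unsigned long thread_flags,
					 struct model_arg arg)
{
	struct model_result res = { 0, 0, false };

	/* Pending work: defer to the full system call path. */
	if (thread_flags & MODEL_TIF_ALLWORK_MASK) {
		res.fell_back = true;
		return res;
	}
	/* NaT argument: fail before the value is consumed. */
	if (arg.nat) {
		res.r8 = EINVAL;
		res.r10 = -1;
		return res;
	}
	/* Fast path: the argument is only read, never modified. */
	res.r8 = arg.value + 1;	/* placeholder computation */
	return res;
}
```

Because the argument is passed by value and never written back, the model
also reflects the rule that incoming arguments must survive for a possible
system call restart.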
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) The above restrictions may seem draconian, but remember that it's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) possible to trade off some of the restrictions by paying a slightly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) higher overhead.  For example, if an fsyscall-handler could benefit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166) from the shadow register bank, it could temporarily disable PSR.i and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) PSR.ic, switch to bank 0 (bsw.0) and then use the shadow registers as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168) needed.  In other words, following the above rules yields extremely
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169) fast system call execution (while fully preserving system call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) semantics), but there is also a lot of flexibility in handling more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) complicated cases.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) Signal handling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) ===============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) The delivery of (asynchronous) signals must be delayed until fsys-mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) is exited.  This is accomplished with the help of the lower-privilege
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) transfer trap: arch/ia64/kernel/process.c:do_notify_resume_user()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) checks whether the interrupted task was in fsys-mode and, if so, sets
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) PSR.lp and returns immediately.  When fsys-mode is exited via the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) "br.ret" instruction that lowers the privilege level, a trap will
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) occur.  The trap handler clears PSR.lp again and returns immediately.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) The kernel exit path then checks for and delivers any pending signals.
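
The deferral sequence can be followed in a small C state model.  This is a
sketch under simplifying assumptions: struct model_task and both functions
are hypothetical, and the trap and the kernel exit path are collapsed into
ordinary function calls.

```c
#include <stdbool.h>

struct model_task {
	bool in_fsys_mode;
	bool psr_lp;		/* models PSR.lp */
	bool signal_pending;
	int delivered;		/* number of signals delivered so far */
};

/* Models the check described for do_notify_resume_user(). */
static void model_notify_resume(struct model_task *t)
{
	if (t->in_fsys_mode) {
		t->psr_lp = true;	/* arm the lower-privilege transfer trap */
		return;			/* do not deliver yet */
	}
	if (t->signal_pending) {
		t->signal_pending = false;
		t->delivered++;
	}
}

/* Models the "br.ret" that leaves fsys-mode. */
static void model_exit_fsys(struct model_task *t)
{
	t->in_fsys_mode = false;
	if (t->psr_lp) {
		t->psr_lp = false;	/* trap handler clears PSR.lp */
		model_notify_resume(t);	/* exit path delivers pending signals */
	}
}
```

A signal that arrives while in_fsys_mode is set stays pending until
model_exit_fsys() runs, which matches the ordering described above.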
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) PSR Handling
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) ============
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) The "epc" instruction doesn't change the contents of PSR at all.  This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) is in contrast to a regular interruption, which clears almost all
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) bits.  Because of that, some care needs to be taken to ensure things
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) work as expected.  The following discussion describes how each PSR bit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) is handled.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) ======= =======================================================================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) PSR.be	Cleared when entering fsys-mode.  A srlz.d instruction is used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) 	to ensure the CPU is in little-endian mode before the first
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) 	load/store instruction is executed.  PSR.be is normally NOT
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) 	restored upon return from an fsys-mode handler.  In other
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199) 	words, user-level code must not rely on PSR.be being preserved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) 	across a system call.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) PSR.up	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) PSR.ac	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203) PSR.mfl Unchanged.  Note: fsys-mode handlers must not write FP registers!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204) PSR.mfh	Unchanged.  Note: fsys-mode handlers must not write FP registers!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) PSR.ic	Unchanged.  Note: fsys-mode handlers can clear the bit, if needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) PSR.i	Unchanged.  Note: fsys-mode handlers can clear the bit, if needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) PSR.pk	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) PSR.dt	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) PSR.dfl	Unchanged.  Note: fsys-mode handlers must not write FP registers!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210) PSR.dfh	Unchanged.  Note: fsys-mode handlers must not write FP registers!
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211) PSR.sp	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212) PSR.pp	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213) PSR.di	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214) PSR.si	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) PSR.db	Unchanged.  The kernel prevents user-level from setting a hardware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216) 	breakpoint that triggers at any privilege level other than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) 	3 (user-mode).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218) PSR.lp	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219) PSR.tb	Lazy redirect.  If a taken-branch trap occurs while in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220) 	fsys-mode, the trap-handler modifies the saved machine state
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221) 	such that execution resumes in the gate page at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) 	syscall_via_break(), with privilege level 3.  Note: the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223) 	taken branch would occur on the branch invoking the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224) 	fsyscall-handler, at which point, by definition, a syscall
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) 	restart is still safe.  If the system call number is invalid,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226) 	the fsys-mode handler will return directly to user-level.  This
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) 	return will trigger a taken-branch trap, but since the trap is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228) 	taken _after_ restoring the privilege level, the CPU has already
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229) 	left fsys-mode, so no special treatment is needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230) PSR.rt	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231) PSR.cpl	Cleared to 0.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232) PSR.is	Unchanged (guaranteed to be 0 on entry to the gate page).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233) PSR.mc	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234) PSR.it	Unchanged (guaranteed to be 1).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235) PSR.id	Unchanged.  Note: the ia64 linux kernel never sets this bit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236) PSR.da	Unchanged.  Note: the ia64 linux kernel never sets this bit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) PSR.dd	Unchanged.  Note: the ia64 linux kernel never sets this bit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) PSR.ss	Lazy redirect.  If set, "epc" will cause a Single Step Trap to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) 	be taken.  The trap handler then modifies the saved machine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) 	state such that execution resumes in the gate page at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) 	syscall_via_break(), with privilege level 3.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) PSR.ri	Unchanged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) PSR.ed	Unchanged.  Note: This bit could only have an effect if an fsys-mode
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) 	handler performed a speculative load that gets NaTted.  If so, this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 	would be the normal & expected behavior, so no special treatment is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 	needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) PSR.bn	Unchanged.  Note: fsys-mode handlers may clear the bit, if needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) 	Doing so requires clearing PSR.i and PSR.ic as well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) PSR.ia	Unchanged.  Note: the ia64 linux kernel never sets this bit.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) ======= =======================================================================
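
Only two rows in the table describe a value that differs from what user
level had: PSR.be and PSR.cpl.  The following C sketch expresses that as a
bit-mask operation.  The bit positions (be at bit 1, cpl at bits 32-33) are
taken from the IA-64 architecture definition rather than from this
document, so treat them as assumptions; the function names are
hypothetical.

```c
#include <stdint.h>

/* Assumed bit positions; not stated in this document. */
#define MODEL_PSR_BE        (UINT64_C(1) << 1)
#define MODEL_PSR_CPL_SHIFT 32
#define MODEL_PSR_CPL_MASK  (UINT64_C(3) << MODEL_PSR_CPL_SHIFT)

/* Extract the current privilege level from a PSR value. */
static unsigned int model_psr_cpl(uint64_t psr)
{
	return (unsigned int)((psr & MODEL_PSR_CPL_MASK) >> MODEL_PSR_CPL_SHIFT);
}

/* PSR as seen inside an fsys-mode handler: be and cpl cleared, rest kept. */
static uint64_t model_fsys_entry_psr(uint64_t user_psr)
{
	return user_psr & ~(MODEL_PSR_BE | MODEL_PSR_CPL_MASK);
}
```

Every other bit passes through untouched, which is why each "Unchanged"
row needs its own justification in the table.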
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) Using fast system calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) =======================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) To use fast system calls, userspace applications need simply call
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) __kernel_syscall_via_epc().  For example:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) -- example fgettimeofday() call --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) -- fgettimeofday.S --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)   #include <asm/asmmacro.h>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266)   GLOBAL_ENTRY(fgettimeofday)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)   .prologue
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)   .save ar.pfs, r11
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269)   mov r11 = ar.pfs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)   .body
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)   movl r2 = 0xa000000000020660;;  // gate address
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) 			       // found by inspection of System.map for the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) 			       // __kernel_syscall_via_epc() function.  See
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) 			       // below for how to do this for real.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277)   mov b7 = r2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)   mov r15 = 1087		       // gettimeofday syscall
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)   ;;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280)   br.call.sptk.many b6 = b7
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)   ;;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283)   .restore sp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285)   mov ar.pfs = r11
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286)   br.ret.sptk.many rp;;	      // return to caller
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287)   END(fgettimeofday)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) -- end fgettimeofday.S --
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) In reality, getting the gate address is accomplished by two extra
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) values passed via the ELF auxiliary vector (include/asm-ia64/elf.h):
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)  * AT_SYSINFO : is the address of __kernel_syscall_via_epc()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295)  * AT_SYSINFO_EHDR : is the address of the kernel gate ELF DSO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) The ELF DSO is a pre-linked library that is mapped in by the kernel at
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) the gate page.  It is a proper ELF shared object so, with a dynamic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) loader that recognises the library, you should be able to make calls to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) the exported functions within it as with any other shared library.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) AT_SYSINFO points into the kernel DSO at the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) __kernel_syscall_via_epc() function for historical reasons (it was
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) used before the kernel DSO) and as a convenience.
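
Looking up the two values amounts to scanning the auxiliary vector for a
tag.  The C sketch below walks a caller-supplied vector so that it stays
portable; struct model_auxv and model_find_aux() are hypothetical names,
and the fallback tag numbers are the ones Linux uses.  On glibc, a program
would more likely call getauxval() from <sys/auxv.h>, which performs the
same lookup on the process's real vector.

```c
#include <stdint.h>

/* Tag numbers used by Linux; defined here only if no header supplied them. */
#ifndef AT_NULL
#define AT_NULL 0
#endif
#ifndef AT_SYSINFO
#define AT_SYSINFO 32
#endif
#ifndef AT_SYSINFO_EHDR
#define AT_SYSINFO_EHDR 33
#endif

/* Hypothetical mirror of one auxiliary-vector entry. */
struct model_auxv {
	uint64_t a_type;
	uint64_t a_val;
};

/* Returns the value stored for 'type', or 0 if the tag is absent. */
static uint64_t model_find_aux(const struct model_auxv *auxv, uint64_t type)
{
	for (; auxv->a_type != AT_NULL; auxv++)
		if (auxv->a_type == type)
			return auxv->a_val;
	return 0;
}
```

A caller would branch to the AT_SYSINFO value instead of a hard-coded gate
address, or hand the AT_SYSINFO_EHDR value to code that parses the DSO's
symbol table.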