^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 1) JITDUMP specification version 2
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 2) Last Revised: 09/15/2016
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 3) Author: Stephane Eranian <eranian@gmail.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 4)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 5) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 6) | Revision | Date | Description |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 7) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 8) | 1 | 09/07/2016 | Initial revision |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 9) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 10) | 2 | 09/15/2016 | Add JIT_CODE_UNWINDING_INFO |
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 11) --------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 12)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 13)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 14) I/ Introduction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 15)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 16)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 17) This document describes the jitdump file format. The file is generated by Just-In-Time (JIT) compiler runtimes to save metadata about the generated code, such as the address, size, and name of generated functions, the native code itself, and the source line information. The data may then be used by performance tools, such as Linux perf, to generate function-level and assembly-level profiles.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 18)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 19) The format is not specific to any particular programming language. It can be extended as need be.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 20)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 21) The format of the file is binary. It is self-describing in terms of endianness and is portable across multiple processor architectures.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 22)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 23)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 24) II/ Overview of the format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 25)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 26)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 27) The format requires only sequential access, i.e., the file is written in append-only mode. The file starts with a fixed-size file header that identifies the version of the specification and the endianness of the file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 28)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 29) The header is followed by a series of records, each starting with a fixed-size header describing the type of record and its size. Each record header is followed by the payload for the record. Records can have a variable size, even for a given type.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 30)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 31) Each entry in the file is timestamped. All timestamps must use the same clock source. The CLOCK_MONOTONIC clock source is recommended.
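
As an illustration, a runtime written in Python could take every timestamp from one clock, as sketched below. The helper name jitdump_timestamp is hypothetical, and the use of nanoseconds as the unit is an assumption; the text above only requires a single clock source.

```python
import time


def jitdump_timestamp():
    """Return a CLOCK_MONOTONIC reading in nanoseconds (unit is an assumption)."""
    # Use this one helper for the file header and for every record header,
    # so that all entries share the same clock source.
    return time.clock_gettime_ns(time.CLOCK_MONOTONIC)
```

The sketch relies on time.clock_gettime_ns, which is only available on Unix-like systems.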
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 32)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 33)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 34) III/ Jitdump file header format
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 35)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 36) Each jitdump file starts with a fixed size header containing the following fields in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 37)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 38)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 39) * uint32_t magic : a magic number tagging the file type. The value is 4 bytes long and represents the string "JiTD" in ASCII form. It is written as 0x4A695444. The reader detects an endian mismatch when it reads 0x4454694A instead.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 40) * uint32_t version : a 4-byte value representing the format version. It is currently set to 1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 41) * uint32_t total_size: size in bytes of file header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 42) * uint32_t elf_mach : ELF architecture encoding (ELF e_machine value as specified in /usr/include/elf.h)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 43) * uint32_t pad1 : padding. Reserved for future use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 44) * uint32_t pid : JIT runtime process identification (OS specific)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 45) * uint64_t timestamp : timestamp of when the file was created
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 46) * uint64_t flags : a bitmask of flags
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 47)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 48) The flags currently defined are as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 49) * bit 0: JITDUMP_FLAGS_ARCH_TIMESTAMP : set if the jitdump file is using an architecture-specific timestamp clock source. For instance, on x86, one could use TSC directly
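
The header layout above can be read with a few lines of Python. This is a minimal sketch rather than a reference reader: the function name parse_file_header and the dictionary keys are hypothetical, and the fields are assumed to be packed without padding, which gives a 40-byte header.

```python
import struct

JITDUMP_MAGIC = 0x4A695444           # "JiTD"
JITDUMP_MAGIC_SWAPPED = 0x4454694A   # magic as seen with the wrong byte order
JITDUMP_FLAGS_ARCH_TIMESTAMP = 1 << 0

# magic, version, total_size, elf_mach, pad1, pid, timestamp, flags
HEADER_FIELDS = "IIIIIIQQ"
HEADER_NAMES = ("magic", "version", "total_size", "elf_mach",
                "pad1", "pid", "timestamp", "flags")


def parse_file_header(data):
    """Return (byte_order, fields) for the jitdump header at the start of data."""
    # Read the magic as little-endian first; a swapped value means the
    # file was written in big-endian byte order.
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic == JITDUMP_MAGIC:
        order = "<"
    elif magic == JITDUMP_MAGIC_SWAPPED:
        order = ">"
    else:
        raise ValueError("not a jitdump file")
    values = struct.unpack_from(order + HEADER_FIELDS, data, 0)
    return order, dict(zip(HEADER_NAMES, values))
```

The returned byte order prefix ("<" or ">") can be reused for every later struct call on the same file, and the total_size field tells the reader where the first record starts.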
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 50)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 51) IV/ Record header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 52)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 53) The file header is immediately followed by records. Each record starts with a fixed size header describing the record that follows.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 54)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 55) The record header is specified in order as follows:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 56) * uint32_t id : a value identifying the record type (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 57) * uint32_t total_size: the size in bytes of the record including the header.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 58) * uint64_t timestamp : a timestamp of when the record was created.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 59)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 60) The following record types are defined:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 61) * Value 0 : JIT_CODE_LOAD : record describing a jitted function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 62) * Value 1 : JIT_CODE_MOVE : record describing an already jitted function which is moved
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 63) * Value 2 : JIT_CODE_DEBUG_INFO: record describing the debug information for a jitted function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 64) * Value 3 : JIT_CODE_CLOSE : record marking the end of the jit runtime (optional)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 65) * Value 4 : JIT_CODE_UNWINDING_INFO: record describing the unwinding information for a function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 66)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 67) The payload of the record must immediately follow the record header without padding.
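
Because every record carries its own total_size, a reader can step through the file without understanding each record type. The sketch below shows one way to do that; iter_records and RECORD_NAMES are hypothetical names, and little-endian byte order is assumed for brevity (a real reader would take the byte order from the magic number).

```python
import struct

RECORD_HEADER = struct.Struct("<IIQ")  # id, total_size, timestamp

RECORD_NAMES = {
    0: "JIT_CODE_LOAD",
    1: "JIT_CODE_MOVE",
    2: "JIT_CODE_DEBUG_INFO",
    3: "JIT_CODE_CLOSE",
    4: "JIT_CODE_UNWINDING_INFO",
}


def iter_records(data, offset):
    """Yield (record_id, timestamp, payload) for each record starting at offset."""
    while offset + RECORD_HEADER.size <= len(data):
        rec_id, total_size, timestamp = RECORD_HEADER.unpack_from(data, offset)
        # total_size includes the header, so it can never be smaller than it.
        if total_size < RECORD_HEADER.size or offset + total_size > len(data):
            raise ValueError("truncated or corrupt record at offset %d" % offset)
        # The payload immediately follows the header, without padding.
        payload = data[offset + RECORD_HEADER.size:offset + total_size]
        yield rec_id, timestamp, payload
        offset += total_size
```

Unknown record ids can simply be skipped, since the loop advances by total_size regardless of the type.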
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 68)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 69) V/ JIT_CODE_LOAD record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 70)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 71)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 72) The record has the following fields following the fixed-size record header in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 73) * uint32_t pid: OS process id of the runtime generating the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 74) * uint32_t tid: OS thread identification of the runtime thread generating the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 75) * uint64_t vma: virtual address of jitted code start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 76) * uint64_t code_addr: code start address for the jitted code. By default vma = code_addr
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 77) * uint64_t code_size: size in bytes of the generated jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 78) * uint64_t code_index: unique identifier for the jitted code (see below)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 79) * char[n]: function name in ASCII including the null termination
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 80) * native code: raw byte encoding of the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 81)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 82) The record header total_size field is inclusive of all components:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 83) * record header
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 84) * fixed-sized fields
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 85) * function name string, including termination
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 86) * native code length
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 87) * record specific variable data (e.g., array of data entries)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 88)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 89) The code_index is used to uniquely identify each jitted function. The index can be a monotonically increasing 64-bit value. Each time a function is jitted it gets a new number. This value is used in case the code for a function is moved and avoids having to issue another JIT_CODE_LOAD record.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 90)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 91) The format supports empty functions with no native code.
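
To make the size accounting concrete, the following sketch builds and parses a JIT_CODE_LOAD record. The helper names build_code_load and parse_code_load are hypothetical, little-endian byte order is assumed, and vma is set equal to code_addr as in the default case described above.

```python
import struct

RECORD_HEADER = struct.Struct("<IIQ")  # id, total_size, timestamp
LOAD_FIXED = struct.Struct("<IIQQQQ")  # pid, tid, vma, code_addr, code_size, code_index
JIT_CODE_LOAD = 0


def build_code_load(pid, tid, addr, code, code_index, name, timestamp):
    """Return the bytes of one complete JIT_CODE_LOAD record."""
    name_bytes = name.encode("ascii") + b"\0"
    # total_size covers the header, fixed fields, name (with NUL) and code.
    total = RECORD_HEADER.size + LOAD_FIXED.size + len(name_bytes) + len(code)
    return (RECORD_HEADER.pack(JIT_CODE_LOAD, total, timestamp)
            + LOAD_FIXED.pack(pid, tid, addr, addr, len(code), code_index)
            + name_bytes
            + code)


def parse_code_load(payload):
    """Decode the payload that follows the record header."""
    pid, tid, vma, code_addr, code_size, code_index = LOAD_FIXED.unpack_from(payload, 0)
    # The function name runs up to the first NUL after the fixed fields.
    name_end = payload.index(b"\0", LOAD_FIXED.size)
    name = payload[LOAD_FIXED.size:name_end].decode("ascii")
    code = payload[name_end + 1:name_end + 1 + code_size]
    return {"pid": pid, "tid": tid, "vma": vma, "code_addr": code_addr,
            "code_index": code_index, "name": name, "code": code}
```

An empty function is handled by passing an empty bytes object as code, in which case code_size is 0 and the record ends right after the name.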
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 92)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 93)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 94) VI/ JIT_CODE_MOVE record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 95)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 96) The record type is optional.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 97)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 98) The record has the following fields following the fixed-size record header in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 99) * uint32_t pid : OS process id of the runtime generating the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) * uint32_t tid : OS thread identification of the runtime thread generating the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) * uint64_t vma : new virtual address of jitted code start
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) * uint64_t old_code_addr: previous code address for the same function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) * uint64_t new_code_addr: alternate new code start address for the jitted code. By default it should be equal to the vma address.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) * uint64_t code_size : size in bytes of the jitted code
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) * uint64_t code_index : index matching the code_index of the JIT_CODE_LOAD record emitted when the function was initially jitted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) The MOVE record can be used in case an already jitted function is simply moved by the runtime inside the code cache.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) The JIT_CODE_MOVE record cannot come before the JIT_CODE_LOAD record for the same function name. The function cannot have changed name, otherwise a new JIT_CODE_LOAD record must be emitted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) The code size of the function cannot change.
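
A reader might apply these rules roughly as follows. This is only a sketch: apply_code_move is a hypothetical name, the functions table (keyed by code_index and filled in from earlier JIT_CODE_LOAD records) is an assumed data structure, and little-endian byte order is assumed.

```python
import struct

# pid, tid, vma, old_code_addr, new_code_addr, code_size, code_index
MOVE_FIXED = struct.Struct("<IIQQQQQ")


def apply_code_move(functions, payload):
    """Update the address of an already loaded function from a MOVE payload.

    functions maps code_index to a dict with "name", "addr" and "size" keys.
    """
    (pid, tid, vma, old_addr, new_addr,
     code_size, code_index) = MOVE_FIXED.unpack_from(payload, 0)
    func = functions.get(code_index)
    if func is None:
        # A MOVE cannot come before the LOAD of the same function.
        raise ValueError("JIT_CODE_MOVE without a prior JIT_CODE_LOAD")
    if func["size"] != code_size:
        # The code size of a moved function cannot change.
        raise ValueError("code size changed across JIT_CODE_MOVE")
    func["addr"] = new_addr
    return func
```

The function name is never touched, which matches the rule that a renamed function needs a new JIT_CODE_LOAD record.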
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) VII/ JIT_CODE_DEBUG_INFO record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) The record type is optional.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) The record contains source line debug information, i.e., a way to map a code address back to a source line. This information may be used by the performance tool.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) The record has the following fields following the fixed-size record header in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) * uint64_t code_addr: address of function for which the debug information is generated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) * uint64_t nr_entry : number of debug entries for the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) * debug_entry[n]: array of nr_entry debug entries for the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) The debug_entry describes the source line information. It is defined as follows in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) * uint64_t code_addr: address of the code to which this source line entry applies
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) * uint32_t line : source file line number (starting at 1)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) * uint32_t discrim : column discriminator, 0 is default
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) * char name[n] : source file name in ASCII, including null termination
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) The debug_entry entries are saved in sequence, but because the file name string gives them variable sizes, they cannot be indexed directly.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) They need to be walked sequentially. The next debug_entry starts sizeof(debug_entry) + strlen(name) + 1 bytes after the start of the current one, where sizeof(debug_entry) covers only the fixed-size fields.
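
The sequential walk can be sketched as below. The name parse_debug_info is hypothetical, little-endian byte order is assumed, and the fixed part of each debug_entry is taken to be 16 bytes (one uint64_t and two uint32_t fields, with no padding).

```python
import struct

DEBUG_FIXED = struct.Struct("<QQ")   # code_addr, nr_entry
ENTRY_FIXED = struct.Struct("<QII")  # code_addr, line, discrim


def parse_debug_info(payload):
    """Return (code_addr, entries) for a debug info record payload.

    Each entry is a tuple (code_addr, line, discrim, file_name).
    """
    code_addr, nr_entry = DEBUG_FIXED.unpack_from(payload, 0)
    offset = DEBUG_FIXED.size
    entries = []
    for _ in range(nr_entry):
        addr, line, discrim = ENTRY_FIXED.unpack_from(payload, offset)
        name_start = offset + ENTRY_FIXED.size
        name_end = payload.index(b"\0", name_start)
        name = payload[name_start:name_end].decode("ascii")
        entries.append((addr, line, discrim, name))
        # Advance by the fixed part + strlen(name) + 1 for the NUL byte.
        offset = name_end + 1
    return code_addr, entries
```

Since entry sizes depend on the file name length, reaching the n-th entry always requires decoding the n-1 entries before it.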
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) IMPORTANT:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) The JIT_CODE_DEBUG_INFO record for a given function must always be generated BEFORE the JIT_CODE_LOAD record for the function. This greatly simplifies the parser for the jitdump file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) VIII/ JIT_CODE_CLOSE record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) The record type is optional.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) The record is used as a marker for the end of the JIT runtime. It may be omitted, in which case the end of the file serves the same purpose.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) The JIT_CODE_CLOSE record does not have any specific fields; the record header contains all the information needed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) IX/ JIT_CODE_UNWINDING_INFO record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) The record type is optional.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) The record is used to describe the unwinding information for a jitted function.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) The record has the following fields following the fixed-size record header in order:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) * uint64_t unwind_data_size : the size in bytes of the unwinding data table at the end of the record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) * uint64_t eh_frame_hdr_size : the size in bytes of the DWARF EH Frame Header at the start of the unwinding data table at the end of the record
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) * uint64_t mapped_size : the size of the unwinding data mapped in memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) * const char unwinding_data[n]: an array of unwinding data, consisting of the EH Frame Header, followed by the actual EH Frame
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) The EH Frame header follows the Linux Standard Base (LSB) specification as described in the document at https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) The EH Frame follows the LSB specification as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) NOTE: The mapped_size is generally either the same as unwind_data_size (if the unwinding data was mapped in memory by the running process) or zero (if the unwinding data is not mapped by the process). If the unwinding data was not mapped, then only the EH Frame Header will be read, which can be used to specify FP based unwinding for a function which does not have unwinding information.
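
The layout of this record can be illustrated with a short sketch that splits the unwinding data into its two parts. The name parse_unwinding_info and the returned dictionary keys are hypothetical, little-endian byte order is assumed, and the unwinding data is assumed to start directly after the three fixed fields. The sketch does not interpret the EH Frame contents.

```python
import struct

UNWIND_FIXED = struct.Struct("<QQQ")  # unwind_data_size, eh_frame_hdr_size, mapped_size


def parse_unwinding_info(payload):
    """Split the unwinding data of a record payload into header and frame."""
    unwind_data_size, eh_frame_hdr_size, mapped_size = UNWIND_FIXED.unpack_from(payload, 0)
    start = UNWIND_FIXED.size
    data = payload[start:start + unwind_data_size]
    if len(data) != unwind_data_size or eh_frame_hdr_size > unwind_data_size:
        raise ValueError("inconsistent sizes in JIT_CODE_UNWINDING_INFO")
    return {
        # The EH Frame Header sits at the start of the unwinding data table.
        "eh_frame_hdr": data[:eh_frame_hdr_size],
        # The actual EH Frame follows the header.
        "eh_frame": data[eh_frame_hdr_size:],
        # A mapped_size of zero means the data was not mapped by the process.
        "mapped": mapped_size != 0,
    }
```

A consumer would typically check the mapped flag first and, when it is false, look only at the eh_frame_hdr part, in line with the note above.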