Orange Pi5 kernel

Deprecated Linux kernel 5.10.110 for OrangePi 5/5B/5+ boards

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) Bug hunting
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) ===========
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) Kernel bug reports often come with a stack dump like the one below::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 	------------[ cut here ]------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) 	WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 	Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 	CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) 	Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) 	 00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) 	 c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) 	 f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) 	Call Trace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) 	 [<c12ba080>] ? dump_stack+0x44/0x64
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) 	 [<c103ed6a>] ? __warn+0xfa/0x120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) 	 [<c109e8a7>] ? module_put+0x57/0x70
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) 	 [<c109e8a7>] ? module_put+0x57/0x70
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 	 [<c103ee33>] ? warn_slowpath_null+0x23/0x30
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) 	 [<c109e8a7>] ? module_put+0x57/0x70
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) 	 [<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) 	 [<c109f617>] ? symbol_put_addr+0x27/0x50
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) 	 [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) 	 [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) 	 [<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) 	 [<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) 	 [<c13d2882>] ? usb_unbind_interface+0x62/0x250
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) 	 [<c136b514>] ? __pm_runtime_idle+0x44/0x70
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 	 [<c13620d8>] ? __device_release_driver+0x78/0x120
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) 	 [<c1362907>] ? driver_detach+0x87/0x90
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) 	 [<c1361c48>] ? bus_remove_driver+0x38/0x90
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 	 [<c13d1c18>] ? usb_deregister+0x58/0xb0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 	 [<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) 	 [<c1055654>] ? task_work_run+0x64/0x80
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) 	 [<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) 	 [<c10013f0>] ? do_fast_syscall_32+0x80/0x130
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) 	 [<c1549f43>] ? sysenter_past_esp+0x40/0x6a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) 	---[ end trace 6ebc60ef3981792f ]---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) Such stack traces provide enough information to identify the line inside the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) kernel's source code where the bug happened. Depending on the severity of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) the issue, the trace may also contain the word **Oops**, as in this one::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) 	BUG: unable to handle kernel NULL pointer dereference at   (null)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) 	IP: [<c06969d4>] iret_exc+0x7d0/0xa59
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) 	*pdpt = 000000002258a001 *pde = 0000000000000000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) 	Oops: 0002 [#1] PREEMPT SMP
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) 	...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) Whether it is an **Oops** or some other sort of stack trace, the offending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) line is usually required to identify and handle the bug. Throughout this chapter,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) we use "Oops" to refer to all kinds of stack traces that need to be analyzed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) quality of the stack trace by using :file:`scripts/decode_stacktrace.sh`.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) Modules linked in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) Modules that are tainted, or that are being loaded or unloaded, are marked
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) with "(...)". The taint flags are described in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) :file:`Documentation/admin-guide/tainted-kernels.rst`; "being loaded" is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) annotated with "+", and "being unloaded" is annotated with "-".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) 
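The annotation scheme described above can be illustrated with a short Python sketch. It is not part of the kernel tree: the function name ``parse_modules_linked_in`` and the dictionary layout are made up for this example, and it assumes the ``name(flags)`` layout seen in the trace at the top of this document.

```python
import re

# Matches one entry such as "nvidia(PO)" or "dvb_usb_gp8psk(-)".
MODULE_RE = re.compile(r"^(?P<name>[^\s()]+)(?:\((?P<flags>[^)]*)\))?$")


def parse_modules_linked_in(line):
    """Split a 'Modules linked in:' line into per-module dictionaries."""
    _, _, rest = line.partition("Modules linked in:")
    # Drop the trailing "[last unloaded: ...]" note, if present.
    rest = rest.split("[last unloaded:")[0]
    modules = []
    for token in rest.split():
        match = MODULE_RE.match(token)
        if match is None:
            continue  # unknown token: skip rather than guess
        flags = match.group("flags") or ""
        modules.append({
            "name": match.group("name"),
            "loading": "+" in flags,
            "unloading": "-" in flags,
            # Whatever remains are taint flags (see tainted-kernels.rst).
            "taint": flags.replace("+", "").replace("-", ""),
        })
    return modules


line = ("Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core "
        "nvidia_drm(PO) snd nvidia(PO) [last unloaded: rc_core]")
for module in parse_modules_linked_in(line):
    # Print only the modules that carry an annotation.
    if module["loading"] or module["unloading"] or module["taint"]:
        print(module)
```

For the sample line, only ``dvb_usb_gp8psk`` (being unloaded) and the two ``nvidia`` modules (taint flags ``PO``) are printed. The meaning of each taint letter still has to be looked up in the tainted-kernels document.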
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) Where is the Oops message located?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) ----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) Normally the Oops text is read from the kernel buffers by ``klogd`` and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) handed to ``syslogd``, which writes it to a syslog file, typically
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) ``/var/log/messages`` (depending on ``/etc/syslog.conf``). On systems with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) systemd, it may also be stored by the ``journald`` daemon and accessed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) by running the ``journalctl`` command.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) read the data from the kernel buffers and save it. Alternatively, you can
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) ``cat /proc/kmsg > file``, but you have to interrupt it to stop the transfer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) since ``kmsg`` is a "never ending file".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) 
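Once the log has been saved to a file, the trace usually has to be cut out of the surrounding messages. The Python sketch below is one possible way to do that, assuming the trace is delimited by the ``cut here`` and ``end trace`` markers shown in the first example of this document. The function name ``extract_traces`` is hypothetical, and an Oops that lacks the start marker will not be found by this approach.

```python
def extract_traces(log_text):
    """Return the blocks delimited by 'cut here' and 'end trace' markers."""
    blocks, current = [], None
    for line in log_text.splitlines():
        if "[ cut here ]" in line:
            current = [line]          # start a new block
        elif current is not None:
            current.append(line)
            if "---[ end trace" in line:
                blocks.append("\n".join(current))
                current = None        # wait for the next start marker
    return blocks


# Shortened sample log; the surrounding "usb" lines are invented filler.
sample = """\
usb 1-1: new high-speed USB device
------------[ cut here ]------------
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Call Trace:
 [<c12ba080>] ? dump_stack+0x44/0x64
---[ end trace 6ebc60ef3981792f ]---
usb 1-1: USB disconnect
"""
for block in extract_traces(sample):
    print(block)
```

A block that has a start marker but no end marker (for example, because the machine died mid-trace) is dropped by this sketch; a real tool would probably want to keep it.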
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) If the machine has crashed so badly that you cannot enter commands or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) the disk is not available then you have three options:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) (1) Hand copy the text from the screen and type it in after the machine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84)     has restarted.  Messy but it is the only option if you have not
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85)     planned for a crash. Alternatively, you can take a picture of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86)     the screen with a digital camera - not nice, but better than
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87)     nothing.  If the messages scroll off the top of the console, you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88)     may find that booting with a higher resolution (e.g., ``vga=791``)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89)     will allow you to read more of the text. (Caveat: This needs ``vesafb``,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90)     so won't help for 'early' oopses.)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) (2) Boot with a serial console (see
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93)     :ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94)     run a null modem to a second machine and capture the output there
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95)     using your favourite communication program.  Minicom works well.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) (3) Use Kdump (see Documentation/admin-guide/kdump/kdump.rst) and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98)     extract the kernel ring buffer from old memory using the dmesg
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99)     gdbmacro in Documentation/admin-guide/kdump/gdbmacros.txt.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) Finding the bug's location
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) --------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) Reporting a bug works best if you point to the location of the bug in the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) kernel source file. There are two methods for doing that. Usually, using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) ``gdb`` is easier, but the kernel must be compiled with debug info.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) gdb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) ^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) The GNU debugger (``gdb``) is the best way to figure out the exact file and line
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) number of the OOPS from the ``vmlinux`` file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) This can be set by running::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117)   $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) On a kernel compiled with ``CONFIG_DEBUG_INFO``, you can simply copy the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) EIP value from the OOPS::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122)  EIP:    0060:[<c021e50e>]    Not tainted VLI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) And use GDB to translate that to human-readable form::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126)   $ gdb vmlinux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127)   (gdb) l *0xc021e50e
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) If you don't have ``CONFIG_DEBUG_INFO`` enabled, you can use the function
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) offset from the OOPS::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132)  EIP is at vt_ioctl+0xda8/0x1482
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) And recompile the kernel with ``CONFIG_DEBUG_INFO`` enabled::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136)   $ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137)   $ make vmlinux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138)   $ gdb vmlinux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139)   (gdb) l *vt_ioctl+0xda8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140)   0x1888 is in vt_ioctl (drivers/tty/vt/vt_ioctl.c:293).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141)   288	{
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142)   289		struct vc_data *vc = NULL;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143)   290		int ret = 0;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144)   291
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145)   292		console_lock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146)   293		if (VT_BUSY(vc_num))
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147)   294			ret = -EBUSY;
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148)   295		else if (vc_num)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149)   296			vc = vc_deallocate(vc_num);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150)   297		console_unlock();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) or, if you want to be more verbose::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154)   (gdb) p vt_ioctl
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155)   $1 = {int (struct tty_struct *, unsigned int, unsigned long)} 0xae0 <vt_ioctl>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156)   (gdb) l *0xae0+0xda8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) 
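The verbose form is plain address arithmetic: the address gdb prints for the function plus the offset reported in the Oops. A minimal Python sketch of that calculation follows; the helper name ``parse_location`` is invented for this example, and the base address ``0xae0`` is simply the value shown in the gdb session in this section.

```python
import re

LOCATION_RE = re.compile(
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_location(text):
    """Return (symbol, offset, size) from text like 'vt_ioctl+0xda8/0x1482'."""
    match = LOCATION_RE.search(text)
    if match is None:
        raise ValueError(f"no symbol+offset/size found in {text!r}")
    return (match.group("symbol"),
            int(match.group("offset"), 16),
            int(match.group("size"), 16))


symbol, offset, size = parse_location("EIP is at vt_ioctl+0xda8/0x1482")
base = 0xae0  # address printed by "p vt_ioctl" in the gdb session
assert offset < size  # the faulting instruction lies inside the function
print(f"{symbol}: l *{hex(base + offset)}")  # prints "vt_ioctl: l *0x1888"
```

The result, ``0x1888``, matches the address gdb reports for ``l *vt_ioctl+0xda8`` earlier in this section.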
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) You could, instead, use the object file::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160)   $ make drivers/tty/
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161)   $ gdb drivers/tty/vt/vt_ioctl.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162)   (gdb) l *vt_ioctl+0xda8
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) If you have a call trace, such as::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)      Call Trace:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167)       [<ffffffff8802c8e9>] :jbd:log_wait_commit+0xa3/0xf5
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)       [<ffffffff810482d9>] autoremove_wake_function+0x0/0x2e
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)       [<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170)       ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172) this shows the problem likely is in the :jbd: module. You can load that module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) in gdb and list the relevant code::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175)   $ gdb fs/jbd/jbd.ko
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176)   (gdb) l *log_wait_commit+0xa3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) .. note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180)      You can also do the same for any function call in the stack trace,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181)      like this one::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) 	 [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185)      The position where the above call happened can be seen with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) 	$ gdb drivers/media/usb/dvb-usb/dvb-usb.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 	(gdb) l *dvb_usb_adapter_frontend_exit+0x3a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) 
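When a trace has many frames, typing the gdb commands by hand gets tedious. The following Python sketch, which is an illustration rather than an existing kernel script, turns call-trace lines of the ``[<address>] ? symbol+0xoff/0xsize [module]`` form into ``l *symbol+0xoff`` commands. The name ``gdb_commands`` is hypothetical, the older ``:jbd:symbol`` notation is not handled, and mapping a module name to its object file is left to the reader.

```python
import re

FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def gdb_commands(trace_lines):
    """Yield (target, gdb command) pairs for each frame that can be parsed."""
    for line in trace_lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # header lines such as "Call Trace:" are skipped
        # Frames without a [module] suffix belong to the core kernel image.
        target = match.group("module") or "vmlinux"
        yield target, f"l *{match.group('symbol')}+{match.group('offset')}"


trace = [
    "Call Trace:",
    " [<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]",
    " [<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0",
]
for target, command in gdb_commands(trace):
    print(f"{target}: (gdb) {command}")
```

Each printed command can then be pasted into a gdb session opened on the matching object file or on ``vmlinux``.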
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) objdump
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) ^^^^^^^
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) To debug a kernel, use objdump and look for the hex offset from the crash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) output to find the valid line of code/assembler. Without debug symbols, you
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) will see the assembler code for the routine shown, but if your kernel has
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) debug symbols the C code will also be available. (Debug symbols can be enabled
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) in the kernel hacking menu of the menu configuration.) For example::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)     $ objdump -r -S -l --disassemble net/dccp/ipv4.o
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201) .. note::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203)    You need to be at the top level of the kernel tree for this to pick up
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)    your C files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206) If you don't have access to the source code you can still debug some crash
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207) dumps using the following method (example crash dump output as shown by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208) Dave Miller)::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210)      EIP is at ip_queue_xmit+0x14/0x4c0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211)       ...
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)      Code: 44 24 04 e8 6f 05 00 00 e9 e8 fe ff ff 8d 76 00 8d bc 27 00 00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213)      00 00 55 57  56 53 81 ec bc 00 00 00 8b ac 24 d0 00 00 00 8b 5d 08
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214)      <8b> 83 3c 01 00 00 89 44  24 14 8b 45 28 85 c0 89 44 24 18 0f 85
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216)      Put the bytes into a "foo.s" file like this:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)             .text
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219)             .globl foo
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)      foo:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221)             .byte  .... /* bytes from Code: part of OOPS dump */
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223)      Compile it with "gcc -c -o foo.o foo.s" then look at the output of
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)      "objdump --disassemble foo.o".
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226)      Output:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228)      ip_queue_xmit:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)          push       %ebp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)          push       %edi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231)          push       %esi
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232)          push       %ebx
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)          sub        $0xbc, %esp
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234)          mov        0xd0(%esp), %ebp        ! %ebp = arg0 (skb)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235)          mov        0x8(%ebp), %ebx         ! %ebx = skb->sk
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)          mov        0x13c(%ebx), %eax       ! %eax = inet_sk(sk)->opt
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238) :file:`scripts/decodecode` can be used to automate most of this, depending
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239) on what CPU architecture is being debugged.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240) 
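To make the manual "foo.s" step concrete, here is a small Python sketch that builds the assembler text from a ``Code:`` line. It is not how ``scripts/decodecode`` works internally; the function name ``code_line_to_asm`` is invented, and the sketch assumes the x86 layout shown in this section, with space-separated hex bytes and the faulting byte wrapped in ``<...>``.

```python
import re


def code_line_to_asm(code_line, symbol="foo"):
    """Build the text of an assembler file from an Oops 'Code:' line."""
    _, _, hexdump = code_line.partition("Code:")
    tokens = hexdump.split()
    # The faulting byte is wrapped in <..>; remember its index, then unwrap.
    fault_index = next(
        (i for i, token in enumerate(tokens) if token.startswith("<")), None
    )
    byte_values = [re.sub(r"[<>]", "", token) for token in tokens]
    for value in byte_values:
        int(value, 16)  # raises ValueError on anything that is not hex
    directive = ", ".join(f"0x{value}" for value in byte_values)
    asm = f"\t.text\n\t.globl {symbol}\n{symbol}:\n\t.byte {directive}\n"
    return asm, fault_index


asm, fault_index = code_line_to_asm("Code: 8b 5d 08 <8b> 83 3c 01 00 00")
print(asm)
print("faulting instruction starts at byte", fault_index)
```

The returned text can be written to ``foo.s`` and assembled and disassembled as described in this section. The fault index tells you which instruction in the ``objdump`` output is the one that trapped.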
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241) Reporting the bug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242) -----------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244) Once you have found where the bug happened,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) you can either try to fix it yourself or report it upstream.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) In order to report it upstream, you should identify the mailing list
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) used for the development of the affected code. This can be done by using
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) the ``get_maintainer.pl`` script.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) For example, if you find a bug in gspca's sonixj.c file, you can get
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) its maintainers with::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 	$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) 	Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) 	Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) 	Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) 	Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) 	linux-media@vger.kernel.org (open list:GSPCA USB WEBCAM DRIVER)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260) 	linux-kernel@vger.kernel.org (open list)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262) Note that the output lists:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264) - The last developers that touched the source code (if this is done inside
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)   a git tree). In the above example, Tejun and Bhaktipriya (in this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266)   specific case, neither was really involved in the development of this file);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267) - The driver maintainer (Hans Verkuil);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268) - The subsystem maintainer (Mauro Carvalho Chehab);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269) - The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270) - The Linux kernel mailing list (linux-kernel@vger.kernel.org).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272) Usually, the fastest way to have your bug fixed is to report it to the mailing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273) list used for the development of the code (the linux-media ML), copying the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274) driver maintainer (Hans).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276) If you are totally stumped as to whom to send the report to, and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277) ``get_maintainer.pl`` didn't provide anything useful, send it to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278) linux-kernel@vger.kernel.org.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280) Thanks for your help in making Linux as stable as humanly possible.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282) Fixing the bug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283) --------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285) If you know programming, you could help us by not only reporting the bug,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286) but also providing us with a solution. After all, open source is about
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287) sharing what you do, and don't you want to be recognised for your genius?
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289) If you decide to go this way, once you have worked out a fix please submit
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290) it upstream.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292) Please do read
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293) :ref:`Documentation/process/submitting-patches.rst <submittingpatches>` though
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294) to help your code get accepted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) ---------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) Notes on Oops tracing with ``klogd``
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) ------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) In order to help Linus and the other kernel developers there has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) substantial support incorporated into ``klogd`` for processing protection
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) faults.  In order to have full support for address resolution at least
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) version 1.3-pl3 of the ``sysklogd`` package should be used.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307) When a protection fault occurs the ``klogd`` daemon automatically
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308) translates important addresses in the kernel log messages to their
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309) symbolic equivalents.  This translated kernel message is then
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) forwarded through whatever reporting mechanism ``klogd`` is using.  The
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) protection fault message can be simply cut out of the message files
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312) and forwarded to the kernel developers.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314) Two types of address resolution are performed by ``klogd``.  The first is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315) static translation and the second is dynamic translation.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316) Static translation uses the System.map file.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) In order to do static translation the ``klogd`` daemon
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318) must be able to find a system map file at daemon initialization time.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319) See the klogd man page for information on how ``klogd`` searches for map
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320) files.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) 
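Static translation amounts to finding the closest symbol at or below an address in a sorted symbol table. The Python sketch below illustrates that idea; it is not ``klogd``'s actual implementation. The function names are invented, and the three sample entries are not taken from a real ``System.map``: their addresses were derived by subtracting the offsets from the addresses in the first trace of this document.

```python
import bisect


def load_system_map(text):
    """Parse System.map text into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # "<address> <type> <name>"
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the closest symbol at or below address."""
    addresses = [entry[0] for entry in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[index]
    return f"{name}+{hex(address - base)}"


# Illustrative entries only (see the note above about their origin).
sample_map = """\
c109e850 T module_put
c109f5f0 T symbol_put_addr
c12ba03c T dump_stack
"""
symbols = load_system_map(sample_map)
print(resolve(symbols, 0xc109e8a7))  # prints "module_put+0x57"
```

Because the map carries no symbol sizes, an address far beyond the end of the last function still resolves to that function with a large offset. It also cannot cover loadable modules, whose addresses are not fixed.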
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322) Dynamic address translation is important when kernel loadable modules
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323) are being used.  Since memory for kernel modules is allocated from the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324) kernel's dynamic memory pools there are no fixed locations for either
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325) the start of the module or for functions and symbols in the module.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327) The kernel supports system calls which allow a program to determine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) which modules are loaded and their location in memory.  Using these
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) system calls, the ``klogd`` daemon builds a symbol table which can be used
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330) to debug a protection fault which occurs in a loadable kernel module.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) At the very minimum, ``klogd`` will provide the name of the module which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) generated the protection fault.  There may be additional symbolic
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) information available if the developer of the loadable module chose to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) export symbol information from the module.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 337) Since the kernel module environment can be dynamic, there must be a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 338) mechanism for notifying the ``klogd`` daemon when a change in the module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 339) environment occurs.  There are command line options available which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 340) allow ``klogd`` to signal the currently executing daemon that symbol
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 341) information should be refreshed.  See the ``klogd`` manual page for more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 342) information.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 343) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 344) A patch is included with the sysklogd distribution which modifies the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 345) ``modules-2.0.0`` package to automatically signal klogd whenever a module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 346) is loaded or unloaded.  Applying this patch provides essentially
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 347) seamless support for debugging protection faults which occur with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 348) kernel loadable modules.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 349) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 350) The following is an example of a protection fault in a loadable module
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 351) processed by ``klogd``::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 352) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 353) 	Aug 29 09:51:01 blizard kernel: Unable to handle kernel paging request at virtual address f15e97cc
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 354) 	Aug 29 09:51:01 blizard kernel: current->tss.cr3 = 0062d000, %cr3 = 0062d000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 355) 	Aug 29 09:51:01 blizard kernel: *pde = 00000000
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 356) 	Aug 29 09:51:01 blizard kernel: Oops: 0002
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 357) 	Aug 29 09:51:01 blizard kernel: CPU:    0
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 358) 	Aug 29 09:51:01 blizard kernel: EIP:    0010:[oops:_oops+16/3868]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 359) 	Aug 29 09:51:01 blizard kernel: EFLAGS: 00010212
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 360) 	Aug 29 09:51:01 blizard kernel: eax: 315e97cc   ebx: 003a6f80   ecx: 001be77b   edx: 00237c0c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 361) 	Aug 29 09:51:01 blizard kernel: esi: 00000000   edi: bffffdb3   ebp: 00589f90   esp: 00589f8c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 362) 	Aug 29 09:51:01 blizard kernel: ds: 0018   es: 0018   fs: 002b   gs: 002b   ss: 0018
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 363) 	Aug 29 09:51:01 blizard kernel: Process oops_test (pid: 3374, process nr: 21, stackpage=00589000)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 364) 	Aug 29 09:51:01 blizard kernel: Stack: 315e97cc 00589f98 0100b0b4 bffffed4 0012e38e 00240c64 003a6f80 00000001
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 365) 	Aug 29 09:51:01 blizard kernel:        00000000 00237810 bfffff00 0010a7fa 00000003 00000001 00000000 bfffff00
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 366) 	Aug 29 09:51:01 blizard kernel:        bffffdb3 bffffed4 ffffffda 0000002b 0007002b 0000002b 0000002b 00000036
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 367) 	Aug 29 09:51:01 blizard kernel: Call Trace: [oops:_oops_ioctl+48/80] [_sys_ioctl+254/272] [_system_call+82/128]
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 368) 	Aug 29 09:51:01 blizard kernel: Code: c7 00 05 00 00 00 eb 08 90 90 90 90 90 90 90 90 89 ec 5d c3
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 369) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 370) ---------------------------------------------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 371) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 372) ::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 373) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 374)   Dr. G.W. Wettstein           Oncology Research Div. Computing Facility
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 375)   Roger Maris Cancer Center    INTERNET: greg@wind.rmcc.com
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 376)   820 4th St. N.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 377)   Fargo, ND  58122
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 378)   Phone: 701-234-7556