Orange Pi5 kernel

^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   1) ==========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   2) PCI Bus EEH Error Recovery
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   3) ==========================
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   4) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   5) Linas Vepstas <linas@austin.ibm.com>
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   6) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   7) 12 January 2005
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   8) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300   9) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  10) Overview:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  11) ---------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  12) The IBM POWER-based pSeries and iSeries computers include PCI bus
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  13) controller chips that have extended capabilities for detecting and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  14) reporting a large variety of PCI bus error conditions.  These features
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  15) go under the name of "EEH", for "Enhanced Error Handling".  The EEH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  16) hardware features allow PCI bus errors to be cleared and a PCI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  17) card to be "rebooted", without also having to reboot the operating
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  18) system.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  19) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  20) This is in contrast to traditional PCI error handling, where the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  21) PCI chip is wired directly to the CPU, and an error would cause
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  22) a CPU machine-check/check-stop condition, halting the CPU entirely.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  23) Another "traditional" technique is to ignore such errors, which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  24) can lead to data corruption, of either user data or kernel data,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  25) hung/unresponsive adapters, or system crashes/lockups.  Thus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  26) the idea behind EEH is that the operating system can become more
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  27) reliable and robust by protecting it from PCI errors, and giving
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  28) the OS the ability to "reboot"/recover individual PCI devices.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  29) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  30) Future systems from other vendors, based on the PCI-E specification,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  31) may contain similar features.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  32) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  33) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  34) Causes of EEH Errors
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  35) --------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  36) EEH was originally designed to guard against hardware failure, such
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  37) as PCI cards dying from heat, humidity, dust, vibration and bad
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  38) electrical connections. The vast majority of EEH errors seen in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  39) "real life" are due to either poorly seated PCI cards, or,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  40) unfortunately quite commonly, due to device driver bugs, device firmware
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  41) bugs, and sometimes PCI card hardware bugs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  42) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  43) The most common software bug is one that causes the device to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  44) attempt to DMA to a location in system memory that has not been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  45) reserved for DMA access for that card.  Catching this is a powerful
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  46) feature, as it prevents what would otherwise have been silent memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  47) corruption caused by the bad DMA.  A number of device driver
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  48) bugs have been found and fixed in this way over the past few
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  49) years.  Other possible causes of EEH errors include data or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  50) address line parity errors (for example, due to poor electrical
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  51) connectivity due to a poorly seated card), and PCI-X split-completion
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  52) errors (due to software, device firmware, or device PCI hardware bugs).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  53) The vast majority of "true hardware failures" can be cured by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  54) physically removing and re-seating the PCI card.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  55) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  56) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  57) Detection and Recovery
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  58) ----------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  59) In the following discussion, a generic overview of how to detect
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  60) and recover from EEH errors will be presented. This is followed
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  61) by an overview of how the current implementation in the Linux
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  62) kernel does it.  The actual implementation is subject to change,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  63) and some of the finer points are still being debated.  These
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  64) may in turn be swayed if or when other architectures implement
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  65) similar functionality.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  66) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  67) When a PCI Host Bridge (PHB, the bus controller connecting the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  68) PCI bus to the system CPU electronics complex) detects a PCI error
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  69) condition, it will "isolate" the affected PCI card.  Isolation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  70) will block all writes (either to the card from the system, or
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  71) from the card to the system), and it will cause all reads to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  72) return all-ff's (0xff, 0xffff, 0xffffffff for 8/16/32-bit reads).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  73) This value was chosen because it is the same value you would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  74) get if the device was physically unplugged from the slot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  75) This includes access to PCI memory, I/O space, and PCI config
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  76) space.  Interrupts, however, will continue to be delivered.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  77) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  78) Detection and recovery are performed with the aid of ppc64
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  79) firmware.  The programming interfaces in the Linux kernel
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  80) into the firmware are referred to as RTAS (Run-Time Abstraction
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  81) Services).  The Linux kernel does not (should not) access
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  82) the EEH function in the PCI chipsets directly, primarily because
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  83) there are a number of different chipsets out there, each with
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  84) different interfaces and quirks. The firmware provides a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  85) uniform abstraction layer that will work with all pSeries
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  86) and iSeries hardware (and be forwards-compatible).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  87) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  88) If the OS or device driver suspects that a PCI slot has been
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  89) EEH-isolated, there is a firmware call it can make to determine if
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  90) this is the case. If so, then the device driver should put itself
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  91) into a consistent state (given that it won't be able to complete any
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  92) pending work) and start recovery of the card.  Recovery normally
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  93) would consist of resetting the PCI device (holding the PCI #RST
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  94) line high for two seconds), followed by setting up the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  95) config space (the base address registers (BARs), latency timer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  96) cache line size, interrupt line, and so on).  This is followed by a
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  97) reinitialization of the device driver.  In a worst-case scenario,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  98) the power to the card can be toggled, at least on hot-plug-capable
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300  99) slots.  In principle, layers far above the device driver probably
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 100) do not need to know that the PCI card has been "rebooted" in this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 101) way; ideally, there should be at most a pause in Ethernet/disk/USB
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 102) I/O while the card is being reset.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 103) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 104) If the card cannot be recovered after three or four resets, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 105) kernel/device driver should assume the worst-case scenario, that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 106) card has died completely, and report this error to the sysadmin.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 107) In addition, error messages are reported through RTAS and also through
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 108) syslogd (/var/log/messages) to alert the sysadmin of PCI resets.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 109) The correct way to deal with failed adapters is to use the standard
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 110) PCI hotplug tools to remove and replace the dead card.
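
The bounded-retry policy described above can be sketched in user-space C.
This is only an illustrative model, not kernel code: recover_slot(),
struct flaky_card, flaky_card_reset() and the limit passed as max_resets
are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Outcome of a recovery attempt (hypothetical names, for illustration). */
enum recovery_result { SLOT_RECOVERED, SLOT_DEAD };

/* Callback that resets the card once; returns true if the card works again. */
typedef bool (*reset_fn)(void *ctx);

/*
 * Try up to max_resets resets.  Give up and report the card as dead if
 * none of them brings it back.  *attempts receives the number of resets
 * actually issued.
 */
static enum recovery_result recover_slot(reset_fn try_reset, void *ctx,
                                         int max_resets, int *attempts)
{
    assert(try_reset != NULL && attempts != NULL);

    *attempts = 0;
    while (*attempts < max_resets) {
        (*attempts)++;
        if (try_reset(ctx))
            return SLOT_RECOVERED;
    }
    return SLOT_DEAD;
}

/* Simulated card that only comes back after a given number of resets. */
struct flaky_card {
    int resets_needed;
    int resets_seen;
};

static bool flaky_card_reset(void *ctx)
{
    struct flaky_card *card = ctx;

    card->resets_seen++;
    return card->resets_seen >= card->resets_needed;
}
```

With max_resets set to 3, a simulated card that needs two resets is
reported as recovered after two attempts, while one that never responds
is reported dead after the third.  At that point the policy above calls
for reporting the failure to the sysadmin.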
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 111) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 112) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 113) Current PPC64 Linux EEH Implementation
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 114) --------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 115) At this time, a generic EEH recovery mechanism has been implemented,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 116) so that individual device drivers do not need to be modified to support
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 117) EEH recovery.  This generic mechanism piggy-backs on the PCI hotplug
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 118) infrastructure,  and percolates events up through the userspace/udev
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 119) infrastructure.  Following is a detailed description of how this is
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 120) accomplished.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 121) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 122) EEH must be enabled in the PHBs very early during the boot process,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 123) and whenever a PCI slot is hot-plugged.  The former is performed by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 124) eeh_init() in arch/powerpc/platforms/pseries/eeh.c, and the latter by
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 125) drivers/pci/hotplug/pSeries_pci.c calling into the eeh.c code.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 126) EEH must be enabled before a PCI scan of the device can proceed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 127) Current Power5 hardware will not work unless EEH is enabled,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 128) although older Power4 can run with it disabled.  Effectively,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 129) EEH can no longer be turned off.  PCI devices *must* be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 130) registered with the EEH code; the EEH code needs to know about
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 131) the I/O address ranges of the PCI device in order to detect an
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 132) error.  Given an arbitrary address, the routine
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 133) pci_get_device_by_addr() will find the PCI device associated
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 134) with that address (if any).
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 135) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 136) The default arch/powerpc/include/asm/io.h macros readb(), inb(), insb(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 137) etc. include a check to see if the I/O read returned all-0xff's.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 138) If so, these make a call to eeh_dn_check_failure(), which in turn
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 139) asks the firmware if the all-ff's value is the sign of a true EEH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 140) error.  If it is not, processing continues as normal.  The grand
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 141) total number of these false alarms or "false positives" can be
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 142) seen in /proc/ppc64/eeh (subject to change).  Normally, almost
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 143) all of these occur during boot, when the PCI bus is scanned, where
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 144) a large number of 0xff reads are part of the bus scan procedure.
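
The check-on-all-ones read path described above can be modelled in a few
lines of user-space C.  This is a simplified sketch, not the io.h macros
themselves: struct sim_slot, sim_raw_readl(), sim_readl() and
sim_firmware_slot_is_frozen() are hypothetical stand-ins, and the
"firmware" query is just a flag lookup rather than an RTAS call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of one PCI slot; not a kernel structure. */
struct sim_slot {
    bool frozen;              /* true once the PHB has isolated the slot */
    uint32_t reg;             /* a single simulated device register */
    unsigned false_positives; /* all-ones reads that were not EEH errors */
    unsigned eeh_errors;      /* all-ones reads confirmed as EEH errors */
};

/* Stand-in for the firmware query; the real path goes through RTAS. */
static bool sim_firmware_slot_is_frozen(const struct sim_slot *slot)
{
    return slot->frozen;
}

/* Raw read: an isolated slot returns all-ones regardless of its contents. */
static uint32_t sim_raw_readl(const struct sim_slot *slot)
{
    return slot->frozen ? UINT32_MAX : slot->reg;
}

/* Checked read: consult the "firmware" only when the value is suspicious. */
static uint32_t sim_readl(struct sim_slot *slot)
{
    uint32_t val;

    assert(slot != NULL);

    val = sim_raw_readl(slot);
    if (val != UINT32_MAX)
        return val;               /* fast path: no firmware call needed */

    if (sim_firmware_slot_is_frozen(slot))
        slot->eeh_errors++;       /* genuine EEH isolation */
    else
        slot->false_positives++;  /* the register really held all-ones */
    return val;
}
```

The point of the design is that the comparatively expensive firmware
query is only made when a read returns all-ones; a register that
legitimately holds that value simply bumps the false-positive counter.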
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 145) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 146) If a frozen slot is detected, code in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 147) arch/powerpc/platforms/pseries/eeh.c will print a stack trace to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 148) syslog (/var/log/messages).  This stack trace has proven to be very
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 149) useful to device-driver authors for finding out at what point the EEH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 150) error was detected, as the error itself usually occurs slightly
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 151) beforehand.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 152) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 153) Next, it uses the Linux kernel notifier chain/work queue mechanism to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 154) allow any interested parties to find out about the failure.  Device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 155) drivers, or other parts of the kernel, can use
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 156) `eeh_register_notifier(struct notifier_block *)` to find out about EEH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 157) events.  The event will include a pointer to the pci device, the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 158) device node and some state info.  Receivers of the event can "do as
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 159) they wish"; the default handler will be described further in this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 160) section.
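
The register-then-notify pattern can be illustrated with a small
user-space model.  It does not use the kernel's notifier_block API: the
types struct sim_eeh_event and struct sim_notifier_chain, and the
functions sim_register_notifier(), sim_notify() and sim_record_slot(),
are hypothetical names invented for this sketch, and the event carries
only a slot number and a state value.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_MAX_NOTIFIERS 8

/* Simplified event; the real event also refers to the PCI device. */
struct sim_eeh_event {
    int slot_id;
    int state;
};

typedef void (*sim_eeh_callback)(const struct sim_eeh_event *ev, void *ctx);

struct sim_notifier {
    sim_eeh_callback fn;
    void *ctx;
};

struct sim_notifier_chain {
    struct sim_notifier entries[SIM_MAX_NOTIFIERS];
    size_t count;
};

/* Add a receiver to the chain; returns 0 on success, -1 on failure. */
static int sim_register_notifier(struct sim_notifier_chain *chain,
                                 sim_eeh_callback fn, void *ctx)
{
    assert(chain != NULL);

    if (fn == NULL || chain->count == SIM_MAX_NOTIFIERS)
        return -1;
    chain->entries[chain->count].fn = fn;
    chain->entries[chain->count].ctx = ctx;
    chain->count++;
    return 0;
}

/* Deliver one event to every receiver; returns how many were called. */
static size_t sim_notify(const struct sim_notifier_chain *chain,
                         const struct sim_eeh_event *ev)
{
    size_t i;

    assert(chain != NULL && ev != NULL);

    for (i = 0; i < chain->count; i++)
        chain->entries[i].fn(ev, chain->entries[i].ctx);
    return chain->count;
}

/* Example receiver: remember which slot last reported an event. */
static void sim_record_slot(const struct sim_eeh_event *ev, void *ctx)
{
    int *last_slot = ctx;

    *last_slot = ev->slot_id;
}
```

Each receiver is free to react in its own way; here the example receiver
merely records the slot number it was told about.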
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 161) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 162) To assist in the recovery of the device, eeh.c exports the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 163) following functions:
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 164) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 165) rtas_set_slot_reset()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 166)    assert the PCI #RST line for 1/8th of a second
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 167) rtas_configure_bridge()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 168)    ask firmware to configure any PCI bridges
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 169)    located topologically under the PCI slot.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 170) eeh_save_bars() and eeh_restore_bars()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 171)    save and restore the PCI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 172)    config-space info for a device and any devices under it.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 173) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 174) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 175) A handler for the EEH notifier_block events is implemented in
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 176) drivers/pci/hotplug/pSeries_pci.c, called handle_eeh_events().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 177) It saves the device BARs and then calls rpaphp_unconfig_pci_adapter().
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 178) This last call causes the device driver for the card to be stopped,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 179) which causes uevents to go out to user space. This triggers
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 180) user-space scripts that might issue commands such as "ifdown eth0"
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 181) for Ethernet cards, and so on.  This handler then sleeps for 5 seconds,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 182) hoping to give the user-space scripts enough time to complete.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 183) It then resets the PCI card and reconfigures the device BARs and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 184) any bridges underneath. It then calls rpaphp_enable_pci_slot(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 185) which restarts the device driver and triggers more user-space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 186) events (for example, calling "ifup eth0" for Ethernet cards).
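
The order of operations in the handler can be sketched as a user-space
simulation.  This is an assumption-laden model rather than the code in
pSeries_pci.c: struct sim_dev and every sim_*() function are hypothetical,
the reset is modelled as simply clearing the BARs, bridge configuration
is omitted, and the 5-second sleep is replaced by an optional callback so
that the sketch runs instantly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_NUM_BARS 6

/* Hypothetical model of a device's recoverable state. */
struct sim_dev {
    uint32_t bars[SIM_NUM_BARS];       /* live config-space BARs */
    uint32_t saved_bars[SIM_NUM_BARS]; /* copy taken before the reset */
    int driver_bound;                  /* 1 while a driver is attached */
};

static void sim_save_bars(struct sim_dev *d)
{
    memcpy(d->saved_bars, d->bars, sizeof(d->bars));
}

static void sim_restore_bars(struct sim_dev *d)
{
    memcpy(d->bars, d->saved_bars, sizeof(d->bars));
}

/* Stopping the driver; in the kernel this also emits uevents. */
static void sim_unconfig_adapter(struct sim_dev *d)
{
    d->driver_bound = 0;
}

/* Modelled reset: the device forgets its BAR programming. */
static void sim_reset(struct sim_dev *d)
{
    memset(d->bars, 0, sizeof(d->bars));
}

static void sim_enable_slot(struct sim_dev *d)
{
    d->driver_bound = 1;
}

/* The recovery sequence, in the order given in the text above. */
static void sim_handle_eeh_event(struct sim_dev *d,
                                 void (*wait_for_userspace)(void))
{
    assert(d != NULL);

    sim_save_bars(d);
    sim_unconfig_adapter(d);
    if (wait_for_userspace != NULL)
        wait_for_userspace();   /* stands in for the 5-second sleep */
    sim_reset(d);
    sim_restore_bars(d);
    sim_enable_slot(d);
}
```

The model shows why the BARs are saved first: in this simulation the
reset wipes them, so without the saved copy the restarted driver would
find the device unconfigured.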
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 187) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 188) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 189) Device Shutdown and User-Space Events
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 190) -------------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 191) This section documents what happens when a pci slot is unconfigured,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 192) focusing on how the device driver gets shut down, and on how the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 193) events get delivered to user-space scripts.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 194) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 195) Following is an example sequence of events that causes a device driver's
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 196) close function to be called during the first phase of an EEH reset.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 197) The example uses the pcnet32 device driver::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 198) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 199)     rpaphp_unconfig_pci_adapter (struct slot *)  // in rpaphp_pci.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 200)     {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 201)       calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 202)       pci_remove_bus_device (struct pci_dev *) // in /drivers/pci/remove.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 203)       {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 204)         calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 205)         pci_destroy_dev (struct pci_dev *)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 206)         {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 207)           calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 208)           device_unregister (&dev->dev) // in /drivers/base/core.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 209)           {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 210)             calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 211)             device_del (struct device *)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 212)             {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 213)               calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 214)               bus_remove_device() // in /drivers/base/bus.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 215)               {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 216)                 calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 217)                 device_release_driver()
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 218)                 {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 219)                   calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 220)                   struct device_driver->remove() which is just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 221)                   pci_device_remove()  // in /drivers/pci/pci-driver.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 222)                   {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 223)                     calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 224)                     struct pci_driver->remove() which is just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 225)                     pcnet32_remove_one() // in /drivers/net/pcnet32.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 226)                     {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 227)                       calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 228)                       unregister_netdev() // in /net/core/dev.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 229)                       {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 230)                         calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 231)                         dev_close()  // in /net/core/dev.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 232)                         {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 233)                            calls dev->stop();
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 234)                            which is just pcnet32_close() // in pcnet32.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 235)                            {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 236)                              which does what you wanted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 237)                              to stop the device
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 238)                            }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 239)                         }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 240)                      }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 241)                    which
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 242)                    frees pcnet32 device driver memory
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 243)                 }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 244)      }}}}}}
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 245) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 246) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 247) In drivers/pci/pci-driver.c,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 248) struct device_driver->remove() is just pci_device_remove(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 249) which calls struct pci_driver->remove(), which is pcnet32_remove_one(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 250) which calls unregister_netdev() (in net/core/dev.c),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 251) which calls dev_close() (in net/core/dev.c),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 252) which calls dev->stop(), which is pcnet32_close(),
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 253) which then does the appropriate shutdown.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 254) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 255) ---
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 256) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 257) Following is the analogous stack trace for events sent to user-space
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 258) when the pci device is unconfigured::
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 259) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 260)   rpaphp_unconfig_pci_adapter() {              // in rpaphp_pci.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 261)     calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 262)     pci_remove_bus_device (struct pci_dev *) { // in /drivers/pci/remove.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 263)       calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 264)       pci_destroy_dev (struct pci_dev *) {
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 265)         calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 266)         device_unregister (&dev->dev) {        // in /drivers/base/core.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 267)           calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 268)           device_del(struct device * dev) {    // in /drivers/base/core.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 269)             calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 270)             kobject_del() {                    // in /lib/kobject.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 271)               calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 272)               kobject_uevent() {               // in /lib/kobject.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 273)                 calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 274)                 kset_uevent() {                // in /lib/kobject.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 275)                   calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 276)                   kset->uevent_ops->uevent()   // which is really just
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 277)                   a call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 278)                   dev_uevent() {               // in /drivers/base/core.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 279)                     calls
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 280)                     dev->bus->uevent() which is really just a call to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 281)                     pci_uevent () {            // in drivers/pci/hotplug.c
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 282)                       which prints device name, etc....
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 283)                    }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 284)                  }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 285)                  then kobject_uevent() sends a netlink uevent to userspace
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 286)                  --> userspace uevent
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 287)                  (during early boot, nobody listens to netlink events and
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 288)                  kobject_uevent() executes uevent_helper[], which runs the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 289)                  event process /sbin/hotplug)
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 290)              }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 291)            }
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 292)            kobject_del() then calls sysfs_remove_dir(), which would
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 293)            notify any user-space daemon that was watching /sys,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 294)            which would then notice the delete event.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 295) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 296) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 297) Pros and Cons of the Current Design
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 298) -----------------------------------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 299) There are several issues with the current EEH software recovery design,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 300) which may be addressed in future revisions.  But first, note that the
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 301) big plus of the current design is that no changes need to be made to
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 302) individual device drivers, so that the current design throws a wide net.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 303) The biggest negative of the design is that it potentially disturbs
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 304) network daemons and file systems that didn't need to be disturbed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 305) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 306) -  A minor complaint is that resetting the network card causes
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 307)    back-to-back ifdown/ifup burps in user space that potentially disturb
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 308)    network daemons that didn't even need to know that the PCI
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 309)    card was being rebooted.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 310) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 311) -  A more serious concern is that the same reset, for SCSI devices,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 312)    wreaks havoc on mounted file systems.  Scripts cannot post-facto
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 313)    unmount a file system without flushing pending buffers, but this
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 314)    is impossible, because I/O has already been stopped.  Thus,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 315)    ideally, the reset should happen at or below the block layer,
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 316)    so that the file systems are not disturbed.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 317) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 318)    Reiserfs does not tolerate errors returned from the block device.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 319)    Ext3fs seems to be tolerant, retrying reads/writes until it does
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 320)    succeed. Both have been only lightly tested in this scenario.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 321) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 322)    The SCSI-generic subsystem already has built-in code for performing
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 323)    SCSI device resets, SCSI bus resets, and SCSI host-bus-adapter
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 324)    (HBA) resets.  These are cascaded into a chain of attempted
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 325)    resets if a SCSI command fails. These are completely hidden
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 326)    from the block layer.  It would be very natural to add an EEH
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 327)    reset into this chain of events.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 328) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 329) -  If a SCSI error occurs for the root device, all is lost unless
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 330)    the sysadmin had the foresight to run /bin, /sbin, /etc, /var
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 331)    and so on, out of ramdisk/tmpfs.
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 332) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 333) 
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 334) Conclusions
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 335) -----------
^8f3ce5b39 (kx 2023-10-28 12:00:06 +0300 336) There's forward progress ...