System does not reboot after injecting uncorrectable PCIE errors via aer-inject

CPU info:
root@node:~# cat /sys/devices/cpu/caps/pmu_name 
broadwell
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          20
On-line CPU(s) list:             0-11
Off-line CPU(s) list:            12-19
Thread(s) per core:              1
Core(s) per socket:              10
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           79
Model name:                      Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz
$uname -a
Linux smirnoff-node 5.2.60-rt15-LTS19 #1 SMP Mon Nov 21 19:33:51 PST 2022 x86_64 x86_64 x86_64 GNU/Linux

I am trying to simulate a PCIe error and validate aer-inject functionality on a Linux machine, using the error file below:

 cat /root/pcie.err
### AER Inject Error file
## DEVICE: Ethernet controller: Broadcom Inc. and subsidiaries Device b045
##-----------------------------------
AER
BUS 0x4a DEV 00 FN 0
UNCOR_STATUS TRAIN
HEADER_LOG 7 1 2 5
$aer-inject pcie.err
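
As a cross-check on what this error file injects, the status word reported in the kernel log can be decoded by hand. The sketch below is only illustrative and is not part of aer-inject. The bit positions follow the PCI_ERR_UNC_* definitions in the kernel's include/uapi/linux/pci_regs.h as far as I know, while the function names and the text labels are my own (hypothetical). It also assumes the aer_inject log line prints the correctable status first and the uncorrectable status second. Bit 0 is the bit the TRAIN keyword sets, which current kernels print as "Undefined".

```python
import re

# Bit positions of the AER Uncorrectable Error Status register, following
# the PCI_ERR_UNC_* definitions in include/uapi/linux/pci_regs.h.
# The text labels are illustrative and not copied from any tool.
UNCOR_BITS = {
    0: "Undefined (formerly Training Error)",
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
}


def decode_uncor_status(status):
    """Return one label per bit set in a 32-bit uncorrectable status word."""
    return [
        "[%2d] %s" % (bit, UNCOR_BITS.get(bit, "unknown or reserved"))
        for bit in range(32)
        if status & (1 << bit)
    ]


def parse_injected(line):
    """Extract (cor_status, uncor_status) from an aer_inject log line.

    Assumes the 'Injecting errors COR/UNCOR' layout seen in the log.
    """
    m = re.search(r"Injecting errors ([0-9a-fA-F]{8})/([0-9a-fA-F]{8})", line)
    if m is None:
        raise ValueError("not an aer_inject line")
    return int(m.group(1), 16), int(m.group(2), 16)


if __name__ == "__main__":
    line = ("pcieport 0000:00:03.1: aer_inject: Injecting errors "
            "00000000/00000001 into device 0000:4a:00.0")
    cor, uncor = parse_injected(line)
    for label in decode_uncor_status(uncor):
        print(label)
```

With the values from my log, only bit 0 of the uncorrectable status is set, which matches the "[ 0] Undefined" line the AER driver prints.
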
Kernel panic logs after aer-inject:
=======
- The system gets stuck and does not reboot.

pcieport 0000:25:02.0: BAR 13: failed to assign [io  size 0x1000]
perf: interrupt took too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 79000
perf: interrupt took too long (3132 > 3127), lowering kernel.perf_event_max_sample_rate to 63000
perf: interrupt took too long (3916 > 3915), lowering kernel.perf_event_max_sample_rate to 51000
pcieport 0000:00:03.1: aer_inject: Injecting errors 00000000/00000001 into device 0000:4a:00.0
pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:4a:00.0
linux-kernel-bde 0000:4a:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
linux-kernel-bde 0000:4a:00.0: AER:   device [14e4:b045] error status/mask=00000001/00000000
linux-kernel-bde 0000:4a:00.0: AER:    [ 0] Undefined              (First)
pcieport 0000:00:03.1: AER: Device recovery failed
kvm: exiting hardware virtualization
sd 5:0:0:0: [sdb] Synchronizing SCSI cache
sd 0:0:0:0: [sda] Synchronizing SCSI cache
reboot: Restarting system
printk: enabled sync mode
watchdog: BUG: soft lockup - CPU#10 stuck for 134s! [lcmd:10451]
sd 0:0:0:0: timing out command, waited 180s
printk: console [ttyS0]: printing thread stopped
reboot: machine restart
Modules linked in: vhost_net vhost macvtap tap xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 iptable_mangle iptable_nat ebtable_filter ebtables linux_user_bde(PO) linux_kernel_bde(PO) xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_intel kvm vfio_pci vfio_virqfd vfio_iommu_type1 vfio pci_stub uio_pci_hostif i40e(O) qfx_pci_static_map(O) macvlan socktun(O) i2c_dev uio_fpga(O) uio iTCO_wdt iTCO_vendor_support watchdog intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crct10dif_common aesni_intel aes_x86_64 glue_helper crypto_simd cryptd i2c_i801 lpc_ich igb(O) configfs pcc_cpufreq sch_fq_codel nfsd openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 irqbypass fuse [last unloaded: kvm]
CPU: 10 PID: 10451 Comm: lcmd Kdump: loaded Tainted: P           O      5.2.60-rt15-LTS19 #1
Hardware name: Juniper Networks Inc. 0CA3/0CA3, BIOS CBEP_P_VAL1_00.15.01 10/30/2018
**Shutting down cpus with NMI**

  • The reboot gets stuck here, and a hard power cycle is needed to recover the setup. The issue reproduces in 6 out of 10 iterations.
  • Current analysis of the reboot path: the issue reproduces when the reboot goes through the NMI_VECTOR path.

- The reboot path tries to stop all other active CPUs before rebooting by sending the REBOOT_VECTOR IPI, whose handler shuts each CPU down. In the non-working case REBOOT_VECTOR fails: 2 CPUs are still active, so the kernel tries to force them off using the NMI_VECTOR IPI.

  • Since the active CPUs are not stopped (possibly because of locking or some other reason), NMI_VECTOR is used to force them off, but in this case NMI_VECTOR also fails to stop the CPUs, which causes a deadlock or hang during reboot.
Working case, where all CPUs are active and all of them stop:
apic->send_IPI_allbutself(REBOOT_VECTOR);
/linux/v5.2.21/source/arch/x86/kernel/smp.c#L218
Non-working case, where two CPUs are unable to stop:
apic->send_IPI_allbutself(NMI_VECTOR);
/linux/v5.2.21/source/arch/x86/kernel/smp.c#L244
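
To make the two cases above concrete, here is a toy model of the escalation as I understand it from smp.c: REBOOT_VECTOR first, then NMI_VECTOR for whatever is still online. It is a user-space sketch only, not kernel code. The function stop_other_cpus_model, the per-CPU "responds to" sets, and the CPU ids in the example are all hypothetical, and the kernel's timeouts are not modelled.

```python
def stop_other_cpus_model(responds_to):
    """Toy model of the two-stage CPU stop on the reboot path.

    responds_to maps a CPU id to the set of IPI vectors that CPU is
    assumed to act on ("REBOOT_VECTOR", "NMI_VECTOR"). Returns the CPUs
    still online at the end and the list of vectors that were sent.
    """
    online = set(responds_to)
    sent = []

    # Stage 1: ordinary IPI. A CPU running with interrupts disabled
    # would not act on it.
    sent.append("REBOOT_VECTOR")
    online = {cpu for cpu in online if "REBOOT_VECTOR" not in responds_to[cpu]}

    # Stage 2: only reached if some CPU ignored stage 1.
    if online:
        sent.append("NMI_VECTOR")
        online = {cpu for cpu in online if "NMI_VECTOR" not in responds_to[cpu]}

    return sorted(online), sent


if __name__ == "__main__":
    both = {"REBOOT_VECTOR", "NMI_VECTOR"}
    # Working case: every other CPU stops on REBOOT_VECTOR.
    print(stop_other_cpus_model({1: both, 2: both}))
    # Non-working case as I understand it: two CPUs (ids are made up)
    # act on neither vector and stay online.
    print(stop_other_cpus_model({1: both, 10: set(), 11: set()}))
```

In this model the hang corresponds to the second call: CPUs are still online after NMI_VECTOR has been sent. Why the real CPUs do not act on the NMI is the part I cannot explain.
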
  • In the failure case the reboot goes through the NMI_VECTOR path and gets stuck.
  • I need help understanding this behavior in order to fix the issue.
  • Looking forward to your responses on this issue. Please let me know if any further information is required.

Thank you !!