I got a kernel BUG, I don't know why it was triggered.
[ 242.337362] kernel BUG at arch/x86/kernel/cpu/mce/core.c:1364!
[ 242.337366] invalid opcode: 0000 [#1] SMP NOPTI
This is CentOS 8.5, Kernel 4.18.0-348.el8.x86_64 on an x86_64.
Line 1364 of core.c is:
nmi_exit();
(the line above is inside do_machine_check()):
By checking nmi_exit()
https://elixir.bootlin.com/linux/v4.18/source/include/linux/hardirq.h#L78
#define nmi_exit()						\
	do {							\
		trace_hardirq_exit();				\
		rcu_nmi_exit();					\
		BUG_ON(!in_nmi());				\
		preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET);	\
		ftrace_nmi_exit();				\
		lockdep_on();					\
		printk_nmi_exit();				\
	} while (0)
It looks like I hit this BUG_ON(!in_nmi()); but I checked do_machine_check(), and it should still be in_nmi at that point (since line 1255 calls nmi_enter();). Why was BUG_ON(!in_nmi()); triggered?
Others:
https://vault.centos.org/8.5.2111/BaseOS/Source/SPackages/kernel-4.18.0-348.el8.src.rpm
I got a kernel BUG, I don't know why it was triggered?
The specific bug you experienced was reported as invalid opcode: 0000 [#1] SMP NOPTI.
I'll address that, its cause, and how to resolve the issue. First, I'll define some terminology.
An NMI (non-maskable interrupt) is a hardware interrupt that is exempt from any interrupt masking enabled by the operating system (e.g. CentOS 8.5). In nearly every situation, it is raised in response to non-recoverable hardware errors.
Linux has had Intel nested NMI support for as long as I can remember. A vulnerability in the Intel nested NMI support existed in 2012. Intel has an NMI iret flaw that requires the NMI handler to avoid triggering a page fault or breakpoint while processing an NMI.
Support for nested NMI in ARM64 and PowerPC was committed to Linux on May 20th, 2020.
Starting in Linux 2.6, BUG_ON() is a debugging macro for when something goes terribly wrong. If the condition passed to the macro is true, the Linux kernel executes an invalid instruction (ud2 on x86). This results in the CPU raising an invalid opcode exception, which is why the log reports invalid opcode. Normally, if this happens in a process, the process dies. If this happens during an NMI, it's far more serious.
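To make the "the process dies" part concrete, here is a minimal user-space sketch of the BUG_ON() idea. It is not the kernel's implementation: MODEL_BUG_ON and signal_from_check are hypothetical names made up for this illustration, and abort() stands in for the invalid instruction the kernel would execute. The check runs in a forked child so the parent can report how the child ended.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical user-space stand-in for BUG_ON(): if the condition is
 * true, report the location and kill the current process. The kernel
 * would execute an invalid instruction here; abort() is an assumption
 * chosen to keep the sketch portable.
 */
#define MODEL_BUG_ON(cond)						\
	do {								\
		if (cond) {						\
			fprintf(stderr, "model BUG at %s:%d\n",		\
				__FILE__, __LINE__);			\
			abort();					\
		}							\
	} while (0)

/*
 * Run the check in a child process.
 * Returns the signal that killed the child, 0 if the child exited
 * normally, or -1 if fork()/waitpid() failed.
 */
int signal_from_check(int cond)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		MODEL_BUG_ON(cond);	/* dies here when cond is true */
		_exit(0);		/* reached only when cond is false */
	}
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling signal_from_check(0) lets the child exit normally, while signal_from_check(1) has the child killed by SIGABRT. Inside an NMI handler there is no such parent to clean up afterwards, which is part of why the same failure is so much more serious there.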
So in_nmi() checks whether the NMI bit in the current preempt count is set.
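The bookkeeping behind in_nmi() can be modelled in user space as below. This is a sketch only: the bit widths are taken from the upstream v4.18 include/linux/preempt.h (the RHEL 4.18.0-348 kernel carries backports and may differ), and the model_* functions are hypothetical names that operate on a plain counter instead of the real per-CPU preempt count.

```c
#include <stdbool.h>

/* Field widths as in upstream v4.18 preempt.h (assumed, may differ on RHEL). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

#define HARDIRQ_SHIFT	(PREEMPT_BITS + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define HARDIRQ_OFFSET	(1UL << HARDIRQ_SHIFT)
#define NMI_OFFSET	(1UL << NMI_SHIFT)
#define NMI_MASK	(((1UL << NMI_BITS) - 1) << NMI_SHIFT)

/* Non-zero when the NMI field of the modelled counter is set. */
unsigned long model_in_nmi(const unsigned long *count)
{
	return *count & NMI_MASK;
}

/* Models only the counter update performed by nmi_enter(). */
void model_nmi_enter(unsigned long *count)
{
	*count += NMI_OFFSET + HARDIRQ_OFFSET;
}

/*
 * Models the check and counter update performed by nmi_exit().
 * Returns true if BUG_ON(!in_nmi()) would have fired, in which case
 * the counter is left untouched.
 */
bool model_nmi_exit(unsigned long *count)
{
	if (!model_in_nmi(count))
		return true;
	*count -= NMI_OFFSET + HARDIRQ_OFFSET;
	return false;
}
```

With a counter starting at 0, one model_nmi_enter() followed by one model_nmi_exit() is balanced and the check stays quiet. A second model_nmi_exit() without a matching enter finds the NMI bit already clear, which is the condition BUG_ON(!in_nmi()) reports.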
The NOPTI in that log line means the Meltdown mitigation (Kernel Page Table Isolation) is disabled. Typically nopti is added to the kernel boot options to disable it.
What can I do about it if it's the software? Most likely
Well, how about if it's a hardware defect? Less likely
A working theory for this issue is that the NMIs are being caused by an uncorrected hardware memory error, and that a coding error creates circumstances in which BUG_ON(!in_nmi()) is checked before the second increment has been applied to the preempt counter.
In this particular case, the original poster used a tool, einj_mem_uc, to generate a simulated memory error. That initiates the NMI.