Mailinglist Archive: opensuse (795 mails)

[opensuse] Re: mcelog: CPU 6 on socket 1 received Bus and Interconnect Errors in Other-transaction
  • From: mh@xxxxxxxxxxxxxxx (Michael Hirmke)
  • Date: 01 Nov 2017 12:22:00 +0100
  • Message-id: <EC05Wwf$pfB@mike.franken.de>
push

Hi *,

On my main server, after a cold boot, I see the following messages in my
journal:

...
kernel: mce: [Hardware Error]: CPU 6: Machine Check: 0 Bank 20: c8012a4000200e0f
kernel: mce: [Hardware Error]: TSC 0 mce: MISC 800000 mce:
kernel: mce: [Hardware Error]: PROCESSOR 0:306f2 TIME 1508767394 SOCKET 1 APIC 10 microcode 36
mcelog[2635]: CPU 6 on socket 1 received Bus and Interconnect Errors in Other-transaction
mcelog[2636]: Location: CPU 6 on socket 1
...
systemd[1]: Starting Machine Check Exception Logging Daemon...
systemd[1]: Started Machine Check Exception Logging Daemon.
mcelog[2628]: Hardware event. This is not a software error.
mcelog[2628]: MCE 0
mcelog[2628]: CPU 6 BANK 20
mcelog[2628]: MISC 800000
mcelog[2628]: TIME 1508767394 Mon Oct 23 16:03:14 2017
mcelog[2628]: MCG status:
mcelog[2628]: MCi status:
mcelog[2628]: Error overflow
mcelog[2628]: Corrected error
mcelog[2628]: MCi_MISC register valid
mcelog[2628]: MCA: BUS error: 1 6 Level-3 Generic Generic Other-transaction Request-did-not-timeout
mcelog[2628]: Running trigger `bus-error-trigger'
mcelog[2628]: QPI:
mcelog[2628]: Intel QPI physical layer detected a QPI in-band reset but aborted initialization
mcelog[2628]: STATUS c8012a4000200e0f MCGSTATUS 0
mcelog[2628]: MCGCAP 7000c16 APICID 10 SOCKETID 1
mcelog[2628]: CPUID Vendor Intel Family 6 Model 63
mcelog[2628]: <27>Oct 23 16:04:47 mcelog: CPU 6 on socket 1 received Bus and Interconnect Errors in Other-transaction
mcelog[2628]: <27>Oct 23 16:04:47 mcelog: Location: CPU 6 on socket 1
...
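
For reference, the STATUS value from the log can also be decoded by hand.
What follows is only a minimal sketch, assuming the architectural
IA32_MCi_STATUS bit layout and the "bus and interconnect" compound error
code format (0000 1PPT RRRR IILL) as documented in the Intel SDM. The
function name, dictionary keys and label strings are made up for
illustration and are not taken from mcelog.

```python
# Illustrative decoder for an IA32_MCi_STATUS value (not mcelog's own code).
# Bit positions are assumed from the Intel SDM; all names are hypothetical.

STATUS = 0xC8012A4000200E0F  # value reported in the journal

# Sub-field labels of the bus/interconnect compound error code.
PP = ["SRC", "RES", "OBS", "generic"]                   # participation
II = ["memory", "reserved", "I/O", "other-transaction"]  # transaction type
LL = ["L0", "L1", "L2", "generic"]                       # hierarchy level


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into its architectural fields."""
    mca = status & 0xFFFF  # bits 15:0, architectural MCA error code
    result = {
        "valid": bool((status >> 63) & 1),        # VAL
        "overflow": bool((status >> 62) & 1),     # OVER
        "uncorrected": bool((status >> 61) & 1),  # UC (0 = corrected)
        "enabled": bool((status >> 60) & 1),      # EN
        "miscv": bool((status >> 59) & 1),        # MISC register valid
        "addrv": bool((status >> 58) & 1),        # ADDR register valid
        "pcc": bool((status >> 57) & 1),          # processor context corrupt
        "mca_code": mca,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
    }
    # Bus/interconnect errors match 0000 1PPT RRRR IILL:
    # bits 15:12 are zero and bit 11 is set.
    if (mca & 0xF800) == 0x0800:
        result["bus_error"] = {
            "participation": PP[(mca >> 9) & 0x3],
            "timeout": bool((mca >> 8) & 1),
            "request": (mca >> 4) & 0xF,  # 0 = generic request
            "transaction": II[(mca >> 2) & 0x3],
            "level": LL[mca & 0x3],
        }
    return result


decoded = decode_status(STATUS)
```

Under these assumptions the flags agree with what mcelog printed: overflow
set, uncorrected clear (a corrected error), MISC valid, and a generic bus
error in an "other" transaction that did not time out.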

This is with kernel 4.4.90-28 on openSUSE Leap 42.3, but after checking
older journal entries I saw that it also happened with 4.4.87-25.
Machine specs:

- Supermicro X10DRi/X10DRi, BIOS 2.0 12/28/2015
- 2 x 6 core CPU Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
- 128 GB RAM

The error does not occur when the OS is merely restarted, only after a cold
boot of the machine.

I couldn't find any relevant information on the net.
Is the CPU in socket 1 damaged?
Can I do anything to correct the problem, or can I just ignore it?

--
Michael Hirmke


