Athlon X2 5600+ on an nForce 500 chipset. Ubuntu and Win7 run fine; I recently
moved to SuSE 12.1 64-bit with the standard desktop kernel.
Every once in a while it logs an error to /var/log/messages that looks like this:
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704050] [Hardware Error]: MC0_STATUS[-|CE|-|-|AddrV|CECC]: 0x944ec00000000136
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704070] [Hardware Error]: Data Cache Error: during L1 linefill from L2.
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704078] [Hardware Error]: cache level: L2, tx: DATA, mem-tx: DRD
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704097] [Hardware Error]: Machine check events logged
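For reference, the status word from the log above can be picked apart by hand. This is only a minimal sketch: it assumes the usual x86 machine-check status layout (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57), the AMD K8 ECC bits (CECC=46, UECC=45), and the "memory hierarchy" error-code format 0000 0001 RRRR TTLL in the low 16 bits. Those positions are my reading of the AMD documentation and of what the kernel prints, so treat them as assumptions; the function names are made up for illustration.

```python
# Sketch: decode the MC0_STATUS value from the log (bit positions are assumptions).
STATUS = 0x944EC00000000136

# Assumed flag positions in the 64-bit status register.
FLAG_BITS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57,
    "CECC": 46, "UECC": 45,
}

# Assumed encodings for the memory-hierarchy error code 0000 0001 RRRR TTLL.
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]
TX = ["INSN", "DATA", "GEN"]
LEVEL = ["L0", "L1", "L2", "LG"]


def decode_flags(status):
    """Return a dict mapping each flag name to True/False."""
    return {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}


def decode_memory_error_code(status):
    """Decode the low 16 bits if they use the memory-hierarchy format, else None."""
    code = status & 0xFFFF
    if (code >> 8) != 0x01:
        return None  # some other error-code format; not handled in this sketch
    rrrr = (code >> 4) & 0xF
    tt = (code >> 2) & 0x3
    ll = code & 0x3
    return {
        "level": LEVEL[ll],
        "tx": TX[tt] if tt < len(TX) else "reserved",
        "mem_tx": MEM_TX[rrrr] if rrrr < len(MEM_TX) else "reserved",
    }


if __name__ == "__main__":
    set_flags = [name for name, on in decode_flags(STATUS).items() if on]
    print("flags set:", set_flags)
    print("error code:", decode_memory_error_code(STATUS))
```

Under those assumptions the result agrees with the kernel's own text: a corrected (CE, CECC) error with a valid address, cache level L2, tx DATA, mem-tx DRD.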
What's strange is that it *only* happens at kernel timestamps (seconds since
boot) that are multiples of 50 seconds:
/var/log # grep -i cache\ error messages | cut -c1-68
Jun 20 13:19:35 linux-3gig kernel: [ 1800.701044] [Hardware Error]:
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704070] [Hardware Error]:
Jun 20 13:37:05 linux-3gig kernel: [ 2850.704042] [Hardware Error]:
Jun 24 20:40:34 linux-3gig kernel: [21000.701044] [Hardware Error]:
Jun 25 20:36:40 linux-3gig kernel: [ 8100.704028] [Hardware Error]:
Jun 29 22:53:52 linux-3gig kernel: [ 1500.704022] [Hardware Error]:
Jun 29 23:16:22 linux-3gig kernel: [ 2850.704030] [Hardware Error]:
Jun 29 23:28:52 linux-3gig kernel: [ 3600.704065] [Hardware Error]:
Jun 29 23:46:22 linux-3gig kernel: [ 4650.704023] [Hardware Error]:
Jun 30 00:03:52 linux-3gig kernel: [ 5700.704028] [Hardware Error]:
Jun 30 00:11:22 linux-3gig kernel: [ 6150.704023] [Hardware Error]:
Jun 30 00:43:52 linux-3gig kernel: [ 8100.704023] [Hardware Error]:
Jul 1 16:15:06 linux-3gig kernel: [ 600.701025] [Hardware Error]:
Jul 1 17:55:06 linux-3gig kernel: [ 6600.704023] [Hardware Error]:
Jul 1 18:12:36 linux-3gig kernel: [ 7650.704037] [Hardware Error]:
Jul 2 22:47:25 linux-3gig kernel: [ 300.701022] [Hardware Error]:
Jul 2 22:52:25 linux-3gig kernel: [ 600.704029] [Hardware Error]:
Jul 6 00:41:11 linux-3gig kernel: [ 1500.701028] [Hardware Error]:
(Come to think of it, this is not even regular, as then it would have to
happen on 00 and 30.)
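A quick way to check the pattern rather than eyeballing it: the sketch below pulls the kernel timestamps out of the grep output above and computes their greatest common divisor. The variable names are just for illustration, and the log text is pasted in as a string so the snippet stands alone.

```python
# Sketch: check how regular the kernel timestamps in the grep output are.
import re
from functools import reduce
from math import gcd

LOG = """\
Jun 20 13:19:35 linux-3gig kernel: [ 1800.701044] [Hardware Error]:
Jun 20 13:24:35 linux-3gig kernel: [ 2100.704070] [Hardware Error]:
Jun 20 13:37:05 linux-3gig kernel: [ 2850.704042] [Hardware Error]:
Jun 24 20:40:34 linux-3gig kernel: [21000.701044] [Hardware Error]:
Jun 25 20:36:40 linux-3gig kernel: [ 8100.704028] [Hardware Error]:
Jun 29 22:53:52 linux-3gig kernel: [ 1500.704022] [Hardware Error]:
Jun 29 23:16:22 linux-3gig kernel: [ 2850.704030] [Hardware Error]:
Jun 29 23:28:52 linux-3gig kernel: [ 3600.704065] [Hardware Error]:
Jun 29 23:46:22 linux-3gig kernel: [ 4650.704023] [Hardware Error]:
Jun 30 00:03:52 linux-3gig kernel: [ 5700.704028] [Hardware Error]:
Jun 30 00:11:22 linux-3gig kernel: [ 6150.704023] [Hardware Error]:
Jun 30 00:43:52 linux-3gig kernel: [ 8100.704023] [Hardware Error]:
Jul 1 16:15:06 linux-3gig kernel: [ 600.701025] [Hardware Error]:
Jul 1 17:55:06 linux-3gig kernel: [ 6600.704023] [Hardware Error]:
Jul 1 18:12:36 linux-3gig kernel: [ 7650.704037] [Hardware Error]:
Jul 2 22:47:25 linux-3gig kernel: [ 300.701022] [Hardware Error]:
Jul 2 22:52:25 linux-3gig kernel: [ 600.704029] [Hardware Error]:
Jul 6 00:41:11 linux-3gig kernel: [ 1500.701028] [Hardware Error]:
"""

# Kernel timestamps look like "[ 1800.701044]"; "[Hardware Error]" does not match.
stamps = re.findall(r"\[\s*(\d+)\.(\d+)\]", LOG)
whole_seconds = [int(sec) for sec, _ in stamps]
fractions = [float("0." + frac) for _, frac in stamps]

common_step = reduce(gcd, whole_seconds)

if __name__ == "__main__":
    print("entries:", len(whole_seconds))
    print("largest common step (s):", common_step)
    print("fraction range:", min(fractions), "to", max(fractions))
```

On the entries listed here, every timestamp is not just a multiple of 50 but of 150 seconds, always about 0.70 s past the whole second. That looks more like something checking on a timer than errors turning up at random moments, though that part is only a guess.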
I would have suspected a hardware error, but that does not fit the timing
pattern, and the kernels from various Ubuntu releases never complained.
Plus, the machine is perfectly stable from a user's perspective.
Here's the output of # lspci -vvv (below).
Regards,
Dex
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
Subsystem: ASUSTeK Computer Inc. Device 8239
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
TAbort-
Reset- FastB2B-
PriDiscTmr- SecDiscTmr+ DiscTmrStat- DiscTmrSERREn+
Capabilities: [b8] Subsystem: nVidia Corporation Device cb84
Capabilities: [8c] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a3)
Subsystem: ASUSTeK Computer Inc. Device 8239
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
TAbort-
Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: nVidia Corporation Device 0000
Capabilities: [48] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/2 Maskable- 64bit+
Address: 00000000fee0300c Data: 4129
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
Capabilities: [80] Express (v1) Root Port (Slot+), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns,
L1 <1us
ExtTag- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr-
TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1,
Latency L0 <512ns, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+
DLActive+ BWMgmt- ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug-
Surprise-
Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq-
LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+
Interlock-
Changed: MRL- PresDet+ LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna-
CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=WRR32
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Kernel driver in use: pcieport
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
HyperTransport Technology Configuration
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
TAbort-