Hi List,

I'm having some trouble with my Opteron box. It runs fine for some days and then dies :-( I get the following message:

aoc2pc75 kernel: Northbridge Machine Check Exception f61b200100000813 0

The kernel is 2.4.19 as shipped with SuSE 8.2 x86_64. I'm using an MSI K8D Master board with two Opteron 240 processors, 6 GB RAM, 1 IDE and 3 SCSI disks; the SCSI controller is a Symbios Logic 53c895.

Any hints?

Gregor
--
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Gregor Stößer
email: gregor.stoesser@chemie.uni-karlsruhe.de
Institut für Anorganische Chemie
Universität Karlsruhe
Tel: 0721/608 2988
Fax: 0721/608 4854
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Hate to say it, but this is possibly a mainboard/memory instability rather than a kernel problem, given the success everyone else is seeing.

Andrew Cotterill
Senior Technical Engineer
http://www.epox.org
Please provide original Text if replying.
_____
CONFIDENTIALITY NOTICE: This Email is confidential and may also be privileged. If you are not the intended recipient, please notify the sender IMMEDIATELY; you should not copy the email or use it for any purpose or disclose its contents to any other person.
GENERAL STATEMENT: Any statements made, or intentions expressed in this communication may not necessarily reflect the view of Actron Electronics. Be advised that no content herein may be held binding upon Actron Electronics or any associated company unless confirmed by the issuance of a formal contractual document or Purchase Order. E&OE. All specifications subject to change without prior notice.
_____
-----Original Message-----
From: Gregor Stößer [mailto:stoesser@chemie.uka.de]
Sent: 28 July 2003 13:49
To: suse-amd64@suse.com
Subject: [suse-amd64] kernel trouble
--
Check the List-Unsubscribe header to unsubscribe
For additional commands, email: suse-amd64-help@suse.com
Gregor Stößer <stoesser@chemie.uka.de> writes:
Hi List,
I'm having some trouble with my Opteron box. It runs fine for some days and then dies :-( I get the following message:
aoc2pc75 kernel: Northbridge Machine Check Exception f61b200100000813 0
Is this the whole message?
Kernel is 2.4.19 as shipped with SuSE 8.2 x86_64.
OK, that kernel has fixed machine check exception handling, so if you get one, it generally indicates a hardware problem.

Andreas
--
Andreas Jaeger, aj@suse.de, http://www.suse.de/~aj
SuSE Linux AG, Deutschherrnstr. 15-19, 90429 Nürnberg, Germany
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
On Mon, Jul 28, 2003 at 02:59:02PM +0200, Andreas Jaeger wrote:
Gregor Stößer <stoesser@chemie.uka.de> writes:
Hi List,
I'm having some trouble with my Opteron box. It runs fine for some days and then dies :-( I get the following message:
aoc2pc75 kernel: Northbridge Machine Check Exception f61b200100000813 0
Is this the whole message?
Yes, that's all I can see, because the machine freezes and I have to reset it. But there is some more information in /var/log/messages:

Jul 28 15:08:10 aoc2pc75 kernel: Northbridge status a40000000005001b
Jul 28 15:08:10 aoc2pc75 kernel: GART TLB error generic level generic
Jul 28 15:08:10 aoc2pc75 kernel: extended error gart error
Jul 28 15:08:10 aoc2pc75 kernel: link number 0
Jul 28 15:08:10 aoc2pc75 kernel: error address valid
Jul 28 15:08:10 aoc2pc75 kernel: error uncorrected
Jul 28 15:08:10 aoc2pc75 kernel: previous error lost
Jul 28 15:08:10 aoc2pc75 kernel: error address 0000000009470de8

This message appears every 90 seconds.
Kernel is 2.4.19 as shipped with SuSE 8.2 x86_64.
OK, that kernel has fixed machine check exception handling, so if you get one, it generally indicates a hardware problem.
Sounds bad ... I'll try a newer kernel and turn machine check exceptions off; maybe it will run stably then.
Thanks,
Gregor
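For anyone who wants to read the raw "Northbridge status" value in the log above by hand, here is a minimal Python sketch. It is not taken from the kernel source. The bit positions are an assumption based on AMD's documented machine-check status register layout (flag bits in 63..57, extended error code in bits 19..16, TLB error codes of the form 0000 0000 0001 TTLL), and the function and table names are made up for illustration.

```python
# Sketch: decode a K8 northbridge machine-check status word.
# ASSUMPTION: bit positions follow AMD's documented MCi_STATUS layout;
# they are not stated anywhere in this thread.

TRANSACTION = ["instruction", "data", "generic", "reserved"]  # TT field
LEVEL = ["L0", "L1", "L2", "generic"]                         # LL field


def decode_status(status: int) -> dict:
    """Split a 64-bit status value into flag bits and error-code fields."""
    low = status & 0xFFFFFFFF
    error_code = low & 0xFFFF
    info = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "enabled": bool((status >> 60) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "context_corrupt": bool((status >> 57) & 1),
        "ext_code": (low >> 16) & 0xF,
        "error_code": error_code,
    }
    # TLB errors are encoded as 0000 0000 0001 TTLL.
    if (error_code & 0xFFF0) == 0x0010:
        info["kind"] = "tlb"
        info["transaction"] = TRANSACTION[(error_code >> 2) & 3]
        info["level"] = LEVEL[error_code & 3]
    return info


if __name__ == "__main__":
    # Status value copied from the /var/log/messages excerpt.
    for name, value in decode_status(0xA40000000005001B).items():
        print(f"{name}: {value}")
```

For the value from the log, this reports a TLB error with transaction "generic" and level "generic", extended code 5, the uncorrected flag set and the address-valid flag set, which lines up with the "GART TLB error generic level generic", "error uncorrected" and "error address valid" lines. Under this layout the overflow bit is clear, while the log prints "previous error lost"; the thread does not say how the kernel derives that line, so treat the flag naming here as an assumption.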
The messages are merely an event log of an error in the hardware (the MSI board), not in the kernel.

Andrew Cotterill
Senior Technical Engineer
http://www.epox.org
Please provide original Text if replying.
_____
-----Original Message-----
From: Gregor Stößer [mailto:stoesser@chemie.uka.de]
Sent: 28 July 2003 14:17
To: Andreas Jaeger
Cc: suse-amd64@suse.com
Subject: Re: [suse-amd64] kernel trouble
participants (3)
- Andreas Jaeger
- Andrew Cotterill (EPoX UK)
- Gregor Stößer