On Mon, Jul 28, 2003 at 02:59:02PM +0200, Andreas Jaeger wrote:
Gregor Stößer <stoesser@chemie.uka.de> writes:
Hi List,
I'm having some trouble with my Opteron box. It runs fine for some days and then dies :-( I get the following message:
aoc2pc75 kernel: Northbridge Machine Check Exception f61b200100000813 0
Is this the whole message?
Yes, that's all I can see because the machine freezes and I have to reset it. But there is some kind of information in /var/log/messages:

Jul 28 15:08:10 aoc2pc75 kernel: Northbridge status a40000000005001b
Jul 28 15:08:10 aoc2pc75 kernel: GART TLB error generic level generic
Jul 28 15:08:10 aoc2pc75 kernel: extended error gart error
Jul 28 15:08:10 aoc2pc75 kernel: link number 0
Jul 28 15:08:10 aoc2pc75 kernel: error address valid
Jul 28 15:08:10 aoc2pc75 kernel: error uncorrected
Jul 28 15:08:10 aoc2pc75 kernel: previous error lost
Jul 28 15:08:10 aoc2pc75 kernel: error address 0000000009470de8

This message appears every 90 seconds.
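For readers who want to see how the status word a40000000005001b maps onto the decoded lines in that log, here is a minimal sketch. It is not the kernel's own decoder: the bit positions are an assumption based on the general x86 machine check status layout (valid, uncorrected and address-valid flags in the top bits, extended error code in bits 19:16, error code in bits 15:0), and the function name and lookup tables are made up for illustration. Only the extended code that appears in the log is named. Check the processor documentation before relying on any of this.

```python
# Assumed field layout of a 64-bit machine check status word.
# The lookup tables below are illustrative, not taken from the kernel source.
TRANSACTION = {0: "instruction", 1: "data", 2: "generic"}
LEVEL = {0: "level 0", 1: "level 1", 2: "level 2", 3: "generic"}
EXTENDED = {5: "gart error"}  # only the value seen in the log is named


def decode_nb_status(status: int) -> dict:
    """Split a status word into the fields discussed in this thread."""
    error_code = status & 0xFFFF
    ext_code = (status >> 16) & 0xF
    fields = {
        "valid": bool((status >> 63) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "address_valid": bool((status >> 58) & 1),
        "extended": EXTENDED.get(ext_code, "code %d" % ext_code),
        "error_code": error_code,
    }
    # Assumed TLB error format: 0000 0000 0001 TTLL
    if (error_code & 0xFFF0) == 0x0010:
        fields["kind"] = "TLB"
        fields["transaction"] = TRANSACTION.get((error_code >> 2) & 3, "reserved")
        fields["level"] = LEVEL[error_code & 3]
    return fields


if __name__ == "__main__":
    for name, value in decode_nb_status(0xA40000000005001B).items():
        print(name, value)
```

Under these assumptions the low half 0005001b yields extended code 5 and a TLB error with transaction "generic" and level "generic", which agrees with the "GART TLB error generic level generic" and "extended error gart error" lines in the log.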
Kernel is 2.4.19 as shipped with SuSE 8.2 x86_64.
OK, that kernel has fixed machine check exception handling, so if you get one, it generally indicates a hardware problem.
Sounds bad ... I'll try a newer kernel and turn machine check exceptions off; maybe it will run stably then.
Andreas
--
Andreas Jaeger, aj@suse.de, http://www.suse.de/~aj
SuSE Linux AG, Deutschherrnstr. 15-19, 90429 Nürnberg, Germany
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
Thanks
Gregor
--
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Gregor Stößer
email: gregor.stoesser@chemie.uni-karlsruhe.de
Institut für Anorganische Chemie
Universität Karlsruhe
Tel: 0721/608 2988
Fax: 0721/608 4854
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-