Hello,

our four-way Opteron 842 (Quartet MB) with 8GB of RAM running Suse 9.1 has some problems: after some time (always different) the system hangs completely. I've attached a console to check for error messages, but there are none. The system just hangs without complaining. The next step was to reduce the memory to 4GB, but nothing changed. I've also tried to run the memcheck from the DVD, but immediately after starting the check, an error (something like 'unexpected interrupt') occurs and the test is aborted.

Does anybody have a good idea or perhaps the same problem? Until now, I don't even know if it is a software or hardware problem.

Regards,
Holger

--
Holger Fröning (mailto:holger@ra.ti.uni-mannheim.de)
University of Mannheim - Computer Architecture Group
URL: http://ra.ti.uni-mannheim.de
--- Holger Froening <holger@mufasa.informatik.uni-mannheim.de> wrote:
> Does anybody have a good idea or perhaps the same problem? Until now, I don't even know if it is a software or hardware problem.
hello,

i had a very similar problem with my amd athlon64: random crashes, kernel oopses, reboots, ... but no usable error messages. in my case a bios update solved all my stability problems.

best regards
Franz
> Does anybody have a good idea or perhaps the same problem? Until now, I don't even know if it is a software or hardware problem.
I had the same problem with respect to memtest86. Download the latest version from memtest86.com. It worked fine for me.

--
Johannes Ullrich jullrich@euclidian.com
contact: http://johannes.homepc.org/contact.htm
I had mysterious "freezes" with my Dual Opteron server with 2 GB of RAM when I got it a few weeks back. The first time I actually got a Machine Check Exception. I flashed the BIOS to a much newer version but it still froze (with no messages!).

I ran Memtest86+ for a FULL day and nothing showed up as faulty. HOWEVER, running the Bonnie disk benchmark would cause the system to halt! Sometimes it would halt immediately, sometimes it would last a few minutes. I'd get it to crash faster if I ran a few at once.

I thought maybe it was something to do with the SLES 8 AMD64 kernel and my motherboard (TYAN 2882) or the LSI U320 SCSI controller. Luckily I had an IDENTICAL server which I shifted the disks into, and that server worked perfectly even when I ran 4 bonnies simultaneously (one per HDD).

So I started testing each RAM stick one by one in the faulty machine and sure enough it was one faulty RAM stick. The RAM was ECC so go figure...! I wasn't impressed that the system hung with no messages but I was glad to have found the problem. A pity I had picked the wrong server to set up first!

So there we go. You can have a faulty RAM stick and memtest86+ might not show up any problem. bonnie -s 2048 (for a 2GB machine) however caused the error to manifest.

Cheers,
Mike
Mike Tierney wrote:
> So there we go. You can have a faulty RAM stick and memtest86+ might not show up any problem. bonnie -s 2048 (for a 2GB machine) however caused the error to manifest.
The same happens to me (with Tyan 2882). Not only bonnie, but even a small C program like this:

    while (1) {
        p = malloc(10*MByte);
        memset(p, 0, 10*MByte);
    }

crashes the machine as soon as the system starts to swap. What is the solution? BIOS update and/or memory update?
Thanks to everybody,
Davide

--
+-------------------------------------------------------+
| Davide Ceresoli <ceresoli@physics.rutgers.edu>        |
| Dept. of Physics and Astronomy, Rutgers University    |
| 136 Frelinghuysen Road                                |
| Piscataway, NJ 08854-8019 USA                         |
| Telephone: 732-445-8299 Fax: 732-445-4343             |
+-------------------------------------------------------+
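For anyone who wants to reproduce Davide's test, the loop above can be written out as a complete C file. This is only a sketch under a few assumptions of my own: the names fill_memory, MBYTE and MEMSTRESS_MAIN are made up here, the max_chunks cap is an addition so that a run can be bounded, and the nonzero 0xA5 fill with a read-back is there so that an optimizing compiler is less likely to drop the stores. The loop in the mail zeroes the memory and has no cap.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MBYTE (1024UL * 1024UL)   /* hypothetical name, not from the mail */

/* Allocate chunk_bytes at a time and write to every byte, never freeing,
 * so the process keeps growing. Stops after max_chunks chunks, when
 * malloc fails, or when a read-back does not match. max_chunks == 0
 * means "no limit", like the loop in the mail.
 * Returns the number of chunks that were allocated and written. */
static size_t fill_memory(size_t chunk_bytes, size_t max_chunks)
{
    size_t done = 0;

    assert(chunk_bytes > 0);

    while (max_chunks == 0 || done < max_chunks) {
        void *p = malloc(chunk_bytes);
        if (p == NULL)
            break;

        memset(p, 0xA5, chunk_bytes);

        /* Read one byte back through a volatile pointer so the
         * allocation and the stores are not treated as dead code. */
        volatile unsigned char *q = p;
        if (q[chunk_bytes - 1] != 0xA5)
            break;

        done++;   /* p is leaked on purpose */
    }
    return done;
}

#ifdef MEMSTRESS_MAIN
int main(void)
{
    /* No limit: this drives the machine into swap, so run it only on
     * a box you can afford to hang. */
    size_t n = fill_memory(10 * MBYTE, 0);
    printf("allocated and wrote %zu chunks of 10 MB\n", n);
    return 0;
}
#endif
```

Compiling with -DMEMSTRESS_MAIN gives the unbounded stand-alone program. For a first try, a call such as fill_memory(10 * MBYTE, 100) touches about 1 GB and then returns.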
I would first update your BIOS and see if it still happens. Our boards had v1.07 on them and the newest version is 2.<something>.

If it still happens then test each memory stick one at a time and you might find that just one is faulty. Also check on the TYAN site and see if your make and model of RAM is compatible. Perhaps try some other make/model of (compatible) RAM in the same machine (IF you have any handy).

If all that fails perhaps TYAN's forums/tech support can help.

Cheers
Mike

-----Original Message-----
From: Davide Ceresoli [mailto:ceresoli@physics.rutgers.edu]
Sent: Tuesday, 29 June 2004 12:34 a.m.
To: suse-amd64@suse.com
Subject: Re: [suse-amd64] 4xAMD64: System is stuck

> The same happens to me (with Tyan 2882). Not only bonnie, but even a small C program like this: while (1) { p = malloc(10*MByte); memset(p, 0, 10*MByte); } crashes the machine as soon as the system starts to swap. What is the solution? BIOS update and/or memory update?
participants (5): Davide Ceresoli, Franz Mach, Holger Froening, Johannes B. Ullrich, Mike Tierney