Dual-Opteron w/ 8GB RAM boot problem
I installed SuSE 9.0 for x86_64 on the following box, but I'm having boot problems when the machine is loaded with 8GB of RAM:

Dual Opteron 248
8GB RAM (in 8 1GB sticks)
Tyan K8W (S2885) - BIOS Rev 1.02
Adaptec 29320

If I take out half the RAM so that I only have 4GB total in the system, everything works fine. However, if the full 8GB are in I get somewhat random errors (memory corruption?):

Run 1:
------------------------------------------------------------------------
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem).
linuxrc[12]: segfault at 00000000ffffff99 rip 000000000040c428 rsp 0000007fbffff860 error 6
VFS: Cannot open root device "sda2" or 08:02
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 08:02
------------------------------------------------------------------------

Run 2:
------------------------------------------------------------------------
RAMDISK: Compressed image found at block 0
crc errorVFS: Cannot open root device "sda2" or 08:02
Please append a correct "root=" boot option
Kernel panic: VFS: Unable to mount root fs on 08:02
------------------------------------------------------------------------

Run 3:
------------------------------------------------------------------------
RAMDISK: Compressed image found at block 0
NMI Watchdog detected LOCKUP on CPU1, eip ffffffff8012423c, registers:
CPU 1
Pid: 1, comm: swapper Not tainted
RIP: 0010:[<ffffffff8012423c>[{.text.lock.fork+27}
RSP: 0000:00000100ca8f1d58 EFLAGS: 00000086
RAX: 0000000000000000 RBX: 00000101ffd11e38 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 00000100ca8f1d68 RDI: 00000101ffd11e40
RBP: 00000100ca8f0000 R08: ffffffff803c7b00 R09: 00000101ffd12e00
R10: 0000000000000000 R11: 0000000080000000 R12: 00000101ffd11e40
R13: 00000100ca8f1d68 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff804bbb00(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000ca902000 CR4: 00000000000006e0
Process swapper (pid: 1, stackpage=100ca8f1000)
Stack: 00000100ca8f1d58 0000000000000000
Call Trace:
Code: f3 90 7f f9 e9 e7 e5 ff ff 80 3f 00 f3 90 7e f9 e9 18 e6 ff
console shuts up ...
------------------------------------------------------------------------

I've tried installing on a Rioworks Arima HDAMA machine with 8GB RAM and the same SCSI card, and it worked fine. So I'm thinking that there's a driver issue with the Tyan board. BTW, I have a sister machine with the same Tyan board, and it also experiences the same problem.

Bryan

--
Aspen Systems, Inc. | http://www.aspsys.com/
Production Engineer | Phone: (303)431-4606
bryans@aspsys.com | Fax: (303)431-7196
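The "error 6" in the Run 1 segfault line is an x86 page-fault error code. As a rough reading aid, the sketch below decodes the low three bits of such a code (bit 0: the page was present, bit 1: the access was a write, bit 2: the access came from user mode). The function name decode_page_fault_error is made up for this illustration and is not part of any kernel tool.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code.

    Hypothetical helper for reading kernel 'segfault ... error N' lines.
    """
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not mapped
        "write_access": bool(code & 0x2),  # 0 means the access was a read
        "user_mode": bool(code & 0x4),     # 0 means the fault came from kernel mode
    }


# The value reported for linuxrc in Run 1.
run1 = decode_page_fault_error(6)
```

For error 6 this gives a user-mode write to a page that was not mapped. That fits a wild pointer in linuxrc, which is consistent with the memory-corruption guess but does not prove it.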
On Fri, Apr 30, 2004 at 10:50:17AM -0400, Mark Horton wrote:
I've got the HDAMA and the Tyan 2885 as well. The HDAMA works perfectly with 8GB, but I had similar problems with the Tyan. My RAID card partitions would not mount properly with more than 4GB of RAM in the Tyan. I have an LSI Logic RAID card. I ended up moving the RAM back to the HDAMA to solve the problem. However, Tyan does have another BIOS that I have not tried. See question 2 here: http://www.tyan.com/support/html/f_s2885.html

Mark
On 05/03/2004 01:17 PM, Bryan Stillwell wrote:
I tried out the beta BIOS that you mentioned, but it didn't help the booting problems with 8GB of RAM. However, I did find that if I pass "maxcpus=0" to the kernel, it'll boot with all 8GB of memory in the machine. Of course this only leaves you with one CPU working, which is just as bad as the other solution of only having half the memory...

Bryan
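After booting with a workaround such as "maxcpus=0", it helps to confirm what the kernel actually brought up. The sketch below counts processors and reads MemTotal from text laid out like /proc/cpuinfo and /proc/meminfo. The function names and the sample text are made up for illustration; on a real machine the text would come from reading those two files.

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count 'processor : N' entries in text laid out like /proc/cpuinfo."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def mem_total_gib(meminfo_text: str) -> float:
    """Return the MemTotal value (reported in kB) converted to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    raise ValueError("no MemTotal line found")


# Made-up sample text, not output captured from the machines in this thread.
SAMPLE_CPUINFO = "processor\t: 0\nvendor_id\t: AuthenticAMD\n"
SAMPLE_MEMINFO = "MemTotal:      8388608 kB\nMemFree:       8000000 kB\n"
```

With the sample text, one processor and 8 GiB are reported, which is the picture described for the "maxcpus=0" boot: all the memory, but only one CPU.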
On Mon, May 03, 2004 at 02:22:55PM -0500, Kevin_Gassiot@veritasdgc.com wrote:
What kernel are you running? There was a problem with the MTRR code in
the 2.4.21-120 kernel you get if you just load from the CDs or DVD that
caused driver problems if you have more than 4 GB. I have several S2885s
with 8 GB of memory that have been up for months now. They are running the
2.4.21-193-smp kernel and Nvidia 5332 graphics driver. I had a beta BIOS
at first, but the last ones that I built are running the v102 BIOS that
fixes the MTRR layout that lets the 193 and later kernels and Nvidia
drivers work correctly together... I do have one running the v102 BIOS and
2.4.21-201-smp kernel that has been stable, but it gets booted between
Windows, RedHat 9, and SuSE 9.0 so much that I can't tell anything about
long-term stability....
To get things going when loading from the CD, I had to remove memory to get
to 4 GB or less, flash the BIOS, install from CDs, upgrade the system via
YOU, then put the memory back in. From there, I could install the Nvidia
drivers, and go....
Kevin Gassiot
Advanced Systems Group
Visualization Systems Support
Veritas DGC
10300 Town Park Dr.
Houston, Texas 77072
832-351-8978
kevin_gassiot@veritasdgc.com
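The kernel advice above boils down to comparing the SuSE build number in the release string. A minimal sketch of that comparison follows. It assumes that build 193 is the first good build of the 2.4.21 series, which is only an inference from "the 193 and later kernels" in the message; the function names are invented for this example.

```python
import re


def suse_kernel_build(release: str) -> tuple:
    """Split a release string such as '2.4.21-193-smp' into ((2, 4, 21), 193)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch, build = (int(part) for part in match.groups())
    return (major, minor, patch), build


def has_mtrr_fix(release: str, first_good_build: int = 193) -> bool:
    """True if a 2.4.21 SuSE kernel is at or past the assumed first good build.

    The default of 193 is an assumption taken from the thread, not a
    documented cutoff.
    """
    version, build = suse_kernel_build(release)
    if version != (2, 4, 21):
        raise ValueError("only the 2.4.21 SuSE 9.0 kernel series is covered here")
    return build >= first_good_build
```

On a running system the release string could come from "uname -r" or from Python's platform.release(). The check only says whether the kernel is new enough; it says nothing about the BIOS side of the MTRR fix.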
--
Check the List-Unsubscribe header to unsubscribe
For additional commands, email: suse-amd64-help@suse.com
I had that same problem on my last 32-bit SuSE 8.2 install: I had to remove RAM to get down to 1 GB for the installer to work, then put it back in, and it runs great now. I posted something about that to one of the other SuSE lists (I don't remember which one) but never heard anything back. It looks like a bit of a pattern here with the installation process, 32-bit or 64-bit.
On Mon, May 03, 2004 at 02:32:49PM -0600, Bryan Stillwell wrote:
I was running 2.4.21-149-smp, but now I'm using 2.4.21-211-smp after running YOU (I'm somewhat new to SuSE, so I didn't know about it). The problem is still there though... :( I'm beginning to think there's something tied in with the SCSI subsystem (it died in scsi_do_req_Rsmp once). Are you using any SCSI cards or just IDE?

The weird thing is it worked in the Rioworks board using the same SCSI card...

Bryan
On Mon, May 03, 2004 at 10:36:33PM +0200, Andi Kleen wrote:
I would definitely try a BIOS update.

-Andi
On Mon, May 03, 2004 at 04:56:54PM -0600, Bryan Stillwell wrote:
I've tried the latest BIOS (1.02) and even Tyan's beta BIOS referenced in their FAQ, but neither has worked. I even tried removing the SCSI card and installed on an IDE disk, but still no luck.

BTW, I've also tried both machines with the latest Fedora (Core2test3) and Mandrake (10.0rc1) without success.

I was just comparing the Rioworks HDAMA motherboard specs to the Tyan K8W specs, and they're somewhat close to the same (both use AMD chipsets at least):

Both have the AMD-8111 I/O Hub
Both have the AMD-8131 PCI-X Tunnel
The Tyan has an AMD-8151 AGP 3.0 Graphics Tunnel
The Tyan has a TI TSB43AB22 IEEE 1394a PCI controller (firewire)
The Rioworks has 2 Broadcom 5702 Gigabit Ethernet controllers
The Tyan has 1 Broadcom BCM5703C GbE controller
The Rioworks has a Super I/O PC87360
The Tyan has a Winbond W83627HF Super I/O ASIC
The Rioworks has a Phoenix BIOS
The Tyan has an AMI BIOS

Bryan
On Tue, May 04, 2004 at 03:11:01AM +0200, Andi Kleen wrote:
Did you rule out broken memory too? Does it survive 48h of memtest86?

There seem to be a few problems with 2.6, but in 2.4, as far as I know, nearly all such corruption problems were related to broken hardware/BIOS of some sort or a particular binary-only graphics driver.

-Andi
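For readers who have not used memtest86: one of its classic tests is "moving inversions". The sketch below is a much simplified illustration of that idea in Python, run against a simulated faulty buffer. It cannot test physical RAM the way memtest86 does from outside the operating system, and the class StuckBitMemory and function moving_inversions are invented for this example.

```python
class StuckBitMemory:
    """Hypothetical stand-in for a faulty DIMM: one byte has bits that always read as 1."""

    def __init__(self, size: int, bad_index: int, stuck_mask: int):
        self._cells = bytearray(size)
        self._bad_index = bad_index
        self._stuck_mask = stuck_mask

    def __len__(self) -> int:
        return len(self._cells)

    def __getitem__(self, index: int) -> int:
        value = self._cells[index]
        if index == self._bad_index:
            value |= self._stuck_mask  # the stuck bits override what was written
        return value

    def __setitem__(self, index: int, value: int) -> None:
        self._cells[index] = value


def moving_inversions(memory, pattern: int = 0x55) -> list:
    """Run one moving-inversions pass and return the indexes that read back wrong."""
    inverse = pattern ^ 0xFF
    bad = set()
    size = len(memory)
    # Fill every cell with the pattern.
    for i in range(size):
        memory[i] = pattern
    # Ascending pass: verify the pattern, then write its complement.
    for i in range(size):
        if memory[i] != pattern:
            bad.add(i)
        memory[i] = inverse
    # Descending pass: verify the complement, then restore the pattern.
    for i in reversed(range(size)):
        if memory[i] != inverse:
            bad.add(i)
        memory[i] = pattern
    return sorted(bad)
```

Checking both the pattern and its complement matters: a bit stuck at 1 is invisible while the expected value already has that bit set, so it only shows up in one of the two passes. Marginal hardware may also fail only occasionally, which is presumably why a long run such as 48h is suggested.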
I thought I had ruled out broken memory, but I guess I wasn't counting on both motherboards having memory problems. Some errors in memtest86 convinced me to try a third machine out, and with the latest SuSE kernel it worked fine (the stock kernel didn't support the AGP chipset completely, I guess). After some more testing, I've determined that one of the original computers had bad memory sockets, and the other just had bad memory.

Thanks everyone for the help,
Bryan
9.1 Rocks! Just had to get that in for those worried about the Novell acquisition. As an example, LVM 2.0 lets you throw in a hot-swap drive and expand any partition you like ON THE FLY!

Now for my question: when I try to install VMware Workstation 4.0.5, I get an error when it tries to patch the kernel and asks for the header files, something like "file size not the same as your current kernel". I made sure my /usr/src/ files were the same as my kernel (2.4.6-53.3 I think). I then renamed asm-amd64 to asm and got the same thing. Anybody have any ideas?

Techman
On Mon, May 03, 2004 at 04:18:49PM -0500, Kevin_Gassiot@veritasdgc.com wrote:
We did see some problems on Opterons with the 193 kernel in our Singapore
office using an Adaptec 29160 SCSI card. It seemed to be tied to a newer
version of the Adaptec driver that SuSE was using compared to an older
driver on some P4 systems running RedHat. The new driver had some new SCSI
functionality (I think what was causing us problems was Domain Validation
and the attached RAID box did not support Domain Validation). This
functionality was not in older versions of the Adaptec driver, so the
problem did not show up.
In /var/log/messages, you would see the following info when the kernel
loaded the driver for the card...
Mar 5 15:29:16 ppc003 kernel: PCI: Enabling device 02:01.0 (0015 -> 0017)
Mar 5 15:29:16 ppc003 kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Mar 5 15:29:16 ppc003 kernel:
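To compare driver revisions across machines, the banner line above can be pulled out of the log with a small script. The sketch below is one hedged way to do it; the regular expression is written against the single banner line quoted in this message, the function name is invented, and other driver versions may print a different banner.

```python
import re

# Matches the aic7xxx banner as quoted in the message above (an assumption
# about the format; it expects the banner on a single line).
AIC7XXX_BANNER = re.compile(
    r"kernel: (?P<host>scsi\d+) : Adaptec AIC7XXX .*?DRIVER, Rev (?P<rev>\d+(?:\.\d+)*)"
)


def find_aic7xxx_revisions(log_text: str) -> list:
    """Return (host, revision) pairs for every aic7xxx banner found in the log text."""
    results = []
    for line in log_text.splitlines():
        match = AIC7XXX_BANNER.search(line)
        if match:
            results.append((match.group("host"), match.group("rev")))
    return results


SAMPLE_LOG = (
    "Mar 5 15:29:16 ppc003 kernel: PCI: Enabling device 02:01.0 (0015 -> 0017)\n"
    "Mar 5 15:29:16 ppc003 kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI "
    "HBA DRIVER, Rev 6.2.36\n"
)
```

Run against the contents of /var/log/messages on each machine, this would show at a glance whether the systems are on the same driver revision.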
I've ruled out the SCSI card as the cause by taking it out completely and trying to install on an IDE disk. The problem still happened even with the latest SuSE kernel (2.4.21-211-smp).

Bryan
I think I figured out a little bit on this, at least for my system. When I had the 8GB working on the HDAMA, I had the root filesystem on an IDE drive. When I tried the 8GB on the Tyan 2885, I had root on an LSI Logic MegaRAID SCSI card. I was using the 'megaraid' kernel module, and it would not mount the filesystem; I got a kernel panic. I just switched to the 'megaraid2' module and all is well. It loads the module, mounts root, and boots up fine now.

Mark
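To tell which of the two MegaRAID drivers a running system actually loaded, the module list can be checked. The sketch below parses text laid out like /proc/modules, where the module name is the first field on each line. The function names and the sample text are made up for illustration, and a driver compiled into the kernel would not show up in this list at all.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return module names from text laid out like /proc/modules."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def megaraid_driver_in_use(proc_modules_text: str) -> str:
    """Report 'megaraid2', 'megaraid', or 'none' depending on what is loaded."""
    names = loaded_modules(proc_modules_text)
    for candidate in ("megaraid2", "megaraid"):
        if candidate in names:
            return candidate
    return "none"


# Made-up sample, not output captured from the machines in this thread.
SAMPLE_MODULES = "megaraid2 40000 3\nsd_mod 13000 6\nscsi_mod 110000 2 [megaraid2 sd_mod]\n"
```

On a real system the text would come from reading /proc/modules. The names are compared exactly, so 'megaraid2' is never mistaken for 'megaraid'.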
Since you have ruled out lots of other things, one other question. What
kind of memory are you using? If you are using 400 MHz, I would try
putting 333 MHz sticks in to see if that clears anything up. There have
been a lot of problems with 400 MHz memory. Even though it should work in
theory, I have heard of a lot of problems related to signal strength and line
length to the DIMM slots. Something else to try :)
Kevin
participants (6)
- Andi Kleen
- Bryan Stillwell
- Curt Purdy
- Kevin_Gassiot@veritasdgc.com
- Mark Horton
- William A. Mahaffey III