Re: [suse-amd64] Tyan K8S >4GB
Looking at this one, this might be broken hardware, or a kernel that is too old. Can you try the latest ones from ftp.suse.com/pub/suse/x86-64/supplementary/kernel

Do they work better?

Andreas
MCG_STATUS: unrecoverable
Northbridge Machine Check exception f435a00077080a13 0
Lost at least one NB error condition
Uncorrectable condition
Unrecoverable condition
Northbridge status f435a00077080a13
ECC syndrome bits 776b
extended error chipkill ecc error
link number 0
uncorrected ecc error
error address valid
error enable
error uncorrected
error overflow
previous error lost
error address 0000000100100048
Address: 0000000100100048
MCE at EIP ffffffff8010ce2f ESP 100efd43fd8
CPU 1: Machine Check Exception: 0000000000000000
Kernel panic: Unable to continue
In idle task - not syncing
NMI Watchdog detected LOCKUP on CPU1, eip ffffffff801191cc, registers:
CPU 1
Pid: 0, comm: swapper Not tainted
RIP: 0010:[<ffffffff801191cc>]{.text.lock.smp+23}
RSP: 0018:00000100efd43d88 EFLAGS: 00000086
RAX: 0000000000000000 RBX: ffffffff802e43da RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff80119060
RBP: 0000000000000005 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff803e55b0 R12: 0000000000000411
R13: 0000000000000010 R14: 0000000000000000 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffffffff804bb880(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000ef956000 CR4: 00000000000006e0
Call Trace: <EOE> [<ffffffff80119060>]{stop_this_cpu+0} [<ffffffff801190a9>]{smp_send_stop+25} [<ffffffff8012204d>]{panic+285} [<ffffffff801225fe>]{__call_console_drivers+62} [<ffffffff8011c4f1>]{check_k8_nb+625} [<ffffffff8011c164>]{generic_machine_check+404} [<ffffffff8011c206>]{do_machine_check+86} [<ffffffff8010ce10>]{default_idle+0} [<ffffffff8010f7c2>]{error_exit+0} [<ffffffff8010ce10>]{default_idle+0} [<ffffffff8010ce2f>]{default_idle+31} [<ffffffff8010ce9a>]{cpu_idle+42} Process swapper (pid: 0, stackpage=100efd43000) Stack: 00000100efd43d88 0000000000000018 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 Call Trace: <EOE> [<ffffffff80119060>]{stop_this_cpu+0} [<ffffffff801190a9>]{smp_send_stop+25} [<ffffffff8012204d>]{panic+285} [<ffffffff801225fe>]{__call_console_drivers+62} [<ffffffff8011c4f1>]{check_k8_nb+625} [<ffffffff8011c164>]{generic_machine_check+404} [<ffffffff8011c206>]{do_machine_check+86} [<ffffffff8010ce10>]{default_idle+0} [<ffffffff8010f7c2>]{error_exit+0} [<ffffffff8010ce10>]{default_idle+0} [<ffffffff8010ce2f>]{default_idle+31} [<ffffffff8010ce9a>]{cpu_idle+42}
Code: f3 90 7e f5 e9 13 fe ff ff 90 90 90 90 90 90 90 90 90 90 90 console shuts up ...
Andreas -- Andreas Jaeger, aj@suse.de, http://www.suse.de/~aj SuSE Linux AG, Deutschherrnstr. 15-19, 90429 Nürnberg, Germany GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
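For readers following the ">4GB" angle of the thread, here is a small sketch that checks where the reported error address (0000000100100048) sits relative to the 4 GiB boundary. The function name is made up for illustration, and the result is plain arithmetic only: it does not say which DIMM or CPU node owns the address, since the physical memory map may contain holes.

```python
GIB = 1 << 30
MIB = 1 << 20


def describe_address(addr: int) -> str:
    """Describe where a physical address falls relative to the 4 GiB boundary."""
    boundary = 4 * GIB
    if addr < boundary:
        return f"{addr:#x} is below 4 GiB"
    offset = addr - boundary
    return f"{addr:#x} is {offset // MIB} MiB + {offset % MIB} bytes above 4 GiB"


# Error address copied from the machine check output in this message.
error_address = 0x0000000100100048
print(describe_address(error_address))
# prints: 0x100100048 is 1 MiB + 72 bytes above 4 GiB
```

So the failing access was just past the 4 GiB mark, which fits the subject line but does not by itself distinguish bad hardware from a kernel problem.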
Thanks so much for looking at the outputs. I will grab the new kernel. Interesting developments are as follows:

I finally got a RAID card I had been waiting for. I installed the card and attempted to boot the system. The system would not POST. I removed the new controller and everything else, and eventually got down to one CPU and one DIMM. The board died. We are in the process of RMAing it. Hopefully this will solve all of the problems we were seeing. Thanks so much for everyone's help.

If any info is needed on the LSI 320-4x on a TYAN S2880 with 2 procs and 6GB DDR, I should be able to tell soon.

Thanks!
Santiago

-----Original Message-----
From: Andreas Jaeger [mailto:aj@suse.de]
Sent: Saturday, September 13, 2003 7:50 AM
To: Santiago Flores
Cc: suse-amd64@suse.com
Subject: Re: [suse-amd64] Tyan K8S >4GB

Looking at this one, this might be broken hardware, or a kernel that is too old. Can you try the latest ones from ftp.suse.com/pub/suse/x86-64/supplementary/kernel Do they work better?

Andreas
"Santiago Flores" <santi@mleads.com> writes:

> Thanks so much for looking at the outputs. I will grab the new kernel. Interesting developments are as follows:
>
> I finally got a RAID card I had been waiting for. I installed the card and

Which RAID card is this?

Andreas

> attempted to boot the system. The system would not POST. I removed the new controller and everything else. Eventually got down to one CPU and one DIMM. The board died. We are in the process of RMAing it. Hopefully this will solve all of the problems we were seeing. Thanks so much for everyone's help.
>
> If any info is needed on the LSI 320-4x on a TYAN S2880 with 2 procs and 6GB DDR, I should be able to tell soon.
>
> Thanks!
>
> Santiago
LSI Logic 320-4x http://www.lsilogic.com/products/stor_prod/raid/3204x.html

-----Original Message-----
From: Andreas Jaeger [mailto:aj@suse.de]
Sent: Monday, September 15, 2003 8:00 AM
To: Santiago Flores
Cc: suse-amd64@suse.com
Subject: Re: [suse-amd64] Tyan K8S >4GB

Which RAID card is this? Andreas
"Santiago Flores" <santi@mleads.com> writes:
> LSI Logic 320-4x http://www.lsilogic.com/products/stor_prod/raid/3204x.html

That one should work,
Andreas
On Monday 15 September 2003 16:51, Santiago Flores wrote:

> I finally got a RAID card I had been waiting for. I installed the card and attempted to boot the system. The system would not POST. I removed the new controller and everything else. Eventually got down to one CPU and one DIMM. The board died. We are in the process of RMAing it. Hopefully this will solve all of the problems we were seeing. Thanks so much for everyone's help.

Very strange. I had exactly the same problem with a Tyan K8S, 6GB RAM and an LSI 320 RAID controller. My board died too; some components were burned between the PCI slots, VGA chip and BIOS. We also had to RMA the board, but it looked like the Opterons died too and/or the memory was broken.

> If any info is needed on the LSI 320-4x on a TYAN S2880 with 2 procs and 6GB DDR, I should be able to tell soon.

Well, I would be very much interested to hear whether it works for you, because I'm already thinking of removing 2GB and putting the server into production (after a few more stability tests, of course ;))

4GB, which is strange, because on this list I see multiple people having

I also mailed with someone from AMD; he said he had never heard of that problem. At the moment I can't do very much with my system: either the Opterons died or the memory did, so we are replacing the Opterons first and hopefully I can debug it further. I want this matter solved!

> Santiago

-kees
On Mon, 15 Sep 2003 17:54:55 +0200 Kees Hoekzema <kees@tweakers.net> wrote:

> > If any info is needed on the LSI 320-4x on a TYAN S2880 with 2 procs and 6GB DDR, I should be able to tell soon.
>
> Well, I would be very much interested if it is going to work for you, because I'm already thinking of removing 2GB and putting the server into production (after a few more stability tests, of course ;))
>
> 4GB, which is strange, because on this list I see multiple people having
>
> I also mailed with someone from AMD, he said he never heard of problems with that problem.

In theory it could be a driver issue. When the drivers do not support 64bit addresses on the bus, the kernel has to use the IOMMU code to remap buffers to 32bit as needed. This code is only enabled when you use >4GB.

But at least the LSI should support 64bit addresses. In theory it is possible that they have some IOMMU-related bug though. The driver has to use special APIs for it.

I usually run my boxes with forced IOMMU mode (iommu=force) to catch this early, but we haven't had problems with this in a long time.

You could check it e.g. by booting with less than 4GB and setting iommu=force. If that also causes lockups (and without forcing it runs stable with <=4GB) then it is likely a driver issue.

To rule out hardware issues I would physically remove the DIMMs, not just boot with mem=4GB.

-Andi
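The two-run check Andi describes can be written down as a small decision sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the helper merely looks for the literal `iommu=force` token in a kernel command line string (for example the contents of /proc/cmdline), and any outcome the message does not cover is reported as inconclusive rather than guessed.

```python
def iommu_forced(cmdline: str) -> bool:
    """Return True if the kernel command line contains the iommu=force option."""
    return "iommu=force" in cmdline.split()


def interpret(stable_without_force: bool, lockup_with_force: bool) -> str:
    """Interpret two test runs, both made with <=4GB physically installed.

    stable_without_force: the box ran stable without iommu=force
    lockup_with_force:    the box locked up when booted with iommu=force
    """
    if stable_without_force and lockup_with_force:
        # The only combination the message draws a conclusion from.
        return "likely a driver issue"
    return "inconclusive (case not covered in the thread)"


# Hypothetical command line, used only to exercise the helper.
print(iommu_forced("root=/dev/sda2 iommu=force"))
print(interpret(stable_without_force=True, lockup_with_force=True))
```

Both runs are meant to be done with the extra DIMMs physically removed, since the message notes that booting with mem=4GB alone does not rule out hardware problems.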
I will keep all of that in mind when I get the RMAed board back. The motherboard died so I haven't been able to even boot with the card yet. I am anxious to get to it though. Thanks for the great comments and ideas on getting it all running smoothly.

Santiago

-----Original Message-----
From: Andi Kleen [mailto:ak@suse.de]
Sent: Monday, September 15, 2003 9:42 AM
To: Kees Hoekzema
Cc: santi@mleads.com; suse-amd64@suse.com
Subject: Re: [suse-amd64] Tyan K8S >4GB
Wow! That is pretty serious. Did the same thing happen to the new board? I am very curious as to what experiences you are having with your setup. We had some hints that our board might be failing prior to installing the 320-4x, so I can't say that it was the card. I would be sad if the new board (arriving tomorrow) gets fried when I plug the LSI controller back into it, although I would be very surprised.

Have you talked with LSI tech support about the problem? They are easier to get hold of than the LSI sales people.

Please post back to the list or email me off list.

Thanks,
Santiago

-----Original Message-----
From: Kees Hoekzema [mailto:kees@tweakers.net]
Sent: Monday, September 15, 2003 8:55 AM
To: Santiago Flores
Cc: suse-amd64@suse.com
Subject: Re: [suse-amd64] Tyan K8S >4GB
-- Check the List-Unsubscribe header to unsubscribe For additional commands, email: suse-amd64-help@suse.com
Hello All,

on a TYAN K8W (S2885) with 8GB reg. ECC DDR2700 (8 x 1GB) Corsair certified modules (shipped by TYAN with the mainboard), "free -m" on SuSE-8.2-Prof-x86-64-Beta9 reports only (in MB):

     total       used       free     shared    buffers     cached
      6986        235       6751          0         41         68

instead of the expected ~7900MB.

Any clues? Kernel options?

--
With best regards

PS: Ask for our High Performance Computing 32/64bit Multi-CPU AMD OPTERON Systems!

Mike D. Frenz (Dipl.-Phys.)
Management, MikeRoHard Computersysteme, High Performance Computing
Kärntner Weg 6, D-79111 Freiburg, GERMANY
Tel. +49 (0)761 - 888 66 50
Fax +49 (0)761 - 888 66 52
mailto: mike.frenz@mikerohard.de
Website: http://www.mikerohard.de
"Mike D. Frenz" <mike.frenz@mikerohard.de> writes:
> Hello All,

Please do not copy individual developers directly; just post to the mailing list.

> on a TYAN K8W (S2885) with 8GB reg. ECC DDR2700 (8pcs. 1GB) Corsair cert. Modules (shipped by TYAN with Mainboard) "free -m" SuSE-8.2-Prof-x86-64-Beta9 gives only: (in MB)
>
>      total       used       free     shared    buffers     cached
>       6986        235       6751          0         41         68
>
> instead of expected ~7900MB
>
> Any clues? Kernel-Options?

The IOMMU will use some memory; otherwise check what /var/log/boot.msg reports.

Andreas
On Thu, 23 Oct 2003 10:55:45 +0200 "Mike D. Frenz" <mike.frenz@mikerohard.de> wrote:

First, please don't send support mail personally to me or Andreas. Send it to the mailing list.

> Hello All,
>
> on a TYAN K8W (S2885) with 8GB reg. ECC DDR2700 (8pcs. 1GB) Corsair cert. Modules (shipped by TYAN with Mainboard) "free -m" SuSE-8.2-Prof-x86-64-Beta9 gives only: (in MB)
>
>      total       used       free     shared    buffers     cached
>       6986        235       6751          0         41         68
>
> instead of expected ~7900MB
>
> Any clues? Kernel-Options?

Memory problems on Tyan boards seem to have become a FAQ ...

Enable the "IOMMU" option in the BIOS.

If that does not help: you can check the e820 map that the kernel outputs at the beginning of boot.msg. If it contains 8GB as "RAM" then it may be a kernel problem. If not, then the BIOS/hardware didn't recognize it.

Most likely it is the BIOS/hardware. All the Tyan boards seem to be somewhat flaky in what memory they accept in which slot etc. Switch the DIMMs around or replace them.

-Andi
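As a rough aid for the e820 check suggested above, the sketch below sums the ranges marked usable in a pasted e820 map and also works out the shortfall from the numbers in the question. Assumptions, not facts from the thread: the line shape `BIOS-e820: <start> - <end> (<type>)` and the type string `usable` reflect how kernels of that era commonly printed the map and may differ on a given system, and the sample text is invented purely to exercise the parser.

```python
import re

# Assumed line shape: "BIOS-e820: <start hex> - <end hex> (<type>)"
E820_LINE = re.compile(
    r"BIOS-e820:\s*([0-9a-fA-F]+)\s*-\s*([0-9a-fA-F]+)\s*\(([^)]+)\)"
)


def usable_bytes(log_text: str) -> int:
    """Sum the sizes of all e820 ranges whose type is 'usable'."""
    total = 0
    for start, end, kind in E820_LINE.findall(log_text):
        if kind.strip() == "usable":
            total += int(end, 16) - int(start, 16)
    return total


# Hypothetical excerpt, NOT taken from the poster's boot.msg.
sample = """\
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
"""
print(f"{usable_bytes(sample) / 2**20:.0f} MiB marked usable in the sample")

# Figures from the question: 8 x 1GB installed, 6986 MB reported by free -m.
installed_mb = 8 * 1024
reported_mb = 6986
print(installed_mb - reported_mb, "MB unaccounted for")  # prints: 1206 MB unaccounted for
```

If the usable total from the real boot.msg comes to roughly 8GB, the advice above points at the kernel; if it is already short, the BIOS or hardware did not report the memory.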
participants (5)
- Andi Kleen
- Andreas Jaeger
- Kees Hoekzema
- Mike D. Frenz
- Santiago Flores