SUSE freezes during large file transfer
Hello,

I'm running Suse Linux 9.1 Professional. Everything runs fine without any problems until I try to copy a large amount of data (about 1 gig) either to or from the machine. It consistently freezes during this large data transfer, and I can no longer ping or get any response from the machine.

Once I noticed a message on the console, at the time of the freeze:

--
Message from syslogd@hostname at 2004-05-29........
Hostname kernel: invalid operand: 0000 [1] SMP
--

Here are some stats on the machine:
-Tyan motherboard
-Dual Opteron
-SCSI raid using megaraid driver
-Broadcom nics using tg3 driver

Can anyone help?

thanks,
Kevin
I forgot to mention that this occurs with both kernel versions 2.5.4-52-smp and 2.6.5-7.63-smp.

thanks
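Since the machine stops responding at the moment of the freeze, the console message quoted above is often the only trace left. A minimal sketch of how one might search saved syslog lines for that kind of kernel fault after a reboot follows. The function name `find_kernel_faults`, the pattern list, and the sample lines are all assumptions made for illustration; they are not taken from the original report.

```python
import re

# Assumed patterns: the "invalid operand" banner quoted in the thread plus a
# few other fault banners commonly printed by kernels of that era.
FAULT_PATTERN = re.compile(
    r"kernel: .*(invalid operand|Oops|kernel BUG|Unable to handle kernel)",
    re.IGNORECASE,
)


def find_kernel_faults(log_lines):
    """Return (line_number, text) pairs for lines that look like kernel faults."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if FAULT_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Hypothetical sample lines. On a real system they would be read from the
    # syslog file, whose location varies by distribution.
    sample = [
        "hostname sshd[123]: session opened for user kevin\n",
        "hostname kernel: invalid operand: 0000 [1] SMP\n",
    ]
    for number, text in find_kernel_faults(sample):
        print(number, text)
```

If the fault never reaches the log file because the disk write is lost in the freeze, this approach finds nothing, and a serial console or a photo of the screen may be the only record.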
On Tuesday 01 June 2004 20:23, Kevin Lewandowski wrote:
... until I try to copy a large amount of data (about 1 gig) either to or from the machine. ... -Tyan motherboard -Dual Opteron -SCSI raid using megaraid driver -Broadcom nics using tg3 driver
Kevin,

it sounds as if your cards might be quite densely packed?

In this case, before you spend weeks on driver analysis, invest a quarter of an hour to open the box, let some fans blow fresh cool air in between your cards (esp. Graphic and SCSI), memory sticks and onto the typical intense areas on your Tyan MoBo, and try your "Gigabit-Copy" again.

Still the same behaviour?

If yes still: I guess you have already checked the memory compatibility aspect? (See lots of corresponding articles earlier in this list.)

- Manfred
On Wednesday 02 June 2004 01:04, Manfred Knick wrote:

Sorry, Kevin, I apologize; it seems it was too late at night:

let some fans blow fresh cool air in between your cards (esp. Graphic

Correction: network, not Graphic.

and SCSI), memory sticks and onto the typical intense areas on your Tyan MoBo,

--> Do you use an on-board network controller?

and try your "Gigabit-Copy" again. Still the same behaviour?

Background: I had similar intermittent problems some years ago with an onboard Adaptec chip which was placed more or less below the PCI cards on that MoBo series; additionally, being in the lowest slot, the 3COM card was shielded from airflow by the parallel EIDE cables ;-( One slow-running fan blowing in between the slots, plus carefully repositioning the broad parallel cables, solved all those problems :-))

If yes still: I guess you have already checked the memory compatibility aspect? (See lots of corresponding articles earlier in this list.)

Looking forward to hearing your results!

- Manfred

--
Manfred Knick (Dipl.-Math.)
Alramstr. 31, D-81371 München
Phone +49 (89) 76704-300
Fax +49 (89) 76704-453
Mobile +49 (171) 4180 358
Thanks for the suggestions. I will try these today or tomorrow and post the results.

FYI, the NIC is onboard, and fairly close to the AMI RAID card. I'll also try my test with the built-in 100 Mb NIC, which is further away from the RAID card.

thanks,
Kevin
-- Check the List-Unsubscribe header to unsubscribe For additional commands, email: suse-amd64-help@suse.com
________________ Kevin Lewandowski Discogs - http://www.discogs.com
Okay, here's some more info on the problem.

I re-ran the big copy with the cover off the machine and it froze again.

Next I tried the copy using the onboard Intel NIC (e100) and it succeeded. No freeze.

Finally I tried the copy using the second Broadcom gigabit NIC and it froze.

So now it must be either a driver issue with the Broadcom NIC or maybe overheating? The Broadcom chip on the motherboard did not seem hot at all.

thanks,
Kevin
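The per-NIC comparison above relies on repeating the same large transfer over each interface. A minimal sketch of a repeatable bulk-transfer test is shown here, assuming plain TCP between two machines. The names `send_bytes`, `receive_all`, and `loopback_selftest` are hypothetical, and binding to `source_ip` is only an attempt to steer traffic through a given NIC, since the routing table ultimately decides which interface is used.

```python
import socket
import threading

CHUNK = 64 * 1024  # bytes per send call; an arbitrary choice


def receive_all(server_sock, result):
    """Accept one connection and count the bytes received until EOF."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["received"] = total


def send_bytes(host, port, total_bytes, source_ip=None, report_every=None):
    """Stream total_bytes of zeros to host:port, optionally from source_ip."""
    source = (source_ip, 0) if source_ip else None
    payload = bytes(CHUNK)
    sent = 0
    with socket.create_connection((host, port), source_address=source) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
            # Print progress roughly every report_every bytes, so the last
            # line on screen shows how far the copy got before a freeze.
            if report_every and sent % report_every < CHUNK:
                print(f"sent {sent} bytes", flush=True)
    return sent


def loopback_selftest(total_bytes=1_000_000):
    """Run sender and receiver on 127.0.0.1 to check the script itself."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    worker = threading.Thread(target=receive_all, args=(server, result))
    worker.start()
    sent = send_bytes("127.0.0.1", port, total_bytes)
    worker.join()
    server.close()
    return sent, result["received"]
```

The loopback self-test never touches a physical NIC, so it only validates the script. For a real comparison the receiver would run on a second machine and `send_bytes` would be called once per interface address with a total of about 1 GB, matching the size that triggered the freeze.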
Hi, All.

I think I've solved this. It looks like the problem was the tg3 driver. I changed the configuration to use the bcm5700 driver (which is included with suse 9.1) and my big copy tests passed without freezing.
On Thu, Jun 03, 2004 at 02:21:05PM -0700, Kevin Lewandowski wrote:
Hi, All. I think I've solved this. It looks like the problem was the tg3 driver. I changed the configuration to use the bcm5700 driver (which is included with suse 9.1) and my big copy tests passed without freezing.
What does lspci say about your network chip?

-Andi
0000:02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
        Subsystem: Broadcom Corporation: Unknown device 1644
        Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 24
        Memory at fc9c0000 (64-bit, non-prefetchable) [size=fc9a0000]
        Memory at fc9b0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at 00010000 [disabled]
        Capabilities: [40]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable-

0000:02:09.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
        Subsystem: Broadcom Corporation: Unknown device 1644
        Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 25
        Memory at fc9f0000 (64-bit, non-prefetchable) [size=fc9d0000]
        Memory at fc9e0000 (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at 00010000 [disabled]
        Capabilities: [40]
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/3 Enable-

ethtool says:

# ethtool -i eth0
driver: bcm5700
version: 7.2.24
firmware-version:
bus-info: 0000:02:09.0

# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
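When comparing drivers across several machines, it can help to record which driver is actually bound to an interface rather than relying on memory. The sketch below parses the `key: value` lines of the `ethtool -i` output shown above. The function name `parse_ethtool_info` is hypothetical, and the sample text is copied from this message rather than read from a live system.

```python
def parse_ethtool_info(text):
    """Parse `ethtool -i` style 'key: value' lines into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as the PCI address
        # "0000:02:09.0" keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


# Sample copied from the `ethtool -i eth0` output in this thread.
SAMPLE = """\
driver: bcm5700
version: 7.2.24
firmware-version:
bus-info: 0000:02:09.0
"""

info = parse_ethtool_info(SAMPLE)
print(info["driver"], info["version"], info["bus-info"])
```

On a live system the same function could be fed the captured output of `ethtool -i eth0`, assuming the tool is installed and the interface name matches.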
Only just saw the post. My laptop has the BCM5788 chip, and the network would stall even when no transfer was going on: ping the box and it would be there; ping again later and it would be gone. At that time I was using the bcm5700 driver; I changed to the tg3 driver and it's been OK over the last two months with different 2.6.x kernels.

Regards
Sid.
-- Sid Boyce .... Hamradio G3VBV and keen Flyer Linux Only Shop.
On Thu, 3 Jun 2004, Sid Boyce wrote:
Only just saw the post, my laptop has the BCM5788 chip and the network would stall even when no transfer was going, ping the box and it would be there, later another ping and it would be gone. Then I was using the bcm5700 driver, changed to the tg3 driver and it's been OK over the last two months with different 2.6.x kernels. Regards Sid.
The bcm5700 works for 64bit installation, but is horrible on 32bit. The tg3 works for 32bit installations but isn't 64bit clean. Doesn't matter whether you're on 2.4 or 2.6 kernels.

Bjørn

--
Bjørn Tore Sund
System administrator, Math. Department, University of Bergen
Phone: (+47) 555-84894
Fax: (+47) 555-89672
Mobile: (+47) 918 68075
VIP: 81724
Support: system@mi.uib.no
Contact: teknisk@mi.uib.no
Direct: bjornts@mi.uib.no
"Stupidity is like a fractal; universal and infinitely repetitive."
Andi Kleen wrote:
tg3 works for 32bit installations but isn't 64bit clean. Doesn't matter
That's not true. tg3 works fine on lots of 64bit machines.
-Andi
Correct. Using bcm5700 on a 64-bit laptop with a BCM5788 (100/1000) caused interface stalls at 100 Mbps. Using tg3, it's been solid right up to the latest kernel, 2.6.7-rc2-mm2, on SuSE 9.1 x86_64.

Regards
Sid.
participants (5)
- Andi Kleen
- Bjorn Tore Sund
- Kevin Lewandowski
- Manfred Knick
- Sid Boyce