Hello,

I'm running SuSE 10.1 with the latest updates on an Opteron 148, ASUS SK8N with nForce3 and 2x 512MB Super Talent DDR333 Registered ECC. I'm experiencing hangs: everything freezes and nothing can be done. I think it may be a hardware problem, because there were several thunderstorms yesterday, but I don't know where to look or what to look for in order to identify it.

What I have done so far:
- memtest reports nothing after half an hour; maybe I should let it run longer;
- mcelog reports nothing;
- /var/log/messages has nothing suspicious.

Any help will be appreciated.

Many thanks,
Mitko
On 25/06/2006, at 9:50 PM, Dimitar Georgiev Popov wrote:
I'm running SuSE 10.1 with the latest updates on an Opteron 148, ASUS SK8N with nForce3 and 2x 512MB Super Talent DDR333 Registered ECC. I'm experiencing hangs: everything freezes and nothing can be done. I think it may be a hardware problem, because there were several thunderstorms yesterday, but I don't know where to look or what to look for in order to identify it. What I have done so far: - memtest reports nothing after half an hour; maybe I should let it run longer; - mcelog reports nothing; - /var/log/messages has nothing suspicious. Any help will be appreciated.
Hi Mitko.

I had the same problem too, on my quad AMD 270's Tyan motherboard with 4GB of RAM. I was able to trace it to the FAM daemon via DDD. Once I disabled it, all was OK. I have downloaded the newest version but I haven't tried it yet.

Cheers.
Grahame.
On Sunday 25 June 2006 15:42, you wrote:
On 25/06/2006, at 9:50 PM, Dimitar Georgiev Popov wrote:
I'm running SuSE 10.1 with the latest updates on an Opteron 148, ASUS SK8N with nForce3 and 2x 512MB Super Talent DDR333 Registered ECC. I'm experiencing hangs: everything freezes and nothing can be done. I think it may be a hardware problem, because there were several thunderstorms yesterday, but I don't know where to look or what to look for in order to identify it. What I have done so far: - memtest reports nothing after half an hour; maybe I should let it run longer; - mcelog reports nothing; - /var/log/messages has nothing suspicious. Any help will be appreciated.
Hi Mitko.
I had the same problem too, on my quad AMD 270's Tyan motherboard with 4GB of RAM.
I was able to trace it to the FAM daemon via DDD. Once I disabled it, all was OK. I have downloaded the newest version but I haven't tried it yet.
Cheers. Grahame.
Hi, again.

The problem turned out to be a buggy program, namely ktorrent. It worked well until yesterday, but now it opens several HUNDRED kio_http processes. I updated the system yesterday, and probably some of the libs have changed the program's behavior. Anyway, many thanks to you both, Grahame and Ralph.

Greetings,
Mitko
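A runaway like the several hundred kio_http processes described above can be spotted by counting processes per command name. The following is a minimal Python sketch, assuming a Linux /proc filesystem; the function names and the threshold of 100 are illustrative choices, not anything used in this thread. Note that the kernel truncates the name in /proc/<pid>/stat to 15 characters.

```python
import os
from collections import Counter


def parse_stat_name(stat_line):
    """Extract the command name from one /proc/<pid>/stat line.

    The name sits between the first '(' and the last ')'.
    """
    start = stat_line.index("(") + 1
    end = stat_line.rindex(")")
    return stat_line[start:end]


def read_process_names(proc_root="/proc"):
    """Return the command names of all processes visible under proc_root."""
    names = []
    if not os.path.isdir(proc_root):
        return names  # no procfs available on this system
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as handle:
                names.append(parse_stat_name(handle.read()))
        except (OSError, ValueError):
            continue  # the process exited, or its stat line was unreadable
    return names


def runaway_processes(names, threshold=100):
    """Map each command name seen at least `threshold` times to its count."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n >= threshold}


if __name__ == "__main__":
    for name, n in sorted(runaway_processes(read_process_names()).items()):
        print(f"{name}: {n} processes")
```

Running it periodically while the suspect program is active would show whether one command name keeps multiplying.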
On Sunday 25 June 2006 22:31, Dimitar Georgiev Popov wrote:
Hi, again.
The problem turned out to be a buggy program, namely ktorrent. It worked well until yesterday, but now it opens several HUNDRED kio_http processes. I updated the system yesterday, and probably some of the libs have changed the program's behavior. Anyway, many thanks to you both, Grahame and Ralph.
Greetings, Mitko
Hi,

Well, the problem is not ktorrent. I don't use it, but the hangs remain. I noticed that the system hangs whenever there is a high network load (e.g. when torrents are running). So I suspected the NIC (Realtek chip). I replaced the NIC with a 3Com 3c905C-TX which works just fine on another system. But in vain: more hangs. Once, I managed to save the syslog before the system hung, and I found this:

07/08/2006 05:24:47 PM myhost kernel in_atomic():1, irqs_disabled():0
07/08/2006 05:24:47 PM myhost kernel <ffffffff8832543a>{:nvidia:_nv004467rm+38} <ffffffff80209e2e>{acpi_pci_allocate_irq+36}
07/08/2006 05:24:47 PM myhost kernel <ffffffff881b36e7>{:nvidia:_nv002125rm+39} <ffffffff8815304e>{:3c59x:boomerang_interrupt+733}
07/08/2006 05:24:47 PM myhost kernel <ffffffff881517eb>{:3c59x:vortex_up+172} <ffffffff88152c9b>{:3c59x:vortex_error+666}
07/08/2006 05:24:47 PM myhost kernel <ffffffff80404259>{_sinittext+601}
07/08/2006 05:24:47 PM myhost kernel <ffffffff80209f87>{acpi_pci_irq_enable+167} <ffffffff80209e0a>{acpi_pci_allocate_irq+0}
07/08/2006 05:24:47 PM myhost kernel <ffffffff801d38ad>{pci_enable_device_bars+39} <ffffffff801d38cd>{pci_enable_device+19}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8015f4fb>{kmem_cache_free+57} <ffffffff80134976>{__rcu_process_callbacks+295}
07/08/2006 05:24:47 PM myhost kernel <ffffffff801452a4>{handle_IRQ_event+41} <ffffffff80145342>{__do_IRQ+109}
07/08/2006 05:24:47 PM myhost kernel <ffffffff80134a1b>{rcu_process_callbacks+25} <ffffffff8012da5f>{process_timeout+0}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8012a70d>{tasklet_action+61} <ffffffff8012a7e0>{__do_softirq+70}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8011016e>{nommu_map_sg+0} <ffffffff80109619>{default_idle+42}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8010c2a7>{do_IRQ+64} <ffffffff801095ef>{default_idle+0}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8010c2a2>{do_IRQ+59} <ffffffff8010a918>{ret_from_intr+0}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8010b1aa>{call_softirq+30} <ffffffff8010c0a4>{do_softirq+44}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8010a918>{ret_from_intr+0} <EOI> <ffffffff802aa292>{thread_return+0}
07/08/2006 05:24:47 PM myhost kernel <ffffffff8010967e>{cpu_idle+61} <ffffffff8040477c>{start_kernel+456}
07/08/2006 05:24:47 PM myhost kernel eth2: Too much work in interrupt, status e003.
07/08/2006 05:24:47 PM myhost kernel eth2: PCI bus error, bus status 800000a0
07/08/2006 05:24:47 PM myhost kernel eth2: PCI bus error, bus status 80000020
...
07/08/2006 05:24:47 PM myhost kernel eth2: PCI bus error, bus status 80000020
07/08/2006 05:24:47 PM myhost kernel eth2: Host error, FIFO diagnostic register 8000.
07/08/2006 05:24:47 PM myhost kernel eth2: Host error, FIFO diagnostic register 0000.
...
07/08/2006 05:24:47 PM myhost kernel eth2: Host error, FIFO diagnostic register 0000.
07/08/2006 05:24:47 PM myhost kernel Debug: sleeping function called from invalid context at include/asm/semaphore.h:105
07/08/2006 05:24:47 PM myhost kernel Call Trace: <IRQ> <ffffffff802096f7>{acpi_pci_link_allocate_irq+169}
07/08/2006 05:24:47 PM myhost kernel ACPI: PCI Interrupt 0000:01:06.0[A] -> Link [LNKD] -> GSI 17 (level, high) -> IRQ 193
...
07/08/2006 05:24:47 PM myhost kernel ACPI: PCI Interrupt 0000:01:06.0[A] -> Link [LNKD] -> GSI 17 (level, high) -> IRQ 193

More info about eth2:

# lspci -vv
01:06.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
    Subsystem: 3Com Corporation 3C905C-TX Fast Etherlink for PC Management NIC
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
    Latency: 64 (2500ns min, 2500ns max), Cache Line Size 08
    Interrupt: pin A routed to IRQ 193
    Region 0: I/O ports at dc00 [size=128]
    Region 1: Memory at fa9ffc00 (32-bit, non-prefetchable) [size=128]
    Expansion ROM at fa9c0000 [disabled] [size=128K]
    Capabilities: [dc] Power Management version 2
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
        Status: D0 PME-Enable- DSel=0 DScale=2 PME-

# uname -a
Linux myhost 2.6.16.13-4-default #1 Wed May 3 04:53:23 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux

The mainboard is an Asus SK8N with nForce3 chipset. I don't know what other information could be helpful, but I'm ready to provide it.

Thanks for any help!

Greetings,
Mitko
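With a log this repetitive, it can help to tally the NIC driver messages by kind so the dominant error stands out. Below is a minimal Python sketch, assuming syslog lines in the same "... kernel ethN: message" layout as the excerpt above; the regular expression and function names are illustrative assumptions, and the sample lines are copied from that excerpt.

```python
import re
from collections import Counter

# Matches driver lines such as:
#   "... myhost kernel eth2: PCI bus error, bus status 80000020"
DRIVER_LINE = re.compile(r"kernel (?P<iface>eth\d+): (?P<message>.+)$")


def classify(message):
    """Reduce a driver message to its kind by dropping the trailing value.

    "PCI bus error, bus status 80000020" becomes "PCI bus error".
    """
    return message.split(",")[0].rstrip(".")


def tally_nic_errors(lines):
    """Count driver messages per (interface, kind) pair."""
    counts = Counter()
    for line in lines:
        match = DRIVER_LINE.search(line.strip())
        if match:
            key = (match.group("iface"), classify(match.group("message")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    prefix = "07/08/2006 05:24:47 PM myhost kernel "
    sample = [
        prefix + "in_atomic():1, irqs_disabled():0",
        prefix + "eth2: Too much work in interrupt, status e003.",
        prefix + "eth2: PCI bus error, bus status 800000a0",
        prefix + "eth2: PCI bus error, bus status 80000020",
        prefix + "eth2: Host error, FIFO diagnostic register 8000.",
    ]
    for (iface, kind), n in tally_nic_errors(sample).most_common():
        print(f"{iface}: {kind} x{n}")
```

To run it against a real log, the lines of /var/log/messages could be passed to tally_nic_errors in place of the sample list.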
On Saturday 08 July 2006 22:52, Dimitar Georgiev Popov wrote:
Well, the problem is not ktorrent. I don't use it, but the hangs remain. I noticed that the system hangs whenever there is a high network load (e.g. when torrents are running). So I suspected the NIC (Realtek chip). I replaced the NIC with a 3Com 3c905C-TX which works just fine on another system. But in vain: more hangs. Once, I managed to save the syslog before the system hung, and I found this:
register 0000. 07/08/2006 05:24:47 PM myhost kernel Debug: sleeping
[snip]
# uname -a
Linux myhost 2.6.16.13-4-default #1 Wed May 3 04:53:23 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
The mainboard is Asus SK8N with nForce3 chipset. I don't know what other information could be helpful, but I'm ready to provide it.
Try booting with irqpoll as a boot parameter. ACPI may be setting up the wrong interrupts. This seems related to the nForce3 chipset or ASUS.

It seems to have fixed my ASUS L5D laptop, which also has an nForce3 chipset and also had lockups under "high" interrupt (network) load.

-- 
/"\ Bernd Felsche - Innovative Reckoning, Perth, Western Australia
\ /  ASCII ribbon campaign | "Laws do not persuade just because
 X   against HTML mail     |  they threaten."
/ \  and postings          | Lucius Annaeus Seneca, c. 4BC - 65AD.
On Saturday 08 July 2006 18:56, Bernd Felsche wrote:
Try booting with irqpoll as a boot parameter. ACPI may be setting up the wrong interrupts. This seems related to the nForce3 chipset or ASUS.
It doesn't work: still the same messages in the syslog, still the same behavior :(

Greetings,
Mitko
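Before ruling irqpoll out, it may be worth confirming that the parameter actually reached the running kernel, since a typo in the bootloader entry would leave behavior unchanged. The following is a minimal Python sketch that inspects /proc/cmdline; the function names are illustrative, and the command lines in the usage note are hypothetical.

```python
def has_boot_parameter(cmdline, name):
    """Return True if `name` appears as a bare flag or as name=value."""
    for param in cmdline.split():
        if param == name or param.startswith(name + "="):
            return True
    return False


def read_cmdline(path="/proc/cmdline"):
    """Return the running kernel's command line, or "" if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    cmdline = read_cmdline()
    state = "present" if has_boot_parameter(cmdline, "irqpoll") else "absent"
    print(f"irqpoll is {state} in: {cmdline!r}")
```

For a hypothetical command line such as "root=/dev/sda2 irqpoll", has_boot_parameter returns True for "irqpoll". Matching whole parameters rather than substrings means a different option that merely contains the text, such as "noirqpoll", is not mistaken for it.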
On Sunday 09 July 2006 01:56, Bernd Felsche wrote:
On Saturday 08 July 2006 22:52, Dimitar Georgiev Popov wrote:
Well, the problem is not ktorrent. I don't use it, but the hangs remain. I noticed that the system hangs whenever there is a high network load (e.g. when torrents are running). So I suspected the NIC (Realtek chip). I replaced the NIC with a 3Com 3c905C-TX which works just fine on another system. But in vain: more hangs. Once, I managed to save the syslog before the system hung, and I found this:
register 0000. 07/08/2006 05:24:47 PM myhost kernel Debug: sleeping
[snip]
# uname -a
Linux myhost 2.6.16.13-4-default #1 Wed May 3 04:53:23 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
The mainboard is Asus SK8N with nForce3 chipset. I don't know what other information could be helpful, but I'm ready to provide it.
Try booting with irqpoll as a boot parameter. ACPI may be setting up the wrong interrupts. This seems related to the nForce3 chipset or ASUS.
It seems to have fixed my ASUS L5D laptop, which also has an nForce3 chipset and also had lockups under "high" interrupt (network) load.
I don't know if this is relevant, but my SuSE 10.0 hangs after using Konqueror on the internet for a while. If I don't connect to the internet I have no problems. (I haven't used my 10.1 machine on the internet.)

Regards,
Colin
participants (4)
- Bernd Felsche
- Colin Carter
- Dimitar Georgiev Popov
- Grahame Kelly