[Bug 429262] New: Machine reboots suddenly when r6xx/r7xx DRM initialized on x86_64 with 2.6.27 only
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c1

Summary: Machine reboots suddenly when r6xx/r7xx DRM initialized on x86_64 with 2.6.27 only
Product: openSUSE 11.1
Version: Beta 1
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: mhopf@novell.com
QAContact: qa@suse.de
CC: eich@novell.com, sndirsch@novell.com, lverhaegen@novell.com
Found By: ---

This machine (gkar) reboots suddenly without warning, and with no particular workload, when an Xserver is running with DRM enabled (DRM is a development version, not the one supplied with the kernel).

There is no DRM interaction going on at the time the machine reboots. Actually, it can reboot at any time, even when completely idle. Sometimes it reboots on Xserver start.

It doesn't occur on i686 hardware.
It doesn't occur with 11.0's kernel 2.6.25.16.
It doesn't occur with R5xx GPUs.

Since the same DRM module is used in all of these cases, I assume this has something to do with memory or interrupt management.
The following lines printed on DRM initialization caught my eye (in the crashing case):

Sep 23 19:53:37 gkar kernel: [drm] Initialized drm 1.1.0 20060810
Sep 23 19:53:37 gkar kernel: pci 0000:01:00.0: BAR 0: can't reserve mem region [0xd0000000-0xdfffffff]
Sep 23 19:53:37 gkar kernel: vendor=1022 device=9603
Sep 23 19:53:37 gkar kernel: pci 0000:01:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
Sep 23 19:53:37 gkar kernel: pci 0000:01:00.0: setting latency timer to 64
Sep 23 19:53:37 gkar kernel: [drm] Initialized radeon 1.29.0 20080613 on minor 0

With 2.6.25, the following is printed instead:

Sep 23 20:18:25 gkar kernel: [drm] Initialized drm 1.1.0 20060810
Sep 23 20:18:25 gkar kernel: PCI: Unable to reserve mem region #1:10000000@d0000000 for device 0000:01:00.0
Sep 23 20:18:25 gkar kernel: ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 18 (level, low) -> IRQ 18
Sep 23 20:18:25 gkar kernel: PCI: Setting latency timer of device 0000:01:00.0 to 64
Sep 23 20:18:25 gkar kernel: [drm] Initialized radeon 1.29.0 20080613 on minor 0

I need some advice on how to debug this issue.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
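The two "reserve mem region" messages above use different notations, so it can help to check whether they name the same address range. The sketch below parses both forms and compares them. It assumes the 2.6.27 line gives an inclusive [start-end] range and the 2.6.25 line gives size@start; the helper name `region_from_line` and the regular expressions are made up for this illustration and are not part of any kernel tool.

```python
import re

# Patterns for the two message formats quoted in the report.
# Assumption: 2.6.27 prints an inclusive [start-end] range,
# 2.6.25 prints size@start (both in hex).
NEW_STYLE = re.compile(r"\[0x([0-9a-f]+)-0x([0-9a-f]+)\]")
OLD_STYLE = re.compile(r":([0-9a-f]+)@([0-9a-f]+) for device")


def region_from_line(line):
    """Return (start, size) in bytes for either format, or None if no match."""
    m = NEW_STYLE.search(line)
    if m:
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        return start, end - start + 1  # end address is inclusive
    m = OLD_STYLE.search(line)
    if m:
        size, start = int(m.group(1), 16), int(m.group(2), 16)
        return start, size
    return None


line_2627 = "pci 0000:01:00.0: BAR 0: can't reserve mem region [0xd0000000-0xdfffffff]"
line_2625 = "PCI: Unable to reserve mem region #1:10000000@d0000000 for device 0000:01:00.0"

r27 = region_from_line(line_2627)
r25 = region_from_line(line_2625)
print(r27 == r25, hex(r27[0]), r27[1] // (1024 * 1024), "MiB")
```

Under these assumptions both kernels fail to reserve the same 256 MiB region at 0xd0000000, so the differing wording of this line alone would not distinguish the crashing case from the working one.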
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c1
--- Comment #1 from Matthias Hopf
> It doesn't occur on i686 hardware.
I mean it doesn't occur with a 32bit kernel.
https://bugzilla.novell.com/show_bug.cgi?id=429262
User sndirsch@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c2
--- Comment #2 from Stefan Dirsch
> (DRM is a development version, not the one supplied with the kernel).
How does it differ from the DRM in our kernel? Usually you don't get *any* help from our kernel developers if you don't use the one supplied with our kernel. :-(
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c3
--- Comment #3 from Matthias Hopf
https://bugzilla.novell.com/show_bug.cgi?id=429262
Christoph Thiel
https://bugzilla.novell.com/show_bug.cgi?id=429262
User sndirsch@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c4
Stefan Dirsch
https://bugzilla.novell.com/show_bug.cgi?id=429262
Christoph Thiel
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c5
Matthias Hopf
> This machine (gkar) reboots suddenly without warning, and with no particular
For the record, machine isn't named gkar any more.

More specs about the system:

CPUS: 4x

processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 4
model name : AMD Engineering Sample
stepping : 1
cpu MHz : 2600.183
cache size : 512 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt ts ttp tm stc 100mhzsteps hwpstate
bogomips : 5204.13
clflush size : 64

lspci:

00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 5)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [IDE mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3a)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h [Opteron, Athlon64, Sempron] Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3200 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5755 Gigabit Ethernet PCI Express (rev 02)
03:05.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
https://bugzilla.novell.com/show_bug.cgi?id=429262
User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c6
Greg Kroah-Hartman
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c7
Matthias Hopf
> Is the "new" drm code properly asking for the interrupt from the kernel core?
The radeon DRM always sets an IRQ handler:

    .irq_handler = radeon_driver_irq_handler,

It is set up for software IRQs (raised by the command processor when a command in the ring asks for an IRQ) and, possibly, for vertical blanks, though the latter are turned off if no DRI client is active.

I don't know exactly how the preinstall, postinstall, and uninstall functions of a device driver are called. But the card doesn't issue any IRQs, because the reboot happens when the system is perfectly idle.
> It looks like the message at boot is different, which is very odd.
I think I loaded drm with debug=1. That might make a difference.
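On the question of how the preinstall, postinstall and uninstall functions are called: the following is only a user-space mock, not kernel code. It assumes the DRM core calls the driver's preinstall hook, then registers the handler with the kernel, then calls postinstall, and on teardown calls uninstall before releasing the IRQ. That ordering should be verified against drm_irq.c in the tree actually being used. The `MockDriver` and `MockCore` classes are hypothetical names invented for this sketch.

```python
class MockDriver:
    """Hypothetical stand-in for a DRM driver's IRQ callbacks."""

    def __init__(self):
        self.calls = []  # records the order in which hooks run

    def irq_preinstall(self):
        # A real driver would typically mask and clear pending interrupts here.
        self.calls.append("irq_preinstall")

    def irq_postinstall(self):
        # A real driver would typically enable the interrupt sources it wants.
        self.calls.append("irq_postinstall")

    def irq_uninstall(self):
        # A real driver would typically disable its interrupt sources again.
        self.calls.append("irq_uninstall")

    def irq_handler(self):
        self.calls.append("irq_handler")


class MockCore:
    """Hypothetical model of the core's install/uninstall sequence (assumed order)."""

    def __init__(self, driver):
        self.driver = driver
        self.handler = None

    def irq_install(self):
        self.driver.irq_preinstall()
        self.handler = self.driver.irq_handler  # stands in for request_irq()
        self.driver.irq_postinstall()

    def raise_irq(self):
        # Only delivers the interrupt if a handler is currently registered.
        if self.handler is not None:
            self.handler()

    def irq_uninstall(self):
        self.driver.irq_uninstall()
        self.handler = None  # stands in for free_irq()


driver = MockDriver()
core = MockCore(driver)
core.raise_irq()      # no handler registered yet, nothing is recorded
core.irq_install()
core.raise_irq()
core.irq_uninstall()
core.raise_irq()      # handler released again, nothing is recorded
print(driver.calls)
```

The point of the mock is the window between registering the handler and postinstall: if the hardware can raise an interrupt there, the handler has to cope with a device that is not fully set up yet. Whether that matters for this bug is not known from the report.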
https://bugzilla.novell.com/show_bug.cgi?id=429262
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=429262
User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c8
Greg Kroah-Hartman
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c9
--- Comment #9 from Matthias Hopf
https://bugzilla.novell.com/show_bug.cgi?id=429262
User mhopf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c10
Matthias Hopf
https://bugzilla.novell.com/show_bug.cgi?id=429262
User gregkh@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=429262#c11
--- Comment #11 from Greg Kroah-Hartman