[Bug 1185516] New: Radeon driver from xf86-video-ati-19.1.0-3.1.x86_64 crashes, black screen, video issues
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516

            Bug ID: 1185516
           Summary: Radeon driver from xf86-video-ati-19.1.0-3.1.x86_64
                    crashes, black screen, video issues
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: openSUSE Tumbleweed
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: X.Org
          Assignee: gfx-bugs@suse.de
          Reporter: bob@muhlenberg.edu
        QA Contact: gfx-bugs@suse.de
          Found By: ---
           Blocker: ---

Created attachment 848930
  --> http://bugzilla.opensuse.org/attachment.cgi?id=848930&action=edit
dmesg showing exception in radeon driver code

Hardware is HPE DL-390 Gen 8 with two Radeon HD 6450 boards. This setup has been working with Tumbleweed for a couple of years or more, with no issues from previous "zypper dup" updates, which happen periodically.

Performed an update today. The machine boots into GRUB and shows the usual Tumbleweed "spinner" on both monitors. Then, when the display manager starts, one card presents no video (no HDMI signal); the other, which is the primary display, presents a black screen with no cursor or pointer.

At this point the console is unresponsive. Ctrl-Backspace x 2 does not restart the GUI, nor does it generate the system bell "Beep!".

On sshing into the system there is a defunct Xorg.bin process, which can usually be hard killed to get a crippled TTY on the primary video display. In this state text is displayed, but there is no local echo of characters being typed; e.g. you can type "ls" and a directory listing without carriage returns is displayed. Using "clear" to reset the terminal does clear the screen, but the non-echo issue remains.

Sometimes the process is not in a defunct state, e.g.

/usr/bin/Xorg.bin :1 vt1 -keeptty -auth /root/.serverauth.3742 -nolisten tcp -nolisten tcp

which is not killable. A TTY appears, but it has no cursor and is not responsive.

To regain a TTY on the console a "reboot --force" is necessary, as a normal reboot / halt / shutdown etc. never completes, and power cycling the system then becomes necessary.

On setting the default runlevel to 3 and rebooting, the system presents a normal TTY console. On logging in and using startx, bypassing the GUI login, the same issue is observed: one screen loses HDMI signal, the other presents a black screen.

Anyway, that's the narrative version. On to the logs.

dmesg often reports a crash in the radeon driver as follows:

[ 61.341857] radeon 0000:04:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=io+mem
[ 61.341862] radeon 0000:21:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 61.995557] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 61.995569] #PF: supervisor read access in kernel mode
[ 61.995572] #PF: error_code(0x0000) - not-present page
[ 61.995574] PGD 0 P4D 0
[ 61.995578] Oops: 0000 [#1] SMP PTI
[ 61.995582] CPU: 19 PID: 2777 Comm: Xorg.bin Tainted: G S I 5.12.0-1-default #1 openSUSE Tumbleweed
[ 61.995587] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 01/22/2018
[ 61.995590] RIP: 0010:radeon_gart_bind+0x3c/0xf0 [radeon]
[ 61.995639] Code: 08 80 bf 98 04 00 00 00 0f 84 b3 00 00 00 c1 ee 0c 48 89 fd 45 89 ce 49 89 cf 8d 04 32 89 f3 4d 89 c5 89 44 24 04 85 d2 7e 66 <49> 8b 17 48 8b 85 88 04 00 00 41 89 dc 44 89 f6 4a 89 14 e0 48 8b
[ 61.995643] RSP: 0018:ffffaad9495cfa40 EFLAGS: 00010202
[ 61.995646] RAX: 0000000000000a8d RBX: 00000000000002a4 RCX: 0000000000000000
[ 61.995648] RDX: 00000000000007e9 RSI: 00000000000002a4 RDI: ffff96e6c9cdc000
[ 61.995650] RBP: ffff96e6c9cdc000 R08: ffff96df4f358000 R09: 000000000000000f
[ 61.995652] R10: 0000000000000000 R11: fffff053843cd708 R12: ffffaad9495cfb50
[ 61.995654] R13: ffff96df4f358000 R14: 000000000000000f R15: 0000000000000000
[ 61.995656] FS: 00007f8bbc53a940(0000) GS:ffff96e69f8c0000(0000) knlGS:0000000000000000
[ 61.995659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 61.995661] CR2: 0000000000000000 CR3: 0000000201bfc003 CR4: 00000000001706e0
[ 61.995664] Call Trace:
[ 61.995668] radeon_bo_move+0x374/0x6a0 [radeon]
[ 61.995704] ttm_bo_handle_move_mem+0x90/0x170 [ttm]
[ 61.995711] ttm_bo_validate+0x14d/0x180 [ttm]
[ 61.995717] ttm_bo_init_reserved+0x18e/0x310 [ttm]
[ 61.995722] ttm_bo_init+0x64/0xd0 [ttm]
[ 61.995726] ? radeon_update_memory_usage.isra.0+0x40/0x40 [radeon]
[ 61.995748] radeon_bo_create+0x184/0x200 [radeon]
[ 61.995770] ? radeon_update_memory_usage.isra.0+0x40/0x40 [radeon]
[ 61.995791] radeon_gem_prime_import_sg_table+0x5e/0xf0 [radeon]
[ 61.995824] drm_gem_prime_import_dev.part.0+0x63/0xc0 [drm]
[ 61.995863] drm_gem_prime_fd_to_handle+0x196/0x1d0 [drm]
[ 61.995883] ? drm_prime_destroy_file_private+0x20/0x20 [drm]
[ 61.995902] drm_ioctl_kernel+0xaa/0xf0 [drm]
[ 61.995920] drm_ioctl+0x202/0x3b0 [drm]
[ 61.995937] ? drm_prime_destroy_file_private+0x20/0x20 [drm]
[ 61.995957] ? new_sync_write+0x11c/0x1b0
[ 61.995963] radeon_drm_ioctl+0x49/0x80 [radeon]
[ 61.995986] __x64_sys_ioctl+0x83/0xb0
[ 61.995992] do_syscall_64+0x33/0x80
[ 61.995998] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 61.996004] RIP: 0033:0x7f8bbca520bb
[ 61.996008] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 bd 0c 00 f7 d8 64 89 01 48
[ 61.996012] RSP: 002b:00007ffefa013de8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 61.996016] RAX: ffffffffffffffda RBX: 00007ffefa013e2c RCX: 00007f8bbca520bb
[ 61.996019] RDX: 00007ffefa013e2c RSI: 00000000c00c642e RDI: 0000000000000015
[ 61.996022] RBP: 00000000c00c642e R08: 00007ffefa013ed0 R09: 00007f8bbcb1ea60
[ 61.996025] R10: 00007f8bbb0012a0 R11: 0000000000000246 R12: 00005640a0d77fc0
[ 61.996028] R13: 0000000000000015 R14: 0000000000100000 R15: 00007ffefa0145e0
[ 61.996032] Modules linked in: af_packet nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nf_log_ipv6 nf_log_ipv4 nf_log_common nft_log nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security bridge stp llc iscsi_ibft iscsi_boot_sysfs ip_set nfnetlink ebtable_filter ebtables rfkill ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter dmi_sysfs ocrdma ib_uverbs ib_core intel_rapl_msr iTCO_wdt intel_pmc_bxt iTCO_vendor_support ipmi_ssif intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass pcspkr joydev be2net hpwdt hpilo lpc_ich snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_hwdep
[ 61.996073] snd_pcm ioatdma snd_timer tg3 dca snd acpi_ipmi libphy soundcore ipmi_si thermal ipmi_devintf ipmi_msghandler tiny_power_button button fuse configfs hid_logitech_hidpp hid_logitech_dj hid_generic usbhid ata_generic radeon i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core uhci_hcd ehci_pci crct10dif_pclmul crc32_pclmul ehci_hcd crc32c_intel ghash_clmulni_intel drm aesni_intel usbcore crypto_simd cryptd hpsa ata_piix serio_raw scsi_transport_sas sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr
[ 61.996122] CR2: 0000000000000000
[ 61.996125] ---[ end trace 65e096e6c12aea74 ]---
[ 62.006624] RIP: 0010:radeon_gart_bind+0x3c/0xf0 [radeon]
[ 62.006681] Code: 08 80 bf 98 04 00 00 00 0f 84 b3 00 00 00 c1 ee 0c 48 89 fd 45 89 ce 49 89 cf 8d 04 32 89 f3 4d 89 c5 89 44 24 04 85 d2 7e 66 <49> 8b 17 48 8b 85 88 04 00 00 41 89 dc 44 89 f6 4a 89 14 e0 48 8b
[ 62.006686] RSP: 0018:ffffaad9495cfa40 EFLAGS: 00010202
[ 62.006689] RAX: 0000000000000a8d RBX: 00000000000002a4 RCX: 0000000000000000
[ 62.006691] RDX: 00000000000007e9 RSI: 00000000000002a4 RDI: ffff96e6c9cdc000
[ 62.006694] RBP: ffff96e6c9cdc000 R08: ffff96df4f358000 R09: 000000000000000f
[ 62.006697] R10: 0000000000000000 R11: fffff053843cd708 R12: ffffaad9495cfb50
[ 62.006699] R13: ffff96df4f358000 R14: 000000000000000f R15: 0000000000000000
[ 62.006702] FS: 00007f8bbc53a940(0000) GS:ffff96e69f8c0000(0000) knlGS:0000000000000000
[ 62.006705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 62.006708] CR2: 0000000000000000 CR3: 0000000201bfc003 CR4: 00000000001706e0

(Will add full log files.)

/var/log/Xorg.N.log seems unremarkable and does not show an error. However, it's not clear whether it actually gets to log anything after the driver croaks.

During troubleshooting, in one attempt all X drivers except radeon were disabled, and the same issue occurred, so it did not appear to be an issue with X probing for drivers.

We also tried removing one of the two cards; the issue persisted with one card present. We also tried swapping the cards. The issue persisted with any combination of cards or with either card used individually. The cards themselves are in risers; we swapped the risers too.

We have an identical server without radeon cards, and X has no issue there with the latest code from zypper dup.

(Gotta go, will upload X logs soon or anything else someone wants.)

-- 
You are receiving this mail because:
You are on the CC list for the bug.
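For anyone triaging a similar oops, here is a minimal sketch (not part of the original report) of pulling the faulting kernel symbol and the reliable call-trace frames out of saved dmesg text. The function name parse_oops and the SAMPLE excerpt are illustrative assumptions; the regular expressions only target the "symbol+0xOFF/0xLEN [module]" layout seen in the log above and may need adjusting for other kernel versions.

```python
import re

# Leading "[ 61.995590] " style timestamp emitted by dmesg.
STAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
# "symbol+0xOFF/0xLEN [module]"; the module part is absent for core-kernel symbols.
FRAME = re.compile(
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)


def parse_oops(dmesg_text):
    """Return the first kernel-mode RIP symbol and the reliable call-trace frames."""
    rip = None
    trace = []
    in_trace = False
    for raw in dmesg_text.splitlines():
        line = STAMP.sub("", raw.strip())
        if rip is None and line.startswith("RIP: 0010:"):
            # Code segment 0010 marks the kernel-mode instruction pointer.
            m = FRAME.search(line)
            if m:
                rip = (m["sym"], m["mod"])
        elif line.startswith("Call Trace:"):
            in_trace = True
        elif in_trace:
            # Frames prefixed with "? " are unreliable guesses; skip them.
            unreliable = line.startswith("? ")
            m = FRAME.fullmatch(line[2:] if unreliable else line)
            if m is None:
                in_trace = False  # first non-frame line ends the trace
            elif not unreliable:
                trace.append((m["sym"], m["mod"]))
    return {"rip": rip, "trace": trace}


# Hypothetical excerpt assembled from lines of the log above.
SAMPLE = """\
[ 61.995590] RIP: 0010:radeon_gart_bind+0x3c/0xf0 [radeon]
[ 61.995664] Call Trace:
[ 61.995668] radeon_bo_move+0x374/0x6a0 [radeon]
[ 61.995704] ttm_bo_handle_move_mem+0x90/0x170 [ttm]
[ 61.995726] ? radeon_update_memory_usage.isra.0+0x40/0x40 [radeon]
[ 61.995986] __x64_sys_ioctl+0x83/0xb0
[ 61.996004] RIP: 0033:0x7f8bbca520bb
"""

if __name__ == "__main__":
    result = parse_oops(SAMPLE)
    print("faulting symbol:", result["rip"])
    for sym, mod in result["trace"]:
        print("  ", sym, f"[{mod}]" if mod else "")
```

Run against the full dmesg attachment, this kind of summary makes it easier to compare oopses across boots and to see whether the fault is always in the same function.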
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c1
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c2
--- Comment #2 from Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c3
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c4
--- Comment #4 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c5
--- Comment #5 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c6
--- Comment #6 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c7
--- Comment #7 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c8
Robert Mahar
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c12
--- Comment #12 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c13
--- Comment #13 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c14
--- Comment #14 from Robert Mahar
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c15
--- Comment #15 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c16
--- Comment #16 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c18
Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1185516#c19
Takashi Iwai