[Bug 1213578] New: OOPS in amdgpu
https://bugzilla.suse.com/show_bug.cgi?id=1213578

Bug ID: 1213578
Summary: OOPS in amdgpu
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 15.5
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: aj@suse.com
QA Contact: qa-bugs@suse.de
Target Milestone: ---
Found By: ---
Blocker: ---

I get an Oops with both 5.14.21-150500.55.7-default and Takashi's 5.14.21-150500.3.g62ee467-default.

--
You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1213578

Andreas Jaeger <aj@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|kernel-bugs@opensuse.org    |tiwai@suse.com
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c1

--- Comment #1 from Andreas Jaeger <aj@suse.com> ---
Created attachment 868390
  --> https://bugzilla.suse.com/attachment.cgi?id=868390&action=edit
the two oops I could find in /var/log/messages
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c2

--- Comment #2 from Andreas Jaeger <aj@suse.com> ---
hwinfo --gfxcard
31: PCI 500.0: 0300 VGA compatible controller (VGA)
  [Created at pci.386]
  Unique ID: Ddhb.uZbpCsxmrO5
  Parent ID: JZZT.nyyq4tDu6x8
  SysFS ID: /devices/pci0000:00/0000:00:08.1/0000:05:00.0
  SysFS BusID: 0000:05:00.0
  Hardware Class: graphics card
  Model: "ATI Picasso"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x15d8 "Picasso"
  SubVendor: pci 0x17aa "Lenovo"
  SubDevice: pci 0x5127
  Revision: 0xd1
  Driver: "amdgpu"
  Driver Modules: "amdgpu"
  Memory Range: 0xc0000000-0xcfffffff (ro,non-prefetchable)
  Memory Range: 0xd0000000-0xd01fffff (ro,non-prefetchable)
  I/O Ports: 0x1000-0x1fff (rw)
  Memory Range: 0xd0500000-0xd057ffff (rw,non-prefetchable)
  IRQ: 50 (no events)
  Module Alias: "pci:v00001002d000015D8sv000017AAsd00005127bc03sc00i00"
  Driver Info #0:
    Driver Status: amdgpu is active
    Driver Activation Cmd: "modprobe amdgpu"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (PCI bridge)

Primary display adapter: #31

# hwinfo --monitor
35: None 00.0: 10002 LCD Monitor
  [Created at monitor.125]
  Unique ID: rdCR.mQXMLz_WQq5
  Parent ID: Ddhb.uZbpCsxmrO5
  Hardware Class: monitor
  Model: "AUO LCD Monitor"
  Vendor: AUO "AUO"
  Device: eisa 0x573d
  Serial ID: "0"
  Resolution: 1920x1080@60Hz
  Size: 309x174 mm
  Year of Manufacture: 2018
  Week of Manufacture: 0
  Detailed Timings #0:
    Resolution: 1920x1080
    Horizontal: 1920 1936 1952 2080 (+16 +32 +160) -hsync
    Vertical: 1080 1083 1088 1142 (+3 +8 +62) -vsync
    Frequencies: 142.60 MHz, 68.56 kHz, 60.03 Hz
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (VGA compatible controller)

36: None 01.0: 10002 LCD Monitor
  [Created at monitor.125]
  Unique ID: wkFv.zdQ3vHfjlr1
  Parent ID: Ddhb.uZbpCsxmrO5
  Hardware Class: monitor
  Model: "DELL U2419H"
  Vendor: DEL "DELL"
  Device: eisa 0x4148 "DELL U2419H"
  Serial ID: "5ZC7SS2"
  Resolution: 720x400@70Hz
  Resolution: 640x480@60Hz
  Resolution: 640x480@75Hz
  Resolution: 800x600@60Hz
  Resolution: 800x600@75Hz
  Resolution: 1024x768@60Hz
  Resolution: 1024x768@75Hz
  Resolution: 1280x1024@75Hz
  Resolution: 1152x864@75Hz
  Resolution: 1280x1024@60Hz
  Resolution: 1600x900@60Hz
  Resolution: 1920x1080@60Hz
  Size: 527x296 mm
  Year of Manufacture: 2019
  Week of Manufacture: 44
  Detailed Timings #0:
    Resolution: 1920x1080
    Horizontal: 1920 2008 2052 2200 (+88 +132 +280) +hsync
    Vertical: 1080 1084 1089 1125 (+4 +9 +45) +vsync
    Frequencies: 148.50 MHz, 67.50 kHz, 60.00 Hz
  Driver Info #0:
    Max. Resolution: 1920x1080
    Vert. Sync Range: 56-76 Hz
    Hor. Sync Range: 30-83 kHz
    Bandwidth: 148 MHz
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (VGA compatible controller)

37: None 02.0: 10002 LCD Monitor
  [Created at monitor.125]
  Unique ID: +rIN.8N48X7gRWVA
  Parent ID: Ddhb.uZbpCsxmrO5
  Hardware Class: monitor
  Model: "DELL U2414H"
  Vendor: DEL "DELL"
  Device: eisa 0xa0b2 "DELL U2414H"
  Serial ID: "X4J717CQ18UL"
  Resolution: 720x400@70Hz
  Resolution: 640x480@60Hz
  Resolution: 640x480@75Hz
  Resolution: 800x600@60Hz
  Resolution: 800x600@75Hz
  Resolution: 1024x768@60Hz
  Resolution: 1024x768@75Hz
  Resolution: 1280x1024@75Hz
  Resolution: 1152x864@75Hz
  Resolution: 1280x1024@60Hz
  Resolution: 1600x900@60Hz
  Resolution: 1600x1200@60Hz
  Resolution: 1920x1080@60Hz
  Size: 527x296 mm
  Year of Manufacture: 2017
  Week of Manufacture: 52
  Detailed Timings #0:
    Resolution: 1920x1080
    Horizontal: 1920 2008 2052 2200 (+88 +132 +280) +hsync
    Vertical: 1080 1084 1089 1125 (+4 +9 +45) +vsync
    Frequencies: 148.50 MHz, 67.50 kHz, 60.00 Hz
  Driver Info #0:
    Max. Resolution: 1920x1080
    Vert. Sync Range: 56-76 Hz
    Hor. Sync Range: 30-83 kHz
    Bandwidth: 148 MHz
  Config Status: cfg=new, avail=yes, need=no, active=unknown
  Attached to: #25 (VGA compatible controller)
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c3

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aj@suse.com
              Flags|                            |needinfo?(aj@suse.com)

--- Comment #3 from Takashi Iwai <tiwai@suse.com> ---
Thanks. This looks like the upstream issue
https://gitlab.freedesktop.org/drm/amd/-/issues/2314

I'm building yet another test kernel with some backports in OBS home:tiwai:bsc1213578. Please give it a try once the build finishes.
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c4

--- Comment #4 from Takashi Iwai <tiwai@suse.com> ---
I'm also building two more test kernels, in the OBS home:tiwai:bsc1213578-2 and home:tiwai:bsc1213578-3 repos.

The first one contains another upstream fix; please test it anyway to check whether it introduces further regressions.

The latter one contains a downstream fix for NULL dereferences, and it should at least work around the Oops. If the previous two kernels don't work, please check this one. If this is the only one that works, I'll add this workaround to the next update.
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c6

--- Comment #6 from Andreas Jaeger <aj@suse.com> ---
Thanks, Takashi! Waiting for the builds now...
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c7

--- Comment #7 from Andreas Jaeger <aj@suse.com> ---
Booted kernel-default-5.14.21-150500.1.1.g0e39bed.x86_64 from https://build.opensuse.org/repositories/home:tiwai:bsc1213578 - crashed when starting X11. No oops after reboot.

Now to the next one...
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c8

--- Comment #8 from Andreas Jaeger <aj@suse.com> ---
I meant: No OOPS found in /var/log/messages
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c9

--- Comment #9 from Andreas Jaeger <aj@suse.com> ---
Created attachment 868394
  --> https://bugzilla.suse.com/attachment.cgi?id=868394&action=edit
dmesg from home:tiwai:bsc1213578-2

home:tiwai:bsc1213578-2 crashed when connecting external monitors; attaching dmesg output.
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c10

Andreas Jaeger <aj@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(aj@suse.com)      |

--- Comment #10 from Andreas Jaeger <aj@suse.com> ---
Created attachment 868395
  --> https://bugzilla.suse.com/attachment.cgi?id=868395&action=edit
dmesg from home:tiwai:bsc1213578-3

home:tiwai:bsc1213578-3 produces an OOPS as well, see dmesg attachment.

BUT: I report this now from the system with two external monitors attached, so it recovered. I booted up without external monitors and then connected them.

$ uname -a
Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c11

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?

--- Comment #11 from Takashi Iwai <tiwai@suse.com> ---
(In reply to Andreas Jaeger from comment #10)
> Created attachment 868395 [details]
> dmesg from home:tiwai:bsc1213578-3
>
> home:tiwai:bsc1213578-3 produces an OOPS as well, see dmesg attachment.

Those are not real crashes, just kernel WARNINGs from ASSERT() macros. They still need to be fixed, of course.

> BUT: I report this now from the system with two external monitors attached,
> so it recovered. I booted up without external monitors and then connected
> them.
>
> $ uname -a
> Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul
> 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux

So, how does the *-3 kernel behave apart from those kernel warnings? Does it still show any other breakage?
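The distinction drawn in comment #11 between a real Oops and a kernel WARNING can be checked mechanically when reading dmesg or /var/log/messages. The following is only a minimal sketch: the regular expressions are assumptions based on message prefixes commonly seen in kernel logs, and the names `classify_kernel_line` and `summarize` are hypothetical helpers, not part of any tool mentioned in this bug.

```python
import re

# Assumed markers of a real Oops (commonly seen kernel log prefixes).
OOPS_PATTERNS = (
    re.compile(r"\bOops: "),
    re.compile(r"BUG: kernel NULL pointer dereference"),
    re.compile(r"BUG: unable to handle"),
)

# Assumed marker of a WARN()-style report, which does not kill the task.
WARNING_PATTERN = re.compile(r"WARNING: CPU: \d+ PID: \d+ at ")


def classify_kernel_line(line: str) -> str:
    """Return 'oops', 'warning', or 'other' for a single log line."""
    if any(pattern.search(line) for pattern in OOPS_PATTERNS):
        return "oops"
    if WARNING_PATTERN.search(line):
        return "warning"
    return "other"


def summarize(lines) -> dict:
    """Count how many lines fall into each category."""
    counts = {"oops": 0, "warning": 0, "other": 0}
    for line in lines:
        counts[classify_kernel_line(line)] += 1
    return counts
```

Feeding the lines of a saved dmesg file to `summarize` gives a quick count of Oops headers versus WARNING headers. A kernel that only reports warnings would show a zero `oops` count under these assumed patterns.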
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c12

Andreas Jaeger <aj@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?                   |

--- Comment #12 from Andreas Jaeger <aj@suse.com> ---
The latest kernel initially had a network connection problem, and gnome-shell started without any extensions, which I was later able to enable. After that, I worked without problems for an hour until I rebooted.

I don't know whether the network and gnome-shell problems were related to the kernel. Let me try that kernel again...
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c13

--- Comment #13 from Andreas Jaeger <aj@suse.com> ---
Rebooted, all fine. Will use it for the next 2 hours and report if any problems arise.

No OOPS/assert - booted this time with external monitors attached directly.

uname -a
Linux t495s 5.14.21-150500.1.g06f3d0e-default #1 SMP PREEMPT_DYNAMIC Mon Jul 24 08:36:58 UTC 2023 (06f3d0e) x86_64 x86_64 x86_64 GNU/Linux
https://bugzilla.suse.com/show_bug.cgi?id=1213578

Matthias Eckermann <mge@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mge@suse.com
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c22

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(aj@suse.com)

--- Comment #22 from Takashi Iwai <tiwai@suse.com> ---
Are there any more bugs to be fixed with the latest SLE15-SP5 kernel? (Ideally, check with the kernel in the OBS Kernel:SLE15-SP5 repo.)

If yes, could you elaborate on how to trigger them?
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c23

--- Comment #23 from Andreas Jaeger <aj@suse.com> ---
OK, I downloaded the kernel from OBS Kernel:SLE15-SP5; uname -a reports:

Linux t495s 5.14.21-150500.158.g6eb8d8a-default #1 SMP PREEMPT_DYNAMIC Thu Aug 3 12:29:06 UTC 2023 (6eb8d8a) x86_64 x86_64 x86_64 GNU/Linux

It booted up fine. I'll run it for some time now and then report back.

Thanks, Takashi!
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c24

Andreas Jaeger <aj@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(aj@suse.com)      |

--- Comment #24 from Andreas Jaeger <aj@suse.com> ---
Still looking fine!
https://bugzilla.suse.com/show_bug.cgi?id=1213578
https://bugzilla.suse.com/show_bug.cgi?id=1213578#c25

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #25 from Takashi Iwai <tiwai@suse.com> ---
OK, then let's close now. Feel free to reopen if you hit the same bug (but maybe better to open another entry as it can be a different problem).
https://bugzilla.suse.com/show_bug.cgi?id=1213578 https://bugzilla.suse.com/show_bug.cgi?id=1213578#c33 --- Comment #33 from Maintenance Automation <maint-coord+maintenance-robot@suse.de> --- SUSE-SU-2023:3302-1: An update that solves 28 vulnerabilities, contains two features and has 115 fixes can now be installed. Category: security (important) Bug References: 1150305, 1187829, 1193629, 1194869, 1206418, 1207129, 1207894, 1207948, 1208788, 1210335, 1210565, 1210584, 1210627, 1210780, 1210825, 1210853, 1211014, 1211131, 1211243, 1211738, 1211811, 1211867, 1212051, 1212256, 1212265, 1212301, 1212445, 1212456, 1212502, 1212525, 1212603, 1212604, 1212685, 1212766, 1212835, 1212838, 1212842, 1212846, 1212848, 1212861, 1212869, 1212892, 1212901, 1212905, 1212961, 1213010, 1213011, 1213012, 1213013, 1213014, 1213015, 1213016, 1213017, 1213018, 1213019, 1213020, 1213021, 1213024, 1213025, 1213032, 1213034, 1213035, 1213036, 1213037, 1213038, 1213039, 1213040, 1213041, 1213059, 1213061, 1213087, 1213088, 1213089, 1213090, 1213092, 1213093, 1213094, 1213095, 1213096, 1213098, 1213099, 1213100, 1213102, 1213103, 1213104, 1213105, 1213106, 1213107, 1213108, 1213109, 1213110, 1213111, 1213112, 1213113, 1213114, 1213116, 1213134, 1213167, 1213205, 1213206, 1213226, 1213233, 1213245, 1213247, 1213252, 1213258, 1213259, 1213263, 1213264, 1213272, 1213286, 1213287, 1213304, 1213417, 1213493, 1213523, 1213524, 1213533, 1213543, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213705, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872 CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-1829, CVE-2023-20569, CVE-2023-20593, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-2430, CVE-2023-2985, CVE-2023-3090, CVE-2023-31083, CVE-2023-3111, CVE-2023-3117, CVE-2023-31248, CVE-2023-3212, CVE-2023-3268, CVE-2023-3389, CVE-2023-3390, CVE-2023-35001, CVE-2023-3567, 
CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-3812, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004 Jira References: PED-4718, PED-4758 Sources used: openSUSE Leap 15.5 (src): kernel-livepatch-SLE15-SP5-RT_Update_3-1-150500.11.5.1, kernel-syms-rt-5.14.21-150500.13.11.1, kernel-source-rt-5.14.21-150500.13.11.1 SUSE Linux Enterprise Live Patching 15-SP5 (src): kernel-livepatch-SLE15-SP5-RT_Update_3-1-150500.11.5.1 SUSE Real Time Module 15-SP5 (src): kernel-syms-rt-5.14.21-150500.13.11.1, kernel-source-rt-5.14.21-150500.13.11.1 NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1213578 https://bugzilla.suse.com/show_bug.cgi?id=1213578#c34 --- Comment #34 from Maintenance Automation <maint-coord+maintenance-robot@suse.de> --- SUSE-SU-2023:3311-1: An update that solves 15 vulnerabilities and has 27 fixes can now be installed. Category: security (important) Bug References: 1206418, 1207129, 1207948, 1210627, 1210780, 1210825, 1211131, 1211738, 1211811, 1212445, 1212502, 1212604, 1212766, 1212901, 1213167, 1213272, 1213287, 1213304, 1213417, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872 CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-20569, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-31083, CVE-2023-3268, CVE-2023-3567, CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004 Sources used: openSUSE Leap 15.5 (src): kernel-syms-5.14.21-150500.55.19.1, kernel-default-base-5.14.21-150500.55.19.1.150500.6.6.4, kernel-livepatch-SLE15-SP5_Update_3-1-150500.11.3.4, kernel-source-5.14.21-150500.55.19.1, kernel-obs-qa-5.14.21-150500.55.19.1, kernel-obs-build-5.14.21-150500.55.19.1 Basesystem Module 15-SP5 (src): kernel-default-base-5.14.21-150500.55.19.1.150500.6.6.4, kernel-source-5.14.21-150500.55.19.1 Development Tools Module 15-SP5 (src): kernel-obs-build-5.14.21-150500.55.19.1, kernel-syms-5.14.21-150500.55.19.1, kernel-source-5.14.21-150500.55.19.1 SUSE Linux Enterprise Live Patching 15-SP5 (src): kernel-livepatch-SLE15-SP5_Update_3-1-150500.11.3.4 NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1213578 https://bugzilla.suse.com/show_bug.cgi?id=1213578#c35 --- Comment #35 from Maintenance Automation <maint-coord+maintenance-robot@suse.de> --- SUSE-SU-2023:3376-1: An update that solves 15 vulnerabilities and has 27 fixes can now be installed. Category: security (important) Bug References: 1206418, 1207129, 1207948, 1210627, 1210780, 1210825, 1211131, 1211738, 1211811, 1212445, 1212502, 1212604, 1212766, 1212901, 1213167, 1213272, 1213287, 1213304, 1213417, 1213578, 1213585, 1213586, 1213588, 1213601, 1213620, 1213632, 1213653, 1213713, 1213715, 1213747, 1213756, 1213759, 1213777, 1213810, 1213812, 1213856, 1213857, 1213863, 1213867, 1213870, 1213871, 1213872 CVE References: CVE-2022-40982, CVE-2023-0459, CVE-2023-20569, CVE-2023-21400, CVE-2023-2156, CVE-2023-2166, CVE-2023-31083, CVE-2023-3268, CVE-2023-3567, CVE-2023-3609, CVE-2023-3611, CVE-2023-3776, CVE-2023-38409, CVE-2023-3863, CVE-2023-4004 Sources used: openSUSE Leap 15.5 (src): kernel-syms-azure-5.14.21-150500.33.14.1, kernel-source-azure-5.14.21-150500.33.14.1 Public Cloud Module 15-SP5 (src): kernel-syms-azure-5.14.21-150500.33.14.1, kernel-source-azure-5.14.21-150500.33.14.1 NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination. -- You are receiving this mail because: You are on the CC list for the bug.