[Bug 1136293] New: Getting complete system lockups since last week
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293

Bug ID: 1136293
Summary: Getting complete system lockups since last week
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Other
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: samueldgv@pm.me
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

I have been getting complete system freezes starting a week and a half or so ago (I would assume starting from the 2019-05-16 or 2019-05-17 snapshot).

The system freezes with a static color filling the whole screen, and the audio loops and cuts out a few seconds after it happens. It has been happening while playing OpenGL and Vulkan games, but there has also been an occasion where it happened when I was just using wxMaxima on the desktop.

This makes me think that there is a Mesa bug, but so far I have not been able to get any information. I have checked both the individual application logs and the system logs and there is nothing whatsoever. I have also enabled Kdump with YaST and it has not triggered even after three occurrences so far. I'm open to suggestions on how to catch some more information next time. Maybe there are more people affected?

I'm currently running kernel 5.1.3-1-default and Mesa 19.0.4 with no additional repositories enabled besides Packman codecs, on a Ryzen 2600X and an AMD Vega 56.

--
You are receiving this mail because:
You are on the CC list for the bug.
Sam G
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c1
Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c2
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c3
--- Comment #3 from Sam G
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c4
Sam G
Please attach at least your /var/log/Xorg.0.log.
I have attached that one and the .old one. The crash forces me to power off the computer (no SysRq keys work, nothing), so the log seems to be overwritten when I log in again. This time I left a game open in order to trigger a freeze, which happened today, 29/05/2019, at approx. 22:00. Note that the log seems to say nothing, and I don't know how I could catch it.
(In reply to Bernhard Wiedemann from comment #1)
Do you use the open source AMD graphics drivers or did you install proprietary ones?
Open drivers. I have never installed the proprietary drivers.
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c5
--- Comment #5 from Sam G
Stefan Dirsch
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c6
--- Comment #6 from Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c7
--- Comment #7 from Sam G
I'm afraid this is a kernel regression.
Oh well, thanks! It makes more sense now, since Mesa had not really been updated except for a bugfix release. Is there any info on that regression that I can follow?
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c8
Bernhard Wiedemann
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c9
--- Comment #9 from Sam G
If it crashes the kernel, but cannot write traces to disk, you could try https://www.suse.com/c/netconsole-howto-send-kernel-boot-messages-over-ethernet/ if you have wired ethernet and another machine reachable. Sometimes that gives a nice backtrace that will help narrow down the source of the problem.
Thanks! I will try that. Incidentally, I also found an issue logged with freedesktop for this. It indeed seems to be a problem for more people on different distros, not only Tumbleweed, and it is related to amdgpu. It also keeps happening on kernel 5.1.5-1-default after the latest updates. Should I close this ticket or keep it open for keeping track of the original issue? The freedesktop issue is at https://bugs.freedesktop.org/show_bug.cgi?id=109955
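For the "another machine reachable" half of the netconsole suggestion quoted above, the receiving side only has to listen for UDP datagrams and save them. The following is a minimal sketch, not part of the linked SUSE how-to: it assumes the crashing machine has already been configured to send netconsole output to this host on UDP port 6666 (netconsole's default target port). The function names `open_receiver` and `log_packets` and the file name `netconsole.log` are made up for illustration.

```python
import socket
from datetime import datetime


def open_receiver(host="0.0.0.0", port=6666):
    """Bind a UDP socket. 6666 is assumed to be the port the sender targets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def log_packets(sock, log_path, max_packets=None):
    """Append every received datagram to log_path and return how many were seen.

    max_packets=None keeps listening until interrupted; a number stops
    after that many datagrams.
    """
    count = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while max_packets is None or count < max_packets:
            data, (sender, _sender_port) = sock.recvfrom(65535)
            text = data.decode("utf-8", errors="replace").rstrip("\n")
            stamp = datetime.now().isoformat(timespec="seconds")
            log.write(f"{stamp} {sender} {text}\n")
            # Flush after each message so the file is current even if the
            # sending machine locks up in the middle of a trace.
            log.flush()
            count += 1
    return count


# Typical use on the second machine (stop with Ctrl+C):
#   receiver = open_receiver()
#   log_packets(receiver, "netconsole.log")  # hypothetical file name
```

Flushing after every datagram is the point of the design: with a hard lockup, the last lines before the freeze are the ones that matter.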
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c13
--- Comment #13 from Sam G
And yet, I pushed a RADV fix to our kernel, so could you also check whether kernel from Kernel:stable works?
https://download.opensuse.org/repositories/Kernel:/stable/standard/x86_64/
No luck, though I have good news too, I guess. I have tested again on that kernel with both an OpenGL game (Pillars of Eternity, which for me is the easiest one to reproduce freezes on) and a couple of Vulkan games (Surviving Mars via DXVK and a Total War game via Vulkan, which I guess counts as DXVK too...).
On the Vulkan side, I could not reproduce any freeze, though the freezes on Vulkan have been very few, especially lately.
On Pillars of Eternity (OpenGL) I got a crash after 15 minutes of testing and another one just now. I could get a trace of the second one since it magically didn't freeze the whole system: it killed the game and seemed to break the whole Plasma and X sessions, which were unresponsive, but other TTYs were usable. The trace is attached here and also in the freedesktop case, and it seems to be similar to the ones from other affected users. On the other occasions, getting dmesg over the network didn't give anything.
Over the next few days I will test the latest Kernel:stable you pointed me to more thoroughly on Vulkan games, and then the 5.0.x one.
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c14
--- Comment #14 from Sam G
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c15
Sam G
(In reply to Sam G from comment #9)
(In reply to Bernhard Wiedemann from comment #8)
If it crashes the kernel, but cannot write traces to disk, you could try https://www.suse.com/c/netconsole-howto-send-kernel-boot-messages-over-ethernet/ if you have wired ethernet and another machine reachable. Sometimes that gives a nice backtrace that will help narrow down the source of the problem.
Thanks! I will try that.
That would definitely help a lot, if you can grab some stack traces.
Other than that, I am afraid someone has to bisect the kernel: https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html
I assume 5.0.x worked and 5.1.x does not. You can also install 5.0.x from: https://download.opensuse.org/repositories/home:/tiwai:/kernel:/5.0/standard... x86_64/ to confirm this theory.
Should I close this ticket or keep it open for keeping track of the original issue?
You can keep it open and update it once upstream has some news.
FWIW, I have found that the only kernel not causing freezes and playing nicely on OpenGL is kernel-default-4.20.13-1.1, which I pulled from https://download.opensuse.org/repositories/home:/tiwai:/kernel:/4.20/standar...
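The bisection suggested in the quoted text above boils down to a binary search between a known-good and a known-bad kernel. The kernel documentation linked there describes doing this with git bisect on commits; the sketch below only illustrates the search idea on an ordered list of builds. It assumes the list runs from oldest to newest, the first entry is known good, the last is known bad, and the behaviour flips exactly once. `first_bad`, `is_bad` and the build labels are hypothetical; in practice `is_bad` would be a manual step (boot that build and play until it either freezes or survives long enough).

```python
def first_bad(builds, is_bad):
    """Return the earliest entry of `builds` for which is_bad() is true.

    Assumes builds[0] is known good and builds[-1] is known bad, so
    neither endpoint is tested again.
    """
    lo, hi = 0, len(builds) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return builds[hi]


# Placeholder labels, not real kernel versions.
builds = [f"build-{n}" for n in range(8)]
tested = []


def pretend_test(build):
    """Stand-in for a manual test; pretends builds 5 and later freeze."""
    tested.append(build)
    return int(build.split("-")[1]) >= 5


culprit = first_bad(builds, pretend_test)
```

Each round halves the remaining range, so the number of builds that need a real test session grows only with the logarithm of the range. That matters here, since a single test can take a long gaming session before a freeze shows up.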
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c16
Shawn Peterson
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c17
--- Comment #17 from Shawn Peterson
http://bugzilla.opensuse.org/show_bug.cgi?id=1136293#c19
--- Comment #19 from Sam G
Bernhard Wiedemann