[Bug 1174868] New: Upgrade to LEAP 15.2 makes Thinkpad 480s sometimes completely freezes, 20°C warmer and journal is trimmed
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868

            Bug ID: 1174868
           Summary: Upgrade to LEAP 15.2 makes Thinkpad 480s sometimes completely freezes, 20°C warmer and journal is trimmed
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.2
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: markus.zimmermann@symflower.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

One of our machines here is a Thinkpad T480s on openSUSE 15.2 with kernel 5.3.18-lp152.33-default. This machine was upgraded from LEAP 15.1 to 15.2, and then the problems started to show.

The CPU is constantly about 20°C hotter while working.

The machine now sometimes freezes completely. When that happens:
- it is not pingable
- no keyboard interaction works
- lid close/open does nothing
- no mouse interaction works
- the notebook is really warm
- the journal is trimmed as well, e.g. we lose the whole journal for the day

We SSH'ed into the machine and watched via "htop", "journalctl -f" and "sensors". When the machine freezes, htop shows no significant load, there are no interesting logs in the journal, and sensors only shows that the machine is hotter.

Interestingly, we upgraded another machine before (also a Thinkpad T480s) and it does not have any problems.

We are, by the way, on the latest firmware, installed using fwupdmgr.

What should we try? How can we debug it?

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c2

--- Comment #2 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
I now have access to the machine in question again. We installed the kernel from http://download.opensuse.org/repositories/Kernel:/openSUSE-15.1/standard/ on that machine. http://download.opensuse.org/repositories/Kernel:/openSUSE-15.1:/Update/stan... does not exist?

We will test the kernel this week and report at the end of the week how it went.

By the way, we have another T480s with a fresh 15.2 install (no upgrade), which has now had 3 freezes in 2 weeks.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c4

Markus Zimmermann <markus.zimmermann@symflower.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(markus.zimmermann |
                   |@symflower.com)             |

--- Comment #4 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
We have tested the older kernel: it still freezes. I will add "hwinfo" and "dmesg" outputs to this issue after this comment.

We now have (on SUSE 15.2):
- 1 Thinkpad T480s (with old and new kernel) that freezes regularly. I also cleaned it, so there is no dust and the cooling system looks good to me. This is, by the way, the only notebook that has a higher temperature.
- 1 Thinkpad T480s that freezes maybe every two days
- 1 Thinkpad T480s that has frozen once since I reported this bug
- 2 Thinkpad T480s that never froze

What bugs me the most is that "journalctl --since today" is cut off. Do you know why this could happen?

Since downgrading to SUSE 15.1 is not really an option either (EOL is coming and we have to migrate at some point), I am as always willing to test anything, because the alternative is to switch maybe to Tumbleweed or away completely.

Can you point me to the "i915" issue or to a repository that we should try?
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c5

--- Comment #5 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
Created attachment 840542
  --> http://bugzilla.opensuse.org/attachment.cgi?id=840542&action=edit
Dmesg for current Kernel of 15.1 running on a 15.2 system

http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c6

--- Comment #6 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
Created attachment 840543
  --> http://bugzilla.opensuse.org/attachment.cgi?id=840543&action=edit
Dmesg for current Kernel of 15.2 running on a 15.2 system

http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c7

--- Comment #7 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
Created attachment 840544
  --> http://bugzilla.opensuse.org/attachment.cgi?id=840544&action=edit
Hwinfo for current Kernel of 15.1 running on a 15.2 system

http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c8

--- Comment #8 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
Created attachment 840545
  --> http://bugzilla.opensuse.org/attachment.cgi?id=840545&action=edit
Hwinfo for current Kernel of 15.2 running on a 15.2 system
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c10

--- Comment #10 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
The freezes on one machine went away after disabling power management; at least there were no freezes for two days. But the original machine still freezes with both the old SUSE 15.1 kernel and the newest 15.2 kernel.

We are now testing http://download.opensuse.org/repositories/home:/tiwai:/bsc1174737-leap2/stan... to see if that makes any difference.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c12

--- Comment #12 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
With the kernel from http://download.opensuse.org/repositories/home:/tiwai:/bsc1174737-leap2/stan... we have 0 (ZERO!) problems. So this is definitely a kernel issue.

How should we proceed with this issue?
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c13

--- Comment #13 from Takashi Iwai <tiwai@suse.com> ---
Then the problem should have been already solved in the latest (or the upcoming) update.

To verify it, please test the kernel in OBS Kernel:openSUSE-15.2 repo:
  http://download.opensuse.org/repositories/Kernel:/openSUSE-15.2/standard/
This contains the build from the latest git branch.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c14

--- Comment #14 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
OK, we installed "5.3.18-lp152.105" on one machine now. If it works on that one, we will try it on the others and I will let you know.

In case this works: how long does an update stay in http://download.opensuse.org/repositories/Kernel:/openSUSE-15.2/standard/ before it is regularly available?

Also, thanks.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c15

--- Comment #15 from Takashi Iwai <tiwai@suse.com> ---
The next update is planned in the next week or so. Let's cross fingers.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c16

--- Comment #16 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
I have two pieces of bad news:
- The latest kernel from http://download.opensuse.org/repositories/Kernel:/openSUSE-15.2/standard/ does not work. We had two freezes yesterday and two already today.
- I mentioned the wrong kernel. We have zero problems with http://download.opensuse.org/repositories/home:/tiwai:/kernel:/5.7/standard/, i.e. with the 5.7 kernel, which I took from https://bugzilla.suse.com/show_bug.cgi?id=1174737.

Sorry for the confusion.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c17

--- Comment #17 from Takashi Iwai <tiwai@suse.com> ---
Could you get any trace of the crash somehow? Otherwise it's quite difficult to diagnose which part went wrong.

You might be able to catch the crash via kdump, if we're lucky. But it often does not work reliably in a case like a complete hardware freeze.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sndirsch@suse.com
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c19

--- Comment #19 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
Unfortunately we never had any trace at all. Even viewing the logs live did not show anything. The journal is always just gone completely. Even yesterday's data is removed.

Since I have never used kdump: can you recommend a tutorial or documentation? Is this a good starting point? https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
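Editorial sketch, not something proposed in the thread: since the journal is lost on every freeze, one way to keep some record might be a small logger that appends temperature and load samples to its own file and forces each line to disk. The Python below is a minimal illustration under stated assumptions. The function names and the log path are hypothetical, and it assumes the kernel exposes thermal zones under /sys/class/thermal with values in millidegrees Celsius. Whether the last samples before a hard freeze actually survive depends on the filesystem and hardware.

```python
import glob
import os
import time


def read_temps(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees Celsius} for every readable thermal zone.

    Assumes the sysfs files hold millidegrees Celsius; zones that cannot
    be read or parsed are skipped.
    """
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue
    return temps


def append_sample(log_path, temps, now=None):
    """Append one timestamped line and fsync it before returning."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    # Label each value with its zone directory name, e.g. thermal_zone0.
    fields = " ".join(
        f"{os.path.basename(os.path.dirname(p))}={t:.1f}" for p, t in temps.items()
    )
    line = f"{stamp} load1={os.getloadavg()[0]:.2f} {fields}\n"
    with open(log_path, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to disk now
    return line


def watch(log_path, interval=5.0, samples=None):
    """Log every 'interval' seconds, forever or 'samples' times if given."""
    count = 0
    while samples is None or count < samples:
        append_sample(log_path, read_temps())
        count += 1
        time.sleep(interval)
```

It could be started with something like `watch("/var/log/freeze-watch.log")` (a hypothetical path that needs write permission) and left running until the next freeze. The last lines of the file would then show the temperature and load trend just before the machine stopped responding.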
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c20

--- Comment #20 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
If you have any commands that we should run so you have more info, please let me know. I still do not know what the difference between our Thinkpad T480s machines is, or why the Thinkpad T490 ones are not affected.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c21

--- Comment #21 from Takashi Iwai <tiwai@suse.com> ---
(In reply to Markus Zimmermann from comment #19)
> Since I never used kdump: can you recommend a tutorial/documentation?

https://doc.opensuse.org/documentation/leap/tuning/html/book.sle.tuning/cha-...

Better to disable Secure Boot for enabling this feature, too.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c22

--- Comment #22 from Takashi Iwai <tiwai@suse.com> ---
(In reply to Markus Zimmermann from comment #20)
> If you have any commands that we should run so you have more info, please
> let me know. I still do not know what the difference between our Thinkpad
> T480s is and why the Thinkpad T490 ones are not affected.

FYI, we've had a report about T490 crash (also a hard one without trace), too, so it can't be excluded. But T490 has a totally different chip set, AFAIK, hence it's no wonder that the problem may hit only on T480.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c26

--- Comment #26 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
We have been using http://download.opensuse.org/repositories/home:/tiwai:/kernel:/5.7/standard/ for a long time now, and it is pretty outdated at this point, I guess, but it is the kernel that has no problems.

I gave this issue another go because I moved from Tumbleweed to 15.2, since Tumbleweed suddenly started to have IO timeouts inside VirtualBox VMs. This broke my workflow completely; I couldn't work anymore. So I am now on 15.2 with everyone else, which still has these freezes with the latest kernel.
http://bugzilla.opensuse.org/show_bug.cgi?id=1174868#c31

Markus Zimmermann <markus.zimmermann@symflower.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |NORESPONSE

--- Comment #31 from Markus Zimmermann <markus.zimmermann@symflower.com> ---
AFAIK this is no longer an issue with 15.3. However, I wonder whether this is due to our use of https://github.com/erpalma/throttled, which has been a default in our development environment for a long time. Without it we cannot use the full potential of our CPUs, which is just sad.