http://bugzilla.suse.com/show_bug.cgi?id=1013563
Bug ID: 1013563
Summary: dell 7470 - cpu gets throttled a lot
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: tchvatal@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---
With normal usage (on a table, browsing + ssh) the machine throttles quite often:
[39814.739914] mei_wdt mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:02: Could not register event ret=-22
[39814.741816] mei_wdt: probe of mei::05b79a6f-4628-4d7f-899d-a91514cb32ab:02 failed with error -22
[39867.021572] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021573] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021574] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021575] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021576] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021580] mce_notify_irq: 1 callbacks suppressed
[39867.021581] mce: [Hardware Error]: Machine check events logged
[39867.021583] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021585] mce: [Hardware Error]: Machine check events logged
[39867.022565] CPU1: Core temperature/speed normal
[39867.022566] CPU3: Core temperature/speed normal
[39867.022567] CPU0: Package temperature/speed normal
[39867.022567] CPU2: Package temperature/speed normal
[39867.022568] CPU3: Package temperature/speed normal
[39867.022569] CPU1: Package temperature/speed normal
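A kernel log like the one above can be summarized per CPU with a few lines of Python. This is only a minimal sketch: the regular expression is written against the message format shown in this report, and the function name `summarize_throttling` is made up for illustration. Other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Matches messages of the form seen in this report, e.g.
# "CPU3: Core temperature above threshold, cpu clock throttled (total events = 1355)"
THROTTLE_RE = re.compile(
    r"CPU(?P<cpu>\d+): (?P<scope>Core|Package) temperature above threshold, "
    r"cpu clock throttled \(total events = (?P<total>\d+)\)"
)


def summarize_throttling(log_text):
    """Count throttle messages per (cpu, scope) and keep the last reported total."""
    seen = Counter()
    last_total = {}
    for match in THROTTLE_RE.finditer(log_text):
        key = (int(match["cpu"]), match["scope"])
        seen[key] += 1
        last_total[key] = int(match["total"])
    return seen, last_total


# Three lines copied from the log in this report.
sample = """\
[39867.021572] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1355)
[39867.021574] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1355)
[39867.022565] CPU1: Core temperature/speed normal
"""
seen, last_total = summarize_throttling(sample)
for (cpu, scope), count in sorted(seen.items()):
    print(f"CPU{cpu} {scope}: {count} message(s), total events = {last_total[(cpu, scope)]}")
```

Feeding it the output of `dmesg` would give a rough picture of which cores are hit and how quickly the event counter grows.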
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c1
Takashi Iwai tiwai@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |tchvatal@suse.com, tiwai@suse.com
Flags   |        |needinfo?(tchvatal@suse.com)
--- Comment #1 from Takashi Iwai tiwai@suse.com --- I haven't noticed it while using openSUSE-42.2 (but on a different machine, a Dell E7270). Could you check whether it happens with the openSUSE-42.2 kernel?
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c2
Tomáš Chvátal tchvatal@suse.com changed:
What    |Removed                      |Added
----------------------------------------------------------------------------
Flags   |needinfo?(tchvatal@suse.com) |
--- Comment #2 from Tomáš Chvátal tchvatal@suse.com --- Happens with 4.4.35 too, but not as often. This time I had to actually stress the machine by playing video (and even then, I don't consider watching a Full HD movie to be particularly demanding).
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c3
--- Comment #3 from Tomáš Chvátal tchvatal@suse.com --- Also updated the BIOS to the latest version (had 1.6, now 1.10) and tried kernel 4.9-rc7.
Still present there.
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c4
--- Comment #4 from Takashi Iwai tiwai@suse.com --- (In reply to Tomáš Chvátal from comment #3)
> Also updated the BIOS to the latest version (had 1.6, now 1.10) and tried kernel 4.9-rc7.
> Still present there.
Care to report to upstream, e.g. bugzilla.kernel.org? Feel free to put me (tiwai@suse.de) in Cc.
http://bugzilla.suse.com/show_bug.cgi?id=1013563
Tomáš Chvátal tchvatal@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
URL     |        |https://bugzilla.kernel.org/show_bug.cgi?id=189711
http://bugzilla.suse.com/show_bug.cgi?id=1013563
Takashi Iwai tiwai@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
Flags   |        |needinfo?(tchvatal@suse.com)
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c5
Tomáš Chvátal tchvatal@suse.com changed:
What    |Removed                      |Added
----------------------------------------------------------------------------
Flags   |needinfo?(tchvatal@suse.com) |
--- Comment #5 from Tomáš Chvátal tchvatal@suse.com --- Well I did that https://bugzilla.kernel.org/show_bug.cgi?id=189711 :)
If you need something else just ask or I can show the lappy next week when I am in NUE.
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c6
Takashi Iwai tiwai@suse.com changed:
What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
See Also   |        |https://bugzilla.kernel.org/show_bug.cgi?id=189711
Resolution |---     |UPSTREAM
--- Comment #6 from Takashi Iwai tiwai@suse.com --- That's fine, then let's track the issue in the upstream bugzilla. Thanks.
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c7
Tomáš Chvátal tchvatal@suse.com changed:
What       |Removed  |Added
----------------------------------------------------------------------------
Status     |RESOLVED |REOPENED
CC         |         |mmarek@suse.com
Resolution |UPSTREAM |---
--- Comment #7 from Tomáš Chvátal tchvatal@suse.com --- I am reopening this because I discovered that if I install thermald and enable the service on the laptop, the MCE errors disappear. Probably thermald sorts out the throttling and other issues.
I would say we should enable this service by default on new CPUs and probably install it in the laptop pattern.
Adding michal to cc per his request :)
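One way to check whether enabling thermald actually reduces throttling is to compare the kernel's per-CPU throttle counters before and after a workload. The sketch below assumes the x86 `thermal_throttle` sysfs files (`core_throttle_count` and `package_throttle_count` under `/sys/devices/system/cpu/cpuN/thermal_throttle/`), which may be absent on some kernels or CPUs; the helper names are made up for illustration.

```python
from pathlib import Path

COUNTER_NAMES = ("core_throttle_count", "package_throttle_count")


def read_throttle_counts(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {counter_name: value}} for every CPU exposing the counters.

    The sysfs layout is an assumption; missing or unreadable files are skipped.
    """
    counts = {}
    for cpu_dir in sorted(Path(cpu_root).glob("cpu[0-9]*")):
        entry = {}
        for name in COUNTER_NAMES:
            counter = cpu_dir / "thermal_throttle" / name
            try:
                entry[name] = int(counter.read_text().strip())
            except (OSError, ValueError):
                continue
        if entry:
            counts[cpu_dir.name] = entry
    return counts


def throttle_delta(before, after):
    """Return how much each counter grew between two snapshots."""
    delta = {}
    for cpu, counters in after.items():
        for name, value in counters.items():
            delta[(cpu, name)] = value - before.get(cpu, {}).get(name, 0)
    return delta
```

Taking one snapshot with `read_throttle_counts()`, running the same workload with and without thermald, and comparing the results of `throttle_delta()` would show whether the service makes a measurable difference on a given machine.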
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c8
Takashi Iwai tiwai@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |fvogt@suse.com, trenn@suse.com
--- Comment #8 from Takashi Iwai tiwai@suse.com --- We can add Supplements to thermald specifying the CPU IDs that require thermald, too, instead. Adding thermald maintainers to Cc.
BTW, does this happen also with SLE12-SP2 kernel?
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c9
--- Comment #9 from Tomáš Chvátal tchvatal@suse.com --- (In reply to Takashi Iwai from comment #8)
> We can add Supplements to thermald specifying the CPU IDs that require thermald, too, instead. Adding thermald maintainers to Cc.
> BTW, does this happen also with SLE12-SP2 kernel?
Yes for the old kernel; I tried Leap first... :)
Technically, if we have a way to identify the CPU, then I am fine with it being Supplements. But we also need to change the default systemd presets to enable thermald by default if installed.
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c10
--- Comment #10 from Takashi Iwai tiwai@suse.com --- (In reply to Tomáš Chvátal from comment #9)
> (In reply to Takashi Iwai from comment #8)
> > We can add Supplements to thermald specifying the CPU IDs that require thermald, too, instead. Adding thermald maintainers to Cc.
> > BTW, does this happen also with SLE12-SP2 kernel?
> Yes for the old kernel; I tried Leap first... :)
Oh, then it's a problem since I'm not sure whether thermald is included in SLE[SD]12-SP2. Thomas?
> Technically, if we have a way to identify the CPU, then I am fine with it being Supplements. But we also need to change the default systemd presets to enable thermald by default if installed.
That's true. Let's leave the decision to thermald maintainers.
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c11
Tomáš Chvátal tchvatal@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
Flags   |        |needinfo?(fvogt@suse.com)
--- Comment #11 from Tomáš Chvátal tchvatal@suse.com --- (In reply to Takashi Iwai from comment #10)
> (In reply to Tomáš Chvátal from comment #9)
> > (In reply to Takashi Iwai from comment #8)
> > > We can add Supplements to thermald specifying the CPU IDs that require thermald, too, instead. Adding thermald maintainers to Cc.
> > > BTW, does this happen also with SLE12-SP2 kernel?
> > Yes for the old kernel; I tried Leap first... :)
> Oh, then it's a problem since I'm not sure whether thermald is included in SLE[SD]12-SP2. Thomas?
It is not present in SLES12 at the moment, only in Leap and TW.
> > Technically, if we have a way to identify the CPU, then I am fine with it being Supplements. But we also need to change the default systemd presets to enable thermald by default if installed.
> That's true. Let's leave the decision to thermald maintainers.
@thermald-maints: so what is your take, would you guys add those Supplements?
http://bugzilla.suse.com/show_bug.cgi?id=1013563
Jiri Slaby jslaby@suse.com changed:
What     |Removed                                   |Added
----------------------------------------------------------------------------
CC       |                                          |jslaby@suse.com
Assignee |kernel-maintainers@forge.provo.novell.com |fvogt@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=1013563
Jeffrey Cheung jcheung@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |jcheung@suse.com
http://bugzilla.suse.com/show_bug.cgi?id=1013563 http://bugzilla.suse.com/show_bug.cgi?id=1013563#c12
--- Comment #12 from Jeffrey Cheung jcheung@suse.com --- So, what is the final decision on the Supplements?
http://bugzilla.suse.com/show_bug.cgi?id=1013563
Stefan Dirsch sndirsch@suse.com changed:
What    |Removed         |Added
----------------------------------------------------------------------------
CC      |mmarek@suse.com |
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c15
Brice DEKANY brice.dekany@suse.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |brice.dekany@suse.com
--- Comment #15 from Brice DEKANY brice.dekany@suse.com --- Hi,
It looks like the lack of thermald by default on modern Intel CPUs makes SUSE distros less performant:
https://www.phoronix.com/scan.php?page=article&item=autumn-2021-tigerlak...
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c16
--- Comment #16 from Takashi Iwai tiwai@suse.com --- This seems to be completely forgotten / overlooked.
Fabian, Thomas, would you mind adding Supplements? e.g. either modalias cpu or PCI for boards.
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c17
Fabian Vogt fvogt@suse.com changed:
What    |Removed                   |Added
----------------------------------------------------------------------------
Flags   |needinfo?(fvogt@suse.com) |needinfo?
--- Comment #17 from Fabian Vogt fvogt@suse.com --- I vaguely remember some report(s) about thermald actually causing the device to overheat and do a hard shutdown. That's not too unlikely, considering its main purpose is to avoid throttling the CPU harder than necessary when the thermal limit is approached/reached.
The thermal limit shouldn't be reached under normal circumstances (like "browsing+ssh" in the original comment), and so thermald shouldn't be absolutely necessary with a properly designed cooling solution. Unfortunately many laptops do not have one, so thermald has a noticeable impact on performance on those. However, that comes at a cost: the device will run hotter.
Any opinions about this?
Supplements: (modalias(cpu:type%3Ax86*ven0000*) if kernel)
While there are more specific modaliases which could be used here (like some intel specific ACPI platform devices), those may change in the future and then the supplements would silently not work anymore.
Ideally it only matches physical hardware, but I don't think we have any capability for that (yet).
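To see what the proposed Supplements glob would select, it can be compared against a CPU modalias string. The sketch below is only an approximation: it uses Python's `fnmatchcase` as a stand-in for the package manager's own modalias matching, and the two sample modalias strings are illustrative values in the `cpu:type:x86,venXXXXfamXXXXmodXXXX:feature:...` form, not output captured from the laptop in this report. On a running system the real string is typically found in `/sys/devices/system/cpu/modalias`.

```python
from fnmatch import fnmatchcase
from urllib.parse import unquote


def supplements_matches(pattern, modalias):
    """Return True if a URL-escaped modalias glob matches a modalias string.

    fnmatchcase is an assumption standing in for the real matcher.
    """
    return fnmatchcase(modalias, unquote(pattern))


# Pattern from the proposal above; "%3A" is an escaped ":".
PATTERN = "cpu:type%3Ax86*ven0000*"

# Illustrative modalias strings (vendor 0000 = Intel, 0002 = AMD).
intel_like = "cpu:type:x86,ven0000fam0006mod004E:feature:,0000,0001"
amd_like = "cpu:type:x86,ven0002fam0017mod0001:feature:,0000,0001"

print(supplements_matches(PATTERN, intel_like))
print(supplements_matches(PATTERN, amd_like))
```

Because the glob only keys on the vendor field, it would also match Intel CPUs inside virtual machines, which is the limitation noted above about matching only physical hardware.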
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c18
Takashi Iwai tiwai@suse.com changed:
What    |Removed   |Added
----------------------------------------------------------------------------
Flags   |needinfo? |
--- Comment #18 from Takashi Iwai tiwai@suse.com --- (In reply to Fabian Vogt from comment #17)
> I vaguely remember some report(s) about thermald actually causing the device to overheat and do a hard shutdown. That's not too unlikely, considering its main purpose is to avoid throttling the CPU harder than necessary when the thermal limit is approached/reached.
> The thermal limit shouldn't be reached under normal circumstances (like "browsing+ssh" in the original comment), and so thermald shouldn't be absolutely necessary with a properly designed cooling solution. Unfortunately many laptops do not have one, so thermald has a noticeable impact on performance on those. However, that comes at a cost: the device will run hotter.
Yes. It's good to give users a choice :)
> Any opinions about this?
> Supplements: (modalias(cpu:type%3Ax86*ven0000*) if kernel)
I guess this would work as a first shot; that's similar to ucode-intel.
> While there are more specific modaliases which could be used here (like some intel specific ACPI platform devices), those may change in the future and then the supplements would silently not work anymore.
Agreed.
> Ideally it only matches physical hardware, but I don't think we have any capability for that (yet).
And, thermald won't be enabled / activated by the package installation alone, right?
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c19
--- Comment #19 from Thomas Renninger trenn@suse.com --- Sorry to hijack this one with another issue: I have this bug lingering around for quite some time.
ucode-amd is recommended on Intel CPU systems https://bugzilla.suse.com/show_bug.cgi?id=1158704
From Kernel:HEAD kernel-firmware:
%package -n ucode-amd
...
# new style (after 3.12 kernel somewhen)
Supplements: modalias(cpu:type%%3Ax86*ven0002*)
# old style (before 3.16 kernel)
Supplements: modalias(x86cpu:vendor%%3A0002%%3Afamily%%3A*%%3Amodel%%3A*%%3Afeature%%3A*)
any ideas or seeing something obvious ;)
https://bugzilla.suse.com/show_bug.cgi?id=1013563 https://bugzilla.suse.com/show_bug.cgi?id=1013563#c20
--- Comment #20 from Takashi Iwai tiwai@suse.com --- (In reply to Thomas Renninger from comment #19)
> Sorry to hijack this one with another issue: I have this bug lingering around for quite some time.
> ucode-amd is recommended on Intel CPU systems https://bugzilla.suse.com/show_bug.cgi?id=1158704
> From Kernel:HEAD kernel-firmware:
> %package -n ucode-amd
> ...
> # new style (after 3.12 kernel somewhen)
> Supplements: modalias(cpu:type%%3Ax86*ven0002*)
> # old style (before 3.16 kernel)
> Supplements: modalias(x86cpu:vendor%%3A0002%%3Afamily%%3A*%%3Amodel%%3A*%%3Afeature%%3A*)
> any ideas or seeing something obvious ;)
The Supplements is fine, but it's pulled just because of Recommends in the patterns-base-enhanced_base (as already suggested in bug 1158704). You may reassign the bug to Component patterns if it's really superfluous.
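The point that the ucode-amd Supplements are not what pulls the package in on Intel systems can be sanity-checked by decoding the spec-file patterns and testing them against Intel-style modalias strings. This is a rough sketch under stated assumptions: `%%` is treated as the spec-file escape for a literal `%`, `%3A` as an escaped `:`, `fnmatchcase` stands in for the real matcher, and both sample modalias strings are illustrative rather than taken from a real machine.

```python
from fnmatch import fnmatchcase
from urllib.parse import unquote


def decode_spec_pattern(spec_pattern):
    """Undo spec-file '%%' escaping, then URL-style '%3A' escaping."""
    return unquote(spec_pattern.replace("%%", "%"))


# Patterns as they appear in the kernel-firmware spec excerpt quoted above.
NEW_STYLE = "cpu:type%%3Ax86*ven0002*"
OLD_STYLE = "x86cpu:vendor%%3A0002%%3Afamily%%3A*%%3Amodel%%3A*%%3Afeature%%3A*"

# Illustrative Intel-style modalias strings (vendor 0000), in both formats.
intel_new = "cpu:type:x86,ven0000fam0006mod004E:feature:,0000,0001"
intel_old = "x86cpu:vendor:0000:family:0006:model:004E:feature:,0000,0001"

for spec_pattern, modalias in ((NEW_STYLE, intel_new), (OLD_STYLE, intel_old)):
    decoded = decode_spec_pattern(spec_pattern)
    print(decoded, "->", fnmatchcase(modalias, decoded))
```

Neither decoded pattern matches a vendor 0000 string, which is consistent with the explanation that the package arrives through a Recommends in the pattern rather than through these Supplements.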