We need a CPU Thermal Ladder
You are mixing things up here. This has nothing to do with the cpuidle ladder
governor. What you want is a passive trip point that can be set if the
OEM/BIOS does not provide one.
Quick introduction to passive cooling:
The ACPI thermal tables define several kinds of trip points, each tied to a
temperature: active (there can be more than one), passive, hot and critical.
active -> different fans or fan states are switched on
https://bugzilla.novell.com/show_bug.cgi?id=557586
https://bugzilla.novell.com/show_bug.cgi?id=557586#c38
--- Comment #38 from Thomas Renninger 2010-12-21 21:59:55 UTC ---
Phew, quite a lot of info mixed together. I won't go into too much detail, but
I'll try to answer or explain some of the topics you mention:
GPU
---
Interesting topic. The GPU is the second biggest power waster. Depending on
your graphics driver and HW, your GPU may or may not do some power saving.
On-board ATI graphics in particular can consume quite a lot of energy. The
latest open source KMS driver may do some power saving, but it is worth
double-checking with fglrx, which may be able to do better.
CPU
---
Throttling
.........
A technique which is much less efficient than cpufreq. Intel uses it as a last
resort to avoid thermal shutdowns. You never want to have this enabled if you
have cpufreq scaling (aka PowerNow!).
Powernow!
........
Should always be enabled. Yes, there were issues, but there have been
significant improvements over the past years. Namely:
- At first the userspace governor was used, checking CPU load every 333ms.
- Then the ondemand (kernel) governor was used, but on AMD systems the
  interval between CPU load checks went up to about 1.2 seconds in the worst
  case.
- With the latest kernel on quad cores, CPU load should get checked every
  10ms in the kernel.
- The very latest kernels count IO wait time as CPU load time, i.e. "CPU is
  utilized" time (this should be available on 11.4). This may give you a small
  improvement on heavy disk work.
Performance loss is nearly zero. I doubt you will find a workload that shows
more than 2% performance loss, even if you try really hard to scale the CPU
utilization up and down all the time.
Anyway, if possible, please switch powernow-k8 on.
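To check whether cpufreq scaling is actually active, the per-CPU cpufreq
directory in sysfs can be read. The following is only a minimal sketch: the
function name cpufreq_status is made up for illustration, and it assumes the
usual layout /sys/devices/system/cpu/cpuN/cpufreq with the files
scaling_driver and scaling_governor. If no cpufreq driver (e.g. powernow-k8)
is loaded, those directories will most likely be missing and the result is
empty.

```python
from pathlib import Path


def _read(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def cpufreq_status(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU name to (scaling_driver, scaling_governor).

    Hypothetical helper; cpu_root is a parameter so that the function can
    be pointed at a copy of the sysfs tree.
    """
    status = {}
    for cpufreq_dir in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        cpu_name = cpufreq_dir.parent.name
        status[cpu_name] = (
            _read(cpufreq_dir / "scaling_driver"),
            _read(cpufreq_dir / "scaling_governor"),
        )
    return status


if __name__ == "__main__":
    for cpu, (driver, governor) in cpufreq_status().items():
        print(f"{cpu}: driver={driver} governor={governor}")
```

On a machine where PowerNow! is enabled one would expect to see powernow-k8
as the driver for every CPU listed.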
passive -> first try to limit cpufreq; if that is not available, try throttling
critical -> shut the machine down
These are defined via ACPI, and there were bugs (and probably still are) in
the BIOS or in the kernel.
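The mapping from trip point type to action can be sketched as below. This is
an illustration only, not how the kernel implements it: the function name
cooling_actions and the example temperatures are made up, and the "hot" trip
point is left without an action because none is described here.

```python
# Actions per trip point type, as described in the text above.
ACTIONS = {
    "active": "switch on a fan or a higher fan state",
    "passive": "limit cpufreq; fall back to throttling if unavailable",
    "critical": "shut the machine down",
}


def cooling_actions(temp_c, trips):
    """Return (type, action) for every trip point that temp_c has reached.

    trips is a list of (type, threshold_in_celsius) tuples. The result is
    ordered by ascending threshold. Hypothetical helper for illustration.
    """
    reached = sorted(
        (threshold, kind) for kind, threshold in trips if temp_c >= threshold
    )
    return [
        (kind, ACTIONS.get(kind, "no action described here"))
        for _threshold, kind in reached
    ]


if __name__ == "__main__":
    # Example thresholds are invented, not taken from any real BIOS.
    example_trips = [("active", 50), ("passive", 80), ("critical", 100)]
    for kind, action in cooling_actions(85, example_trips):
        print(f"{kind}: {action}")
```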
Which trip points your BIOS exports to the OS can be found here (deprecated in
11.4):
cat /proc/acpi/thermal_zone/THRM/trip_points
In recent kernels you have to gather this info from sysfs:
/sys/devices/virtual/thermal
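A small script can collect the trip points and the current temperature from
that sysfs directory. This is a sketch under assumptions: it expects the usual
thermal_zoneN subdirectories containing temp, trip_point_N_type and
trip_point_N_temp, with temperatures in millidegrees Celsius. The function
names are made up for illustration, and zones or files that are missing on
your platform are simply skipped.

```python
from pathlib import Path


def _read_celsius(path):
    """Read a millidegree value and return degrees Celsius, or None."""
    try:
        return int(path.read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None


def read_thermal_zones(thermal_root="/sys/devices/virtual/thermal"):
    """Return {zone: {"temp_c": float or None, "trips": [(type, celsius)]}}.

    Hypothetical helper; thermal_root is a parameter so the function can be
    run against a copy of the sysfs tree.
    """
    zones = {}
    for zone in sorted(Path(thermal_root).glob("thermal_zone*")):
        trips = []
        for type_file in sorted(zone.glob("trip_point_*_type")):
            # trip_point_N_type is paired with trip_point_N_temp
            temp_file = zone / type_file.name.replace("_type", "_temp")
            threshold = _read_celsius(temp_file)
            if threshold is None:
                continue
            trips.append((type_file.read_text().strip(), threshold))
        zones[zone.name] = {
            "temp_c": _read_celsius(zone / "temp"),
            "trips": trips,
        }
    return zones


if __name__ == "__main__":
    for name, info in read_thermal_zones().items():
        print(f"{name}: current {info['temp_c']} C")
        for kind, threshold in info["trips"]:
            print(f"  {kind}: {threshold} C")
```

If no passive entry shows up in the output, the BIOS most likely does not
export a passive trip point for that zone.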
I remember one bug I have seen in several BIOSes which supported dual-core AMD
CPUs and were later extended to support socket-compatible quad-core AMD CPUs:
a passive trip point is tied to a CPU. The CPU object's ACPI name got renamed,
but the vendors forgot to change it in the passive trip point definition.
I submitted a workaround which assigns all CPUs (this should always be the
intent) to this passive trip point if the referenced CPU cannot be resolved
(wrong or non-existent name).
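The idea of that workaround can be illustrated with a few lines of Python.
This is not the kernel code, only a sketch of the fallback logic; the function
name passive_trip_cpus and the ACPI object names in the example are
hypothetical.

```python
def passive_trip_cpus(referenced_names, known_cpus):
    """Pick the CPUs that a passive trip point should act on.

    referenced_names: CPU object names listed in the trip point definition.
    known_cpus: CPU object names that actually exist in the namespace.
    If any reference cannot be resolved (or none is given), fall back to
    all known CPUs. Hypothetical helper for illustration only.
    """
    known = set(known_cpus)
    if referenced_names and all(name in known for name in referenced_names):
        return list(referenced_names)
    return list(known_cpus)


if __name__ == "__main__":
    # Invented names: the BIOS still references the old dual-core names,
    # while the CPU objects were renamed for the quad-core variant.
    print(passive_trip_cpus(["CPU0", "CPU1"], ["P000", "P001", "P002", "P003"]))
```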
This is a wild guess. I also expect that you each have different issues, as
this is platform/BIOS specific and there are several people on the CC list of
this bug.
Please try to gather some more info. Enable PowerNow! and ACPI. If you still
have issues, monitor the (ACPI) temperature (exported in the paths I pointed
to above), look at the trip points, etc.