The Intel Thermal Daemon (aka thermald) actively monitors thermal sensors and modifies cooling controls to try to keep the hardware cool. By default, thermald runs in a "zero-configuration" mode and attempts to use the available CPU Digital Thermal Sensor(s) (DTS) to sense the temperature, and the P-state driver, Running Average Power Limit (RAPL), PowerClamp and cpufreq to control cooling.
Some systems may not work well in the default mode; perhaps the machine just runs too hot and one would like to tweak the settings to kick in passive or active cooling at a lower temperature than the default configuration. Thermald has a configuration file, /etc/thermald/thermal-conf.xml, that allows fine tuning of thermald. Essentially, one declares the thermal sensors on the machine and a set of thermal zone controls that read these sensors and tell thermald which cooling policy to apply when specific temperature thresholds are crossed.
For an example, I've picked on an old Acer Aspire One (AMD C-60). Let's see the sensors for this machine:
$ find /sys/class/hwmon/* -exec echo -n "{}: " \; -exec cat {}/name \;
/sys/class/hwmon/hwmon0: radeon
/sys/class/hwmon/hwmon1: k10temp
One can use tools such as sensors (from the lm-sensors package) to get an idea of the high and critical trip points for these:
$ sudo apt-get install lm-sensors
$ sensors
radeon-pci-0008
Adapter: PCI adapter
temp1: +60.0°C (crit = +120.0°C, hyst = +90.0°C)
k10temp-pci-00c3
Adapter: PCI adapter
temp1: +60.5°C (high = +70.0°C)
(crit = +115.0°C, hyst = +107.5°C)
So, in this simple example, I will just use the CPU sensor k10temp (from /sys/class/hwmon/hwmon1) as my thermald CPU temperature sensor.
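Since hwmon numbering can differ between machines and boots, it is worth confirming which directory actually holds the sensor before writing its path into the config. The following is only a minimal sketch: the function name list_hwmon_temps is my own, and it assumes the usual sysfs convention that temp*_input files report millidegrees Celsius.

```shell
#!/bin/sh
# List every hwmon temperature input under a base directory
# (default: /sys/class/hwmon), printing the driver name, the input
# path and the current reading in whole degrees Celsius.
# list_hwmon_temps is a hypothetical helper, not part of thermald.
list_hwmon_temps() {
    base="${1:-/sys/class/hwmon}"
    for dir in "$base"/hwmon*; do
        # Skip entries without a readable name (or an unmatched glob)
        [ -r "$dir/name" ] || continue
        name=$(cat "$dir/name")
        for input in "$dir"/temp*_input; do
            [ -r "$input" ] || continue
            # Some sensors refuse reads; skip those rather than abort
            milli=$(cat "$input" 2>/dev/null) || continue
            # sysfs reports millidegrees; integer division drops the fraction
            echo "$name $input $((milli / 1000))"
        done
    done
}

list_hwmon_temps
```

On the Aspire One above, the line starting with k10temp gives the path to use for the CPU sensor.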
Next, I need to define a policy on what to do when this sensor reaches a specific high temperature threshold. In this example, I want to trigger passive (non-fan) cooling when we reach 85 degrees C, by adjusting the CPU frequency using cpufreq and also by using the ACPI processor sysfs cooling controls. I want thermald to run both cooling methods together in parallel, with 60% of the influence coming from cpufreq and 40% from the ACPI processor cooling controls.
My thermald config file for this is as follows:
<ThermalConfiguration>
<Platform>
<Name>Aspire One</Name>
<ProductName>*</ProductName>
<Preference>QUIET</Preference>
<ThermalSensors>
<ThermalSensor>
<Type>CPU_TEMP</Type>
<Path>/sys/class/hwmon/hwmon1/temp1_input</Path>
<AsyncCapable>0</AsyncCapable>
</ThermalSensor>
</ThermalSensors>
<ThermalZones>
<ThermalZone>
<Type>cpu package</Type>
<TripPoints>
<TripPoint>
<SensorType>CPU_TEMP</SensorType>
<Temperature>85000</Temperature>
<type>passive</type>
<ControlType>PARALLEL</ControlType>
<CoolingDevice>
<index>1</index>
<type>cpufreq</type>
<influence>60</influence>
<SamplingPeriod>1</SamplingPeriod>
</CoolingDevice>
<CoolingDevice>
<index>2</index>
<type>Processor</type>
<influence>40</influence>
<SamplingPeriod>1</SamplingPeriod>
</CoolingDevice>
</TripPoint>
</TripPoints>
</ThermalZone>
</ThermalZones>
</Platform>
</ThermalConfiguration>
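Before starting thermald, a quick sanity check of the file can catch a wrong sensor path or a mistyped threshold. This is a rough sketch rather than a real XML parser: it assumes each <Path> and <Temperature> element sits on its own line, as in the config above, and the function names check_sensor_paths and show_trip_temps are my own, not part of thermald.

```shell
#!/bin/sh
# Report whether each <Path> named in a thermald config is readable.
# Assumes one element per line; this is a text match, not XML parsing.
check_sensor_paths() {
    conf="${1:-/etc/thermald/thermal-conf.xml}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    sed -n 's:.*<Path>\(.*\)</Path>.*:\1:p' "$conf" | while read -r path; do
        if [ -r "$path" ]; then
            echo "ok $path"
        else
            echo "missing $path"
        fi
    done
}

# Print each <Temperature> value converted from millidegrees to degrees C.
show_trip_temps() {
    conf="${1:-/etc/thermald/thermal-conf.xml}"
    [ -r "$conf" ] || { echo "cannot read $conf" >&2; return 1; }
    sed -n 's:.*<Temperature>\([0-9]*\)</Temperature>.*:\1:p' "$conf" |
    while read -r milli; do
        echo "$((milli / 1000))"
    done
}

# Only run against the system config if it is actually present
if [ -r /etc/thermald/thermal-conf.xml ]; then
    check_sensor_paths
    show_trip_temps
fi
```

A "missing" line, or a trip temperature that does not match the intended threshold, is worth fixing before moving on.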
One can observe this working by starting thermald in verbose debug mode:
$ sudo thermald --no-daemon --loglevel=debug
It is worth exercising the machine (I use stress-ng --cpu 0) to ramp up the load and temperature and observe how thermald responds. Once one is happy with the results, one can then start thermald using:
$ sudo systemctl start thermald
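To see the effect of the trip point while the machine is under load, one can log the sensor reading alongside the CPU frequency from a second terminal. A minimal sketch follows; log_temp is a hypothetical helper of my own, and the default frequency path assumes the cpufreq sysfs interface is present for cpu0 (it prints n/a otherwise).

```shell
#!/bin/sh
# Print COUNT samples of a temperature input (in whole degrees C)
# together with the current cpu0 frequency, INTERVAL seconds apart.
# Usage: log_temp INPUT [COUNT] [INTERVAL] [FREQ_PATH]
log_temp() {
    input="$1"
    count="${2:-10}"
    interval="${3:-1}"
    freq_path="${4:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # Give up if the sensor cannot be read
        milli=$(cat "$input" 2>/dev/null) || return 1
        # Frequency is optional; fall back to n/a when unavailable
        freq=$(cat "$freq_path" 2>/dev/null || echo "n/a")
        echo "temp_c=$((milli / 1000)) cpu0_khz=$freq"
        i=$((i + 1))
        # Sleep between samples, but not after the last one
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: 60 one-second samples from the k10temp sensor used above
# log_temp /sys/class/hwmon/hwmon1/temp1_input 60 1
```

If the policy is working, the frequency should start dropping once the temperature approaches the configured threshold.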
More examples can be found in the thermal-conf.xml manual page shipped with thermald:
$ man thermal-conf.xml