Friday, 25 September 2020

Kernel janitor work: fixing spelling mistakes in kernel messages

The Linux 5.9-rc6 kernel source contains over 300,000 literal strings used in kernel messages of various sorts (errors, warnings, etc.), so it is no surprise that typos and spelling mistakes slip into these messages from time to time.

To catch spelling mistakes I run a daily automated job that fetches the tip of linux-next, runs a fast spelling checker tool over it to find the spelling mistakes, and then diffs these against the results from the previous day. The diff is emailed to me; I put my kernel janitor hat on, fix the mistakes and send the fixes to the upstream developers and maintainers.
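The "diff against yesterday" step can be sketched in a few lines of Python. This is only an illustration and not the actual job: it assumes each day's results are saved as a plain text report with one finding per line, and the report format and the `new_findings` name are hypothetical.

```python
def new_findings(previous_report, current_report):
    """Return the lines in today's report that were not in yesterday's."""
    previous = set(previous_report.splitlines())
    # Keep today's ordering and drop anything already reported yesterday
    return [line for line in current_report.splitlines()
            if line and line not in previous]
```

Only the new lines need a human to look at them, which keeps the daily email short even though the full report covers the whole tree.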

The spelling checker tool is a fast-and-dirty C parser that finds literal strings and also variable names and checks these against a US English dictionary containing over 100,000 words. As a fun weekend side project I hand-optimized the checker so that it can parse and spell check several million lines of kernel C code per second.
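To give a flavour of the idea (not of the real tool, which is hand-optimized C), here is a minimal Python sketch that pulls string literals out of C source with a regular expression and reports words missing from a dictionary. The function name and the tiny dictionary in the usage note are made up for illustration, and the sketch makes no attempt to skip comments or character literals, or to check variable names.

```python
import re

# A C string literal: anything between double quotes, allowing escapes like \" and \n
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')
# printf-style conversions such as %s, %d, %lu, %08x
FORMAT_SPEC = re.compile(r'%[-+#0]*\d*(?:\.\d+)?[hlLzjt]*[a-zA-Z]')
WORD = re.compile(r'[A-Za-z]+')


def misspelled_words(c_source, dictionary):
    """Return words in string literals that are not in the dictionary."""
    found = []
    for literal in STRING_LITERAL.findall(c_source):
        # Blank out escape sequences and format specifiers so that
        # fragments like the "n" of \n are not treated as words
        text = re.sub(r'\\.', ' ', literal)
        text = FORMAT_SPEC.sub(' ', text)
        for word in WORD.findall(text):
            if word.lower() not in dictionary:
                found.append(word)
    return found
```

For example, calling `misspelled_words('pr_err("failed to recieve packet %d\\n", ret);', {"failed", "to", "receive", "packet"})` flags only `recieve`.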

Every three months or so I collate all the fixes I've made and, where appropriate, add new spelling mistake patterns to the kernel checkpatch spelling dictionary. Kernel developers should in practice run checkpatch.pl on their patches before submitting them upstream, and hopefully the dictionary will catch a lot of the regular spelling mistakes.
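The checkpatch dictionary lives in scripts/spelling.txt in the kernel tree and uses a simple `mistake||correction` line format, with `#` for comments. As a rough sketch of how such a table can be used (this is not how checkpatch.pl itself is implemented, and both function names are hypothetical), a lookup in Python might look like this:

```python
import re


def parse_spelling_table(lines):
    """Build a {mistake: correction} dict from 'mistake||correction' lines."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        mistake, sep, correction = line.partition("||")
        if sep:
            table[mistake] = correction
    return table


def suggest_fixes(message, table):
    """Return (mistake, correction) pairs for known mistakes in a message."""
    return [(word, table[word.lower()])
            for word in re.findall(r"[A-Za-z]+", message)
            if word.lower() in table]
```

Because the table only lists known mistakes, it never raises a false alarm on unusual identifiers, but it can only catch mistakes that somebody has already added.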

Over the past couple of years I've seen fewer spelling mistakes creep into the kernel, either because folk are running checkpatch more nowadays or because the dictionary is now able to catch more spelling mistakes (or both). Either way, this is good as it means less work to fix these up.

Spelling mistakes may be trivial fixes, but cleaning these up helps make the kernel errors appear more professional and can also help clear up some ambiguous messages.

Thursday, 30 April 2020

Easy capturing of kernel stack traces with virsh

Today I needed to capture a rather large kernel stack dump, which is rather trivial using virsh. Using virt-manager I created a VM named vm-focal and in the guest ran:

sudo systemctl enable serial-getty@ttyS0.service 

Then on the host running the VM I ran:

virsh console vm-focal

Then all I needed to do was produce the stack dump and the console output was successfully dumped by virsh. Easy.

Thursday, 17 October 2019

Stress testing CPU temperatures

Stress testing CPU temperatures is not exactly straightforward. CPU designs vary from CPU to CPU and each has its own strengths and weaknesses in cache design, integer math, floating point math, bit-wise logical operations and branch prediction, to name but a few. I've been asked several times about the "best" CPU stressor method in stress-ng to use to make a CPU run hot.

As an experiment I ran all the CPU stressor methods in stress-ng for 60 seconds each across a range of devices, from small ARM-based Raspberry Pi 2 and 3 boards to much larger Xeon desktop servers, just to see how hot the CPUs get. The thermal measurements were based on the most relevant thermal zones; for example, on x86 this is the CPU package thermal zone. In between each stress run, 60 seconds of idle time was added to allow the CPU to cool.
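For anyone wanting to take similar measurements, Linux exposes thermal zones under /sys/class/thermal, where each thermal_zone* directory has a `type` file naming the zone and a `temp` file holding the temperature in millidegrees Celsius. The Python sketch below is not the script used for this experiment; the function names are hypothetical, and the zone type to look for is an assumption that depends on the platform (on many x86 systems the package zone is named `x86_pkg_temp`).

```python
from pathlib import Path


def find_thermal_zone(zone_type, base="/sys/class/thermal"):
    """Return the first thermal zone directory of the given type, or None."""
    for zone in sorted(Path(base).glob("thermal_zone*")):
        type_file = zone / "type"
        if type_file.is_file() and type_file.read_text().strip() == zone_type:
            return zone
    return None


def read_temp_celsius(zone):
    """Read a zone's temperature; sysfs reports millidegrees Celsius."""
    return int((Path(zone) / "temp").read_text().strip()) / 1000.0
```

Sampling `read_temp_celsius()` periodically during a stress run and keeping the maximum gives a peak temperature for that run.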

Below are the results:

As one can see, this is quite a mixed set of results and it is hard to recommend any specific CPU stressor method as the "best" across a range of CPUs. It does appear that the CPU stress methods that mix 64 bit integer and floating point operations are generally rather good at making most CPUs run hot.

With this in mind, I think we can conclude there is no such thing as a perfect way to make a CPU run hot as it is very architecture dependent. Fortunately the stress-ng CPU stressor has a large suite of methods to exercise the CPU in different ways, so there should be a good stressor somewhere in that collection to max out your CPU. Knowing which one is the tricky part(!)
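One way to find out is simply to try the methods on the machine in question and rank them by the peak temperature each one reaches. The Python sketch below only builds the stress-ng command line and sorts the measurements; both function names are hypothetical, and the set of valid method names depends on the stress-ng version installed (see the --cpu-method section of the stress-ng manual).

```python
def stress_ng_command(method, seconds=60):
    """Build a stress-ng command line for one CPU stressor method."""
    # --cpu 0 asks stress-ng to start one CPU stressor per online CPU
    return ["stress-ng", "--cpu", "0",
            "--cpu-method", method,
            "--timeout", f"{seconds}s"]


def rank_methods(peak_temps):
    """Sort method names hottest first, given {method: peak temperature}."""
    return sorted(peak_temps, key=peak_temps.get, reverse=True)
```

The command list can be passed to `subprocess.run()`, with an idle period between runs so that each method starts from a cool CPU.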