Saturday, 30 July 2016

Scanning the Linux kernel for error messages

The Linux kernel contains lots of error/warning/information messages; over 130,000 in the current 4.7 kernel.  One of the tests in the Firmware Test Suite (FWTS) is to find BIOS/ACPI/UEFI related kernel error messages in the kernel log and try to provide some helpful advice on each error message since some can be very cryptic to the untrained eye.

The FWTS kernel error log database is currently approaching 800 entries and I have been slowly working through another 800 or so more relevant and recently added messages.  Needless to say, this is taking a while to complete.  The hardest part was finding relevant error messages in the kernel as they appear in different forms (e.g. printk(), dev_err(), ACPI_ERROR() etc).

In order to scrape the Linux kernel source for relevant error messages I hacked up the kernelscan parser to find error messages and dump these to stdout.  kernelscan can scan 43,000 source files (17,900,000 lines of source) in under 10 seconds on my Lenovo X230 laptop, so it is relatively fast.

I also have been using kernelscan to find spelling mistakes in kernel messages and I've been punting trivial fixes upstream to fix these.  These mistakes are small and petty, but I find it a little irksome when I see the kernel emit a message that contains a typo or spelling mistake - it just looks a bit unprofessional.

I've created a kernelscan snap (which was really easy and fast to do using scancraft), so it is now available Ubuntu.  The source code is also available from the kernel team git web at http://kernel.ubuntu.com/git/cking/kernelscan.git/

The code is designed to only parse kernel source, and it is a very rough and ready parser designed for speed;  fundamentally, it is a big quick hack.  When I get a few spare minutes I will try and see if there is any correlation between the number of error messages with the size of the kernel over the various releases.