Friday 25 September 2020

Kernel janitor work: fixing spelling mistakes in kernel messages

The Linux 5.9-rc6 kernel source contains over 300,000 literal strings used in kernel messages of various sorts (errors, warnings, etc) and it is no surprise that typos and spelling mistakes slip into these messages from time to time.

To catch spelling mistakes I run a daily automated job that fetches the tip from linux-next and runs a fast spelling checker tool that finds all spelling mistakes and then diff's these against the results from the previous day.  The diff is emailed to me and I put my kernel janitor hat on, fix these up and send these to the upstream developers and maintainers.

The spelling checker tool is a fast-and-dirty C parser that finds literal strings and also variable names and checks these against a US English dictionary containing over 100,000 words. As fun weekend side project I hand optimized the checker to be able to parse and spell check several millions lines of kernel C code per second.

Every 3 or so months I collate all the fixes I've made and where appropriate I add new spelling mistake patterns to the kernel checkpatch spelling dictionary.   Kernel developers should in practice run on their patches before submitting them upstream and hopefully the dictionary will catch a lot of the regular spelling mistakes.

Over the past couple of years I've seen less spelling mistakes creep into the kernel, either because folk are running checkpatch more nowadays and/or that the dictionary is now able to catch more spelling mistakes.  As it stands, this is good as it means less work to fix these up.

Spelling mistakes may be trivial fixes, but cleaning these up helps make the kernel errors appear more professional and can also help clear up some ambiguous messages.