Saturday 27 June 2015

Static code analysis on kernel source

Since 2014 I have been regularly running static code analysis tools such as cppcheck and smatch against the Linux kernel source to catch bugs that creep into the kernel. After each cppcheck run I diff the logs to get a list of deltas in the error and warning messages. I periodically review these to filter out false positives, and I end up with a list of bugs that need some attention.
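
As a rough illustration of the log-diffing step, here is a minimal sketch in C. It assumes each log has already been read into an array of message strings, one per line; the function names `log_contains` and `new_messages` are made up for this example and are not part of cppcheck or of my actual scripts. The comparison is an exact string match, so a message whose line number has shifted between runs would be reported as new.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `line` appears among the `n` entries of `log`, else 0. */
static int log_contains(const char **log, size_t n, const char *line)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(log[i], line) == 0)
            return 1;
    return 0;
}

/*
 * Hypothetical helper: collects the messages that are in the current
 * log but not in the previous one. `out` must have room for `n_cur`
 * pointers. Returns the number of new messages stored in `out`.
 */
size_t new_messages(const char **prev, size_t n_prev,
                    const char **cur, size_t n_cur,
                    const char **out)
{
    size_t count = 0;

    for (size_t i = 0; i < n_cur; i++)
        if (!log_contains(prev, n_prev, cur[i]))
            out[count++] = cur[i];
    return count;
}
```

Swapping the `prev` and `cur` arguments gives the messages that disappeared between runs, which is the other half of the delta.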

Bugs such as allocations that can return NULL being used without checks, memory leaks, double frees and uninitialized variables are easy to find with static analyzers and generally require just one- or two-line fixes.
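
To give a flavour of how small these fixes tend to be, here is a sketch of the unchecked allocation case. It is plain userspace C with `malloc()` standing in for a kernel allocator, and `dup_buffer` is a hypothetical helper written for this post, not code taken from the kernel.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a NUL-terminated copy of `len` bytes
 * from `src`, or NULL if the allocation fails. */
char *dup_buffer(const char *src, size_t len)
{
    char *copy = malloc(len + 1);

    /* The short fix: without this check, the memcpy() below would
     * write through a NULL pointer when the allocation fails. */
    if (!copy)
        return NULL;

    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```

The caller owns the returned buffer and has to `free()` it, otherwise the fix for one class of bug just introduces another (a memory leak).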

So what are the overall trends like?

Warning and error messages from cppcheck have been dropping over time, while "portable warnings" have been slowly but steadily increasing. "Portable warnings" come mainly from arithmetic on void * pointers (which GCC treats as byte sized, but which is not legal C). Note that there is some variation in the results because I use the latest versions of cppcheck; occasionally a release reports a lot of false positives, and this then gets fixed in later versions.
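
For anyone who has not run into the void * issue, the following sketch shows the construct and a portable alternative. The helper name `advance_bytes` is invented for this example.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer `offset` bytes past `base`. */
void *advance_bytes(void *base, size_t offset)
{
    /*
     * GNU C would accept `return base + offset;` here, treating the
     * size of void as 1. ISO C does not define arithmetic on void *,
     * so the portable form casts to a character pointer first.
     */
    return (unsigned char *)base + offset;
}
```

Both forms produce the same address when built with GCC; the cast only makes the byte-sized stride explicit so that other compilers and analyzers accept it.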

Set against the growth in kernel size, the overall downward trend in warning and error messages from cppcheck isn't bad at all, considering the kernel has grown by nearly 11% over the time I have been running the static analysis.

Figure: Kernel source growth over time

Since each warning or error reported has to be carefully scrutinized to determine whether it is a false positive (and this takes a lot of effort and time), I've not yet been able to determine the exact false positive rates for these stats. Compared to the actual lines of code, cppcheck is finding ~1 error per 15K lines of source.

It would be interesting to run the same analysis with commercial static analyzers such as Coverity and see how the stats compare. As it stands, cppcheck is doing its bit in detecting errors and helping engineers to improve code quality.

6 comments:

  1. I have read many blogs on static code analysis, and in every one I found that developers fight against a lot of false positives. Sometimes this distracts developers, and because of the distraction they ignore important warnings. Still, static code analysis tools help developers and play a vital role in programming.

  2. Appreciate this blog. It provides complete information about the use and importance of automated static analysis tools. Thanks for sharing.

  3. Wow.. Nice informative blog... I use source code review tools, and after reading the blog I came to know how Google engineers rely on mail and textual diffs when doing code reviews.

  4. The best article I have ever found on comparing static code analysis tools. Very detailed and exact information is given. The time taken for this article is highly appreciated.

  5. Nice blog... it is crucial to additionally use tools for static code review to make software more dependable and reliable.

  6. Good one... As I understand it, static analysis is done by examining the code without executing it. This blog nicely explains the use of tools for static code review. Thanks for sharing.