Wednesday, 6 May 2015

powerstat improvements with RAPL

The Linux Running Average Power Limit (RAPL) interface was introduced about 2 years ago in the Linux kernel and allows userspace to read the power consumption from various x86 System-on-a-Chip (SoC) power domains.  The power domains range from the SoC package, CPU core, DRAM controller and graphics power plane.

It appears that the Intel energy status MSRs can be read very rapidly and the resolution is exceptionally good; however, reading the MSR too frequently will consume some power when using the RAPL interface.

I've improved powerstat to now use the RAPL interface with a new -R option (to measure just the total package power consumption).  A new -D option will show all the RAPL domain measurements available.  RAPL measurements are very responsive and one can easily correlate power spikes with bursts of system activity.

Finally, I have added a basic histogram output with the new -H option. This will plot histograms of the power measurements and CPU load from the stats gathered during the powerstat run.

Powerstat 0.01.37 is available in Ubuntu 15.10 Wily Werewolf and the source is available from the git repository.

Monday, 23 February 2015

fnotifystat - a tool to show file system activity

Over the past year or more I was focused on identifying power consuming processes on various mobile devices.  One of many strategies to reducing power is to remove unnecessary file system activity, such as extraneous logging, repeated file writes, unnecessary file re-reads and to reduce metadata updates.

Fnotifystat is a utility I wrote to help identify such file system activity. My desire was to make the tool as small as possible for small embedded devices and to be relatively flexible without the need of using perf just in case the target device did not have perf built into the kernel by default.

By default, fnotifystat will dump out every second any file system open/close/read/write operations across all mounted file systems, however, one can specify the delay in seconds and the number of times to dump out statistics.   fnotifystat uses the fanotify(7) interface to get file activity across the system, hence it needs to be run with CAP_SYS_ADMIN capability.

An open(2), read(2)/write(2) and close(2) sequence by a process can produce multiple events, so fnotifystat has a -m option to merge events and hence reduce the amount of output.  A verbose -v option will output all file events if one desires to see the full system activity.

If one desires to just monitor a specific collection of processes, one can specify a list of the process ID(s) or process names using the -p option, for example:

sudo fnotifystat -p firefox,thunderbird

fnotifystat catch events on all mounted file systems, but one can restrict that by specifying just path(s) one is interested in using the -i (include) option, for example:

sudo fnotifystat -i /proc

..and one can exclude paths using the -x option.

More information and examples can be found on the fnotifystat project page and the manual also contains more details and some examples too.

Fnotifystat 0.01.10 is available in Ubuntu Vivid Vervet 15.04 and can also be installed for older releases from my power management tools PPA.

Tuesday, 27 January 2015

Finding kernel bugs with cppcheck

For the past year I have been running the cppcheck static analyzer against the linux kernel sources to see if it can detect any bugs introduced by new commits. Most of the bugs being found are minor thinkos, null pointer de-referencing, uninitialized variables, memory leaks and mistakes in error handling paths.

A useful feature of cppcheck is the --force option that will check against all the configurations in the source (and the kernel does have many!).  This allows us to check for code that may not be exercised much (because it is normally not built in with most config options) or even find dead code.

The downside of using the --force option is that each source file may need to be checked multiple times for each configuration.  For ~20800 sources files this can take a 24 processor server several hours to process.  Errors and warnings are then compared to previous runs (a delta), making it relatively easy to spot new issues on each run.

We also use the latest sources from the cppcheck git repository.  The upside of this is that new static analysis features are used early and this can result in finding existing bugs that previous versions of cppcheck missed.

A typical cppcheck run against the linux kernel source finds about 600 potential errors and 1700 warnings; however a lot of these are false positives.  These need to be individually eyeballed to sort the wheat from the chaff.

Finally, the data is passed through a gnu plot script to generate a trend graph so I can see how errors (red) and warnings (green) are progressing over time:


..note that the large changes in the graph are mostly with features being enabled (or fixed) in cppcheck.

I have been running the same experiment with smatch too, however I am finding that cppcheck seems to have better code coverage because of the --force option and seems to have less false positives.   As it stands, I am finding that the most productive time for finding issues is around the -rc1 and -rc2 merge times (obviously when most of the the major changes land in the kernel).  The outcome of this work has been a bunch of small fixes landing in the kernel to address bugs that cppcheck has found.

Anyhow, cppcheck is an excellent open source static analyzer for C and C++ that I'd heartily recommend as it does seem to catch useful bugs.

Wednesday, 31 December 2014

Slipping in a few more stress-ng features before 2015

During idle moments in the final few weeks of 2014 I have been adding some more stressors and features to stress-ng as well as tidying up the code and fixing some small bugs that have crept in during the last development spin.   Stress-ng aims to stress a machine with various simple torture tests to trip overheating and kernel race conditions.

The mmap stressor now has the '--mmap-file' to use synchronous file backed memory mapping instead of the default anonymous mapping, and the '--mmap-async' option enables asynchronous file mapping if desired.

For socket stressing, the '--sock-domain unix' option now allows AF_UNIX (aka AF_LOCAL) sockets to be used. This compliments the existing AF_INET and AF_INET6 IPv4 and IPv6 protocols available with this stress test.

The CPU stressor now includes mixed integer and floating point stressors, covering 32 and 64 bit integer mixes with the float, double and long double floating point types. The generated object code contains a nice mix of operations which should exercise various functional units in the CPU.  For example, when running on a hyper-threaded CPU one notices a performance hit because these cpu stressor methods heavily contend on the CPU math functional blocks.

File based locking has been extended with the new lockf stressor, this stresses multiple locking and unlocking on portions of a small file and the default blocking mode can be turned into a CPU consuming rapid polling retry with the '--lockf-nonblock' option.

The dup(2) system call is also now stressed with the new dup stressor. This just repeatedly dup's a file opened on /dev/zero until all the free file slots are full, and then closes these. It is very much like the open stressors.

The fcntl(2) F_SETLEASE command is stress tested with the new lease stressor. This has a parent process that rapidly locks and unlocks a file based lease and 1 or more child processes try to open this file and cause lease breaking signal notifications to the parent.

For x86 CPUs, the cache stressor includes two new cache specific options. The '--cache-fence' option forces write serialization on each store operation, while the '--cache-flush' option forces flush cache on each store operation. The code has been specifically written to not incur any overhead if these options are not enabled or are not available on non-x86 CPUs.

This release also includes the stress-ng project mascot too; a humble CPU being tortured and stressed by a rather angry flame.

For more information, visit the stress-ng project page, or check out the stress-ng manual.