Saturday, 8 June 2019

Working towards stress-ng 0.10.00

Over the past 9+ months I've been cleaning up stress-ng in preparation for a V0.10.00 release.   Stress-ng is a portable Linux/UNIX Swiss army knife of micro-benchmarking kernel stress tests.

The Ubuntu kernel team uses stress-ng for kernel regression testing in several ways:
  • Checking that the kernel does not crash when being stressed tested
  • Performance (bogo-op throughput) regression checks
  • Power consumption regression checks
  • Core CPU Thermal regression checks
The wide range of micro benchmarks in stress-ng allow us to keep track of a range of metrics so that we can catch regressions.

I've tried to focus on several aspects of stress-ng over the last last development cycle:
  • Improve per-stressor modularization. A lot of code has been moved from the core of stress-ng back into each stress test.
  • Clean up a lot of corner case bugs found when we've been testing stress-ng in production.  We exercise stress-ng on a lot of hardware and in various cloud instances, so we find occasional bugs in stress-ng.
  • Improve usability, for example, adding bash command completion.
  • Improve portability (various kernels, compilers and C libraries). It really builds on runs on a *lot* of Linux/UNIX/POSIX systems.
  • Improve kernel test coverage.  Try to exercise more kernel core functionality and reach parts other tests don't yet reach.
Over the past several days I've been running various versions of stress-ng on a gcov enabled 5.0 kernel to measure kernel test coverage with stress-ng.  As shown below, the tool has been slowly gaining more core kernel coverage over time:

With the use of gcov + lcov, I can observe where stress-ng is not currently exercising the kernel and this allows me to devise stress tests to touch these un-exercised parts.  The tool has a history of tripping kernel bugs, so I'm quite pleased it has helped us to find corners of the kernel that needed improving.

This week I released V0.09.59 of stress-ng.  Apart from the usual sets of clean up changes and bug fixes, this new release now incorporates bash command line completion to make it easier to use.  Once the 5.2 Linux kernel has been released and I'm satisfied that stress-ng covers new 5.2 features I will  probably be releasing V0.10.00. This  will be a major release milestone now that stress-ng has realized most of my original design goals.

Saturday, 5 January 2019

Kernel commits with "Fixes" Tag (revisited)

Last year I wrote about kernel commits that are tagged with the "Fixes" tag. Kernel developers use the "Fixes" tag on a bug fix commit to reference an older commit that originally introduced the bug.   The adoption of the tag has been steadily increasing since v3.12 of the kernel:

The red line shows the number of commits per release of the kernel, and the blue line shows the number of commits that contain a "Fixes" tag.

In terms of % of commits that contain the "Fixes" tag, one can see it has been steadily increasing since v3.12 and almost 12.5% of kernel commits in v4.20 are tagged this way.

The fixes tag contains the commit SHA of the commit that was fixed, so one can look up the date of the fix and of the commit being fixed and determine the time taken to fix a bug.

As one can see, a lot of issues get fixed on the first few hundred days, and some bugs take years to get fixed.  Zooming into the first hundred days of fixes the distribution looks like:

..the modal point is at day 4, I suspect these are issues that get found quickly when commits land in linux-next and are found in early testing, integration builds and static analysis.

Out of the thousands of "Fixes" tagged commits and the time to fix an issue one can determine how long it takes to fix a specific percentage of the bugs:

In the graph above, 50% of fixes are made within 151 days of the original commit, ~69% of fixes are made within a year of the original commit and ~82% of fixes are made within 2 years.  The long tail indicates that there are some bugs that take a while to be found and fixed,  the final 10% of bugs take more than 3.5 years to be found and fixed.

Comparing the time to fix issues for kernel versions v4.0, v4.10 and v4.20 for bugs that are fixed in less than 50 days we have:

... the trends are similar, however it is worth noting that more bugs are getting found and fixed a little faster in v4.10 and v4.20 than v4.0.  It will be interesting to see how these trends develop over the next few years.

Wednesday, 12 December 2018

Linux I/O Schedulers

The Linux kernel I/O schedulers attempt to balance the need to get the best possible I/O performance while also trying to ensure the I/O requests are "fairly" shared among the I/O consumers.  There are several I/O schedulers in Linux, each try to solve the I/O scheduling issues using different mechanisms/heuristics and each has their own set of strengths and weaknesses.

For traditional spinning media it makes sense to try and order I/O operations so that they are close together to reduce read/write head movement and hence decrease latency.  However, this reordering means that some I/O requests may get delayed, and the usual solution is to schedule these delayed requests after a specific time.   Faster non-volatile memory devices can generally handle random I/O requests very easily and hence do not require reordering.

Balancing the fairness is also an interesting issue.  A greedy I/O consumer should not block other I/O consumers and there are various heuristics used to determine the fair sharing of I/O.  Generally, the more complex and "fairer" the solution the more compute is required, so selecting a very fair I/O scheduler with a fast I/O device and a slow CPU may not necessarily perform as well as a simpler I/O scheduler.

Finally, the types of I/O patterns on the I/O devices influence the I/O scheduler choice, for example, mixed random read/writes vs mainly sequential reads and occasional random writes.

Because of the mix of requirements, there is no such thing as a perfect all round I/O scheduler.  The defaults being used are chosen to be a good best choice for the general user, however, this may not match everyone's needs.   To clarify the choices, the Ubuntu Kernel Team has provided a Wiki page describing the choices and how to select and tune the various I/O schedulers.  Caveat emptor applies, these are just guidelines and should be used as a starting point to finding the best I/O scheduler for your particular need.

Saturday, 1 December 2018

New features in Forkstat

Forkstat is a simple utility I wrote a while ago that can trace process activity using the rather useful Linux NETLINK_CONNECTOR API.   Recently I have added two extra features that may be of interest:

1.  Improved output using some UTF-8 glyphs.  These are used to show process parent/child relationships and various process events, such as termination, core dumping and renaming.   Use the new -g (glyph) option to enable this mode. For example:

In the above example, the program "wobble" was started and forks off a child process.  The parent then renames itself to wibble (indicated by a turning arrow). The child then segfaults and generates a core dump (indicated by a skull and crossbones), triggering apport to investigate the crash.  After this, we observe NetworkManager creating a thread that runs for a very short period of time.   This kind of activity is normally impossible to spot while running conventions tools such as ps or top.

2. By default, forkstat will show the process name using the contents of /proc/$PID/cmdline.  The new -c option allows one to instead use the 16 character task "comm" field, and this can be helpful for spotting process name changes on PROC_EVENT_COMM events.

These are small changes, but I think they make forkstat more useful.  The updated forkstat will be available in Ubuntu 19.04 "Disco Dingo".