Sunday, 22 November 2015

Using PR_SET_PDEATHSIG to reap child processes

The prctl() system call provides a rather useful PR_SET_PDEATHSIG option that allows a signal to be sent to a child process when its parent unexpectedly dies. A quick and dirty mechanism is to trigger the SIGHUP or SIGKILL signal to kill the child immediately, or perhaps more elegantly to use a signal that the child catches so that it can tidy up its resources before exiting.
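For the quick and dirty variant, a minimal sketch might look like the following. The helper names set_parent_death_signal() and get_parent_death_signal() are made up for illustration; the only real interfaces used are prctl() with PR_SET_PDEATHSIG and PR_GET_PDEATHSIG, where a signal number of 0 clears the setting.

```c
#include <signal.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper: ask the kernel to send `signo` to the calling
 * process when its parent dies. Passing 0 clears the setting.
 * Returns 0 on success, -1 on failure.
 */
int set_parent_death_signal(int signo)
{
    return prctl(PR_SET_PDEATHSIG, (unsigned long)signo) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: read back the current parent death signal.
 * Returns the signal number (0 if none is set) or -1 on failure.
 */
int get_parent_death_signal(void)
{
    int signo = 0;

    if (prctl(PR_GET_PDEATHSIG, &signo) < 0)
        return -1;
    return signo;
}
```

A child would typically call set_parent_death_signal(SIGKILL) straight after fork(). Since SIGKILL cannot be caught, the child gets no chance to clean up, which is exactly what makes this the blunt option.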

In the trivial example below, we use the SIGUSR1 signal to inform the child that the parent has died. I know printf() should not be used in a signal handler because it is not async-signal-safe, but it makes the example simpler.

 #include <stdio.h>
 #include <stdlib.h>
 #include <unistd.h>
 #include <signal.h>
 #include <sys/prctl.h>
 #include <err.h>

 /* Runs in the child once the parent has died */
 void sigusr1_handler(int dummy)
 {
     (void)dummy;
     printf("Parent died, child now exiting\n");
     exit(0);
 }

 int main(void)
 {
     pid_t pid;

     pid = fork();
     if (pid < 0)
         err(1, "fork failed");
     if (pid == 0) {
         /* Child: catch SIGUSR1, then request it on parent death */
         if (signal(SIGUSR1, sigusr1_handler) == SIG_ERR)
             err(1, "signal failed");
         if (prctl(PR_SET_PDEATHSIG, SIGUSR1) < 0)
             err(1, "prctl failed");

         /* Wait until the signal handler exits for us */
         for (;;)
             sleep(60);
     }
     if (pid > 0) {
         /* Parent: exit after a short delay */
         sleep(5);
         printf("Parent exiting...\n");
     }

     return 0;
 }

The child process sits in an infinite loop, performing 60 second sleeps. The parent sleeps for 5 seconds and then exits. The child is then sent a SIGUSR1 signal, and its handler prints a message and exits. In practice the signal handler would be used to trigger a more sophisticated clean up of resources if required.
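Two details in the example above are worth tightening in real code: printf() and exit() are not async-signal-safe, and the parent may already have exited by the time the child calls prctl(), in which case no signal is ever delivered. The sketch below is one way to handle both, not the only one. The names parent_died_handler() and arm_parent_death_signal() are made up for illustration, and the caller is assumed to have saved getpid() in the parent before calling fork().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Handler restricted to async-signal-safe calls: write() and _exit() */
static void parent_died_handler(int signo)
{
    static const char msg[] = "Parent died, child now exiting\n";
    ssize_t n;

    (void)signo;
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    _exit(0);
}

/*
 * Hypothetical helper, intended to be called in the child right after fork().
 * expected_parent is the parent's PID, saved with getpid() before fork().
 * Returns 0 if the signal is armed and the parent is still our parent,
 * 1 if the parent had already gone (we were reparented), -1 on error.
 */
int arm_parent_death_signal(pid_t expected_parent, int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = parent_died_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) < 0)
        return -1;

    if (prctl(PR_SET_PDEATHSIG, (unsigned long)signo) < 0)
        return -1;

    /* If the parent died before prctl() took effect, no signal will come */
    if (getppid() != expected_parent)
        return 1;

    return 0;
}
```

A return value of 1 tells the child that it has missed the event and should run its clean up path directly rather than wait for a signal that will never arrive.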

Anyhow, this is a useful Linux feature that seems to be overlooked.

Thursday, 19 November 2015

Intel Platform Shared Resource Monitoring and Cache Allocation Technology

The Intel Platform Shared Resource Monitoring features were introduced in the Intel Xeon E5v3 processor family. These new features provide a mechanism to measure platform shared resources, such as L3 cache occupancy via Cache Monitoring Technology (CMT) and memory bandwidth utilisation via Memory Bandwidth Monitoring (MBM).
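One way to check whether a machine advertises these monitoring features is to look at the flags line in /proc/cpuinfo. The sketch below assumes the flag names that Linux kernels use for them, such as cqm, cqm_llc, cqm_occup_llc, cqm_mbm_total and cqm_mbm_local; which of these appear depends on the kernel version as well as the CPU. The function names has_cpu_flag() and cpuinfo_has_flag() are made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if `flag` appears as a whole whitespace-delimited token in `flags` */
bool has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        bool ends = p[len] == '\0' || p[len] == ' ' ||
                    p[len] == '\t' || p[len] == '\n';

        if (starts && ends)
            return true;
        p += len;    /* partial match such as "cqm" in "cqm_llc", keep looking */
    }
    return false;
}

/*
 * Check the first "flags" line of /proc/cpuinfo for `flag`.
 * Returns 1 if present, 0 if absent, -1 if the file cannot be read.
 */
int cpuinfo_has_flag(const char *flag)
{
    char line[8192];
    int found = 0;
    FILE *fp = fopen("/proc/cpuinfo", "r");

    if (!fp)
        return -1;
    while (fgets(line, sizeof(line), fp)) {
        if (strncmp(line, "flags", 5) == 0) {
            found = has_cpu_flag(line, flag) ? 1 : 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

For example, cpuinfo_has_flag("cqm_occup_llc") returning 1 suggests that L3 occupancy monitoring is available, while a 0 on an older kernel may only mean that the kernel does not report the flag.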

Intel have written a Platform Quality of Service Tool (pqos) to use these monitoring features and I've packaged this up for Ubuntu 16.04 Xenial Xerus.

To install, use:

sudo apt-get install intel-cmt-cat

The tool requires access to the Intel MSRs, so one also has to load the msr kernel module if it is not already loaded:

sudo modprobe msr
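Once loaded, the msr module exposes one device per CPU, /dev/cpu/N/msr, and a register is read by doing an 8 byte read at a file offset equal to the MSR address. The following is a rough sketch of that access pattern and makes no claim about how pqos itself is implemented; msr_device_path() and read_msr() are hypothetical names, and reading normally requires root.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Format the msr device path for a CPU, e.g. "/dev/cpu/0/msr".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int msr_device_path(unsigned int cpu, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/dev/cpu/%u/msr", cpu);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the 64 bit MSR `reg` on logical CPU `cpu` into *value.
 * Returns 0 on success, -1 if the device cannot be opened or read
 * (module not loaded, no such CPU, insufficient privileges, ...).
 */
int read_msr(unsigned int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    if (msr_device_path(cpu, path, sizeof(path)) < 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The file offset selects which MSR is read */
    n = pread(fd, value, sizeof(*value), (off_t)reg);
    close(fd);

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A failure from read_msr() on CPU 0 is a quick hint that the module is missing or that the program is not running as root.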

To see the Last Level Cache (llc) utilisation on a system, listing the most used first, use:

sudo pqos -T

pqos running on a 48 thread Xeon based server

The -p option allows one to specify specific monitoring events for specific process IDs. Event types can be Last Level Cache (llc), Local Memory Bandwidth (mbl) and Remote Memory Bandwidth (mbr). For example, on a Xeon E5-2680 I have just Last Level Cache monitoring capability, so let's view the llc for stress-ng while running some VM stressor tests:

sudo pqos -T -p llc:$(pidof stress-ng | tr ' ' ',')

pqos showing equally shared cache between two stressor processes

Cache and Memory Bandwidth monitoring is especially useful to examine the impact of memory/cache hogging processes (such as VM instances).  pqos allows one to identify these processes simply and effectively.

Future Intel Xeon processors will provide capabilities to configure cache resources to specific classes of service using Intel Cache Allocation Technology (CAT). The pqos tool allows one to modify the CAT settings; however, not having access to a CPU with these capabilities, I was unable to experiment with this feature. I refer you to the pqos manual for more details on this useful feature. The beauty of CAT is that it allows one to tweak and fine tune the cache allocation for specific demanding use cases. Given that the cache is a shared resource that can be impacted by badly behaving processes, the ability to tune the cache behaviour is potentially a big performance win.
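As the Intel manual describes it, each class of service is given a capacity bitmask in which every set bit grants access to a portion of the cache, and the set bits must form one contiguous run. The sketch below only illustrates that bitmask rule; it has not been run against CAT hardware, the function names are made up, and the number of valid bits in a mask is CPU specific and has to be discovered separately.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the set bits of `mask` form a single contiguous run (at least one bit) */
bool cbm_is_contiguous(uint64_t mask)
{
    uint64_t lowest, sum;

    if (mask == 0)
        return false;

    /*
     * Adding the lowest set bit to a contiguous run carries all the way
     * out of the run, so the sum shares no bits with the original mask.
     */
    lowest = mask & (~mask + 1);
    sum = mask + lowest;
    return (sum & mask) == 0;
}

/* Number of cache portions granted by `mask` (simple population count) */
unsigned int cbm_portions(uint64_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;    /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* True if two classes of service would share at least one cache portion */
bool cbm_overlaps(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}
```

With masks 0x0F and 0xF0, for instance, two classes would each get four portions with no overlap, which is the kind of isolation one would want between a noisy process and a latency sensitive one.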

For more details of these features, see the Intel 64 and IA-32 Architectures Software Developer's Manual, sections 17.15 "Platform Shared Resource Monitoring: Cache Monitoring Technology" and 17.16 "Platform Shared Resource Control: Cache Allocation Technology".

Wednesday, 11 November 2015

Firmware Test Suite in active development

Another month passes and another release of the Firmware Test Suite is being prepared.  The tool has been growing in functionality (and size!) over time, so I thought I would look at some statistics to see any trends.

There has been a steady growth in the number of authors sending patches to the Firmware Test Suite. Community contributions to a project are a sign that we have buy-in from different parties, so I'm pleased to see contributions from Intel, Linaro and Red Hat. Patches are always welcome; send them to fwts-devel@ubuntu.com for review and inclusion into the project.

The number of commits is one metric to see if the project is growing healthily. We're adding about 35 patches a month, about 3/4 of which add functionality; the rest are fixes and general code maintenance.

One more meaningless but interesting metric is code size. I used sloccount to count the lines of C in the project. We're seeing ~2200 lines of code being added per month, mainly through added test functionality.

Kudos to the Canonical Hardware Enablement firmware folk for wrangling the patches and preparing each FWTS release.